Leyuan Wang, a Ph.D. student in the UC Davis Department of Computer Science, presented one of only two “Distinguished Papers” of the 51 accepted at Euro-Par…
Overview
The article discusses cutting-edge research on parallel algorithms using CUDA, highlighting Leyuan Wang's work on suffix array construction algorithms optimized for NVIDIA GPUs. It details significant speedups achieved over existing methods and explores applications in data compression and graph processing.
What You'll Learn
- How to implement high-performance string processing algorithms using CUDA
- Why prefix-doubling suffix array construction is more efficient on GPUs than skew-based methods
- When to use hybrid algorithms for suffix array construction to achieve optimal performance
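To make the prefix-doubling idea concrete, here is a minimal sequential sketch in Python. It is not the CUDA implementation described in the article; the function name `suffix_array_prefix_doubling` is hypothetical, and the comparison sort used here stands in for the parallel sort a GPU version would presumably rely on. Each round ranks suffixes by their first 2k characters by reusing the ranks computed for the first k characters.

```python
def suffix_array_prefix_doubling(text: str) -> list[int]:
    """Build a suffix array by prefix doubling (sequential illustration only)."""
    n = len(text)
    if n == 0:
        return []

    # Round 0: rank each suffix by its first character.
    rank = [ord(c) for c in text]
    sa = list(range(n))
    k = 1

    while True:
        # A suffix's key is the pair (rank of its first k characters,
        # rank of the following k characters), with -1 when the second
        # half runs past the end of the text.
        def key(i: int) -> tuple[int, int]:
            return (rank[i], rank[i + k] if i + k < n else -1)

        sa.sort(key=key)

        # Re-rank: suffixes with equal keys share a rank.
        new_rank = [0] * n
        for j in range(1, n):
            bump = 1 if key(sa[j]) != key(sa[j - 1]) else 0
            new_rank[sa[j]] = new_rank[sa[j - 1]] + bump
        rank = new_rank

        # Stop once every suffix has a distinct rank.
        if rank[sa[-1]] == n - 1:
            return sa
        k *= 2
```

For example, `suffix_array_prefix_doubling("banana")` returns `[5, 3, 1, 0, 4, 2]`. Every round is a sort followed by a re-ranking pass over the whole array, which is the kind of regular, data-parallel work the list above contrasts with skew-based methods.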
Prerequisites & Requirements
- Understanding of parallel programming concepts and GPU architecture
- Familiarity with CUDA programming and GPU libraries like CUDPP and Gunrock (optional)
Key Questions Answered
- What are the performance improvements achieved with the new suffix array algorithms?
- How does the Burrows-Wheeler transform relate to suffix arrays?
- What challenges are faced when implementing GPU algorithms?
- What is the significance of the suffix array in various applications?
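As a rough illustration of the Burrows-Wheeler question above: once a suffix array is available, the transform can be read off directly, because each output character is the one that precedes the corresponding sorted suffix. The sketch below is a small sequential Python example, not the article's GPU code. The function name `bwt_from_suffix_array` is hypothetical, the suffix array is built with a naive sort for brevity, and the sentinel `$` is assumed to be absent from the input and to sort before every other character.

```python
def bwt_from_suffix_array(text: str, sentinel: str = "$") -> str:
    """Compute the Burrows-Wheeler transform via a suffix array (illustration only)."""
    # Assumption: the sentinel does not occur in text and compares
    # smaller than every character in it.
    s = text + sentinel

    # Naive suffix array: sort suffix start positions by the suffix text.
    # A real implementation would use a faster construction algorithm.
    sa = sorted(range(len(s)), key=lambda i: s[i:])

    # The BWT character for suffix i is the character just before it;
    # for i == 0, s[-1] wraps around to the sentinel.
    return "".join(s[i - 1] for i in sa)
```

With the default sentinel, `bwt_from_suffix_array("banana")` returns `"annb$aa"`. The transform groups identical characters together, which is why it is useful for the data compression applications mentioned in the overview.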
Key Actionable Insights
1. Leverage hybrid algorithms for suffix array construction to maximize performance on GPUs. Utilizing a combination of skew and prefix-doubling methods can lead to substantial speed improvements, particularly for large datasets. This approach is beneficial in applications requiring rapid data processing.
2. Focus on optimizing memory access patterns to enhance GPU performance. By ensuring memory accesses are coalesced and contiguous, developers can significantly increase the effective memory bandwidth, which is critical for achieving high performance in GPU applications.
3. Explore the use of high-performance libraries like CUDPP and Gunrock for GPU programming. These libraries provide optimized primitives that can simplify the implementation of complex algorithms, allowing researchers and developers to focus on higher-level design rather than low-level optimizations.