Accelerating Digital Pathology Workflows Using cuCIM and NVIDIA GPUDirect Storage

Learn how GPU-accelerated toolkits improve input/output performance and speed up image processing tasks in digital pathology workflows.

Gregory Lee
9 min read · Intermediate

Overview

The article discusses how NVIDIA's cuCIM and GPUDirect Storage can significantly enhance digital pathology workflows by improving input/output performance and image processing tasks. It highlights the challenges of handling large whole slide images and presents various use cases demonstrating the benefits of GPU acceleration.

What You'll Learn

  1. How to use GPU-accelerated toolkits for loading tiled data from disk directly to GPU memory
  2. Why using Magnum IO GPUDirect Storage improves data transfer efficiency
  3. How to implement a tiled image processing workflow using CUDA

Prerequisites & Requirements

  • Understanding of whole slide imaging and digital pathology concepts
  • Familiarity with CUDA and GPU programming (optional)

Key Questions Answered

What are the main challenges in processing whole slide images?
Whole slide images are very large files, and using them in deep learning applications requires substantial preprocessing and postprocessing. Tasks such as artifact detection, color normalization, and image subsampling can be time-consuming.
How does Magnum IO GPUDirect Storage enhance data transfer?
Magnum IO GPUDirect Storage provides a direct data path for DMA transfers between GPU memory and storage, which increases bandwidth, reduces latency, and decreases CPU load. This leads to improved performance in data-intensive applications like digital pathology.
What performance improvements can be achieved with GDS?
Enabling GDS raised the speedup for reading tiled images from 2.0x to 2.7x in the single-threaded case and from 4.2x to 11.8x for parallel reads. This demonstrates significant acceleration in data handling for large images.
What are the benefits of using cuCIM for image processing?
cuCIM is an open-source library designed for accelerated computer vision and image processing, particularly for multidimensional images. It supports various applications in biomedical and life sciences, making it a valuable tool for digital pathology.

Key Statistics & Figures

Single-threaded acceleration without GDS
2.0x
Speedup when reading tiled images using KvikIO.
Single-threaded acceleration with GDS
2.7x
Speedup for the same single-threaded read when GDS is enabled.
Parallel read acceleration without GDS
4.2x
Speedup for parallel reads using KvikIO.
Parallel read acceleration with GDS
11.8x
Performance improvement for parallel reads when GDS is enabled.
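
The four figures above can be combined to estimate how much of each speedup comes from GDS itself. The following is a minimal sketch of that arithmetic in Python, assuming both factors in each scenario are measured against the same baseline reader (the summary does not name it). The dictionary layout and the `gds_gain` helper are illustrative names, not part of any library.

```python
# Speedup factors quoted in the statistics above.
# Assumption: within each scenario, both factors share the same baseline.
speedups = {
    "single-threaded": {"without_gds": 2.0, "with_gds": 2.7},
    "parallel": {"without_gds": 4.2, "with_gds": 11.8},
}


def gds_gain(case):
    """Ratio of the speedup with GDS to the speedup without it."""
    return case["with_gds"] / case["without_gds"]


# Additional factor gained by enabling GDS, rounded to two decimals
gains = {name: round(gds_gain(case), 2) for name, case in speedups.items()}
print(gains)
```

Under that assumption, enabling GDS contributes roughly a further 1.35x for single-threaded reads and about 2.8x for parallel reads, which suggests that GDS pays off most when many reads are in flight at once.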

Technologies & Tools

Library
cuCIM
Used for accelerated computer vision and image processing tasks.
Storage
Magnum IO GPUDirect Storage
Enhances data transfer efficiency between GPU memory and storage.
Framework
CUDA
Used for performing image processing tasks.
File Format
Zarr
Used for storing large datasets in a chunked format.

Key Actionable Insights

  1. Implementing GPU-accelerated workflows can drastically reduce the time needed for processing large whole slide images.
     By leveraging tools like cuCIM and GPUDirect Storage, organizations can improve the efficiency of their digital pathology processes, leading to faster diagnosis and treatment.
  2. Utilizing tiled image processing can help manage memory usage effectively while maintaining performance.
     This approach allows for processing large images in smaller segments, reducing the overall memory footprint and enabling more efficient use of GPU resources.
  3. Incorporating deep learning into image analysis requires careful preprocessing to ensure accurate results.
     Techniques such as color normalization and artifact detection are essential for improving the reliability of predictions in clinical settings.
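
The tiled approach from the second insight can be sketched in a few lines. This is a minimal illustration using NumPy on the CPU, not the article's implementation: `iter_tiles`, `process_tiled`, the tile size of 256, and the inversion operation are all hypothetical choices made for the example, and a small random array stands in for a whole slide image.

```python
import numpy as np


def iter_tiles(shape, tile):
    """Yield (row_slice, col_slice) pairs that cover a 2D shape tile by tile."""
    rows, cols = shape
    for r in range(0, rows, tile):
        for c in range(0, cols, tile):
            # Edge tiles are clipped so they never run past the image bounds
            yield slice(r, min(r + tile, rows)), slice(c, min(c + tile, cols))


def process_tiled(image, tile, func):
    """Apply a pointwise function one tile at a time into a preallocated output."""
    out = np.empty_like(image)
    for rs, cs in iter_tiles(image.shape[:2], tile):
        # Only one tile's worth of intermediate data exists at any moment
        out[rs, cs] = func(image[rs, cs])
    return out


def invert(tile_data):
    """Example pointwise operation: invert 8-bit intensities."""
    return 255 - tile_data


# Small stand-in for a whole slide image (rows, columns, RGB channels)
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(1000, 1500, 3), dtype=np.uint8)

tiled = process_tiled(image, tile=256, func=invert)
print(np.array_equal(tiled, invert(image)))
```

For a pointwise operation like this one, the tiled result matches processing the whole image at once, while temporary memory scales with the tile size rather than the image size. In a real workflow each tile would typically be read from disk on demand rather than sliced from an array already in memory.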

Common Pitfalls

  1. Not accounting for potential border artifacts during tilewise processing can lead to inaccurate results.
     When a tile is processed in isolation, the values used to extend its edges (padding) do not match the real neighboring pixels, so operations that rely on convolution produce errors near tile borders.
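
One common way to avoid this pitfall is to read each tile with a few extra pixels of overlap (a halo), filter the enlarged tile, and keep only the interior. The sketch below illustrates the idea with NumPy and a 3x3 mean filter. It is an assumption-laden example rather than the article's method: `mean3x3`, `filter_tiled`, the tile size, and the halo width are hypothetical names and values chosen for illustration.

```python
import numpy as np


def mean3x3(a):
    """3x3 mean filter that replicates edge values at the array border."""
    p = np.pad(a, 1, mode="edge")
    h, w = a.shape
    acc = np.zeros((h, w), dtype=np.float64)
    for dr in range(3):
        for dc in range(3):
            acc += p[dr:dr + h, dc:dc + w]
    return acc / 9.0


def filter_tiled(image, tile, halo):
    """Filter tile by tile, reading `halo` extra pixels around each tile."""
    rows, cols = image.shape
    out = np.empty((rows, cols), dtype=np.float64)
    for r in range(0, rows, tile):
        for c in range(0, cols, tile):
            r1, c1 = min(r + tile, rows), min(c + tile, cols)
            # Expand the read window by the halo, clipped to the image bounds
            r0h, c0h = max(r - halo, 0), max(c - halo, 0)
            r1h, c1h = min(r1 + halo, rows), min(c1 + halo, cols)
            filtered = mean3x3(image[r0h:r1h, c0h:c1h])
            # Discard the halo and keep only the pixels of the original tile
            top, left = r - r0h, c - c0h
            out[r:r1, c:c1] = filtered[top:top + (r1 - r), left:left + (c1 - c)]
    return out


rng = np.random.default_rng(0)
image = rng.random((100, 130))

reference = mean3x3(image)                      # whole image in one pass
no_halo = filter_tiled(image, tile=32, halo=0)  # tiles processed in isolation
with_halo = filter_tiled(image, tile=32, halo=1)

print("max seam error without halo:", np.abs(no_halo - reference).max())
print("matches reference with halo:", np.allclose(with_halo, reference))
```

The halo needs to be at least as wide as the filter radius (1 pixel for a 3x3 kernel). Without it, each tile is padded with its own edge values and the output deviates from the whole-image result along the tile seams. With it, the pixels near each seam are computed from real neighboring data.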

Related Concepts

Digital Pathology
Whole Slide Imaging
GPU Acceleration In Medical Imaging
Deep Learning In Image Analysis