How NVIDIA Uses JAX
29 articles about JAX from NVIDIA's engineering team
Articles
The article discusses the integration of the NVSHMEM communication library into the Accelerated Linear Algebra (XLA) compiler to optimize long-context model training in JAX.
The article discusses the NVIDIA Rubin platform, which introduces six new chips designed to create a powerful AI supercomputer.
Kyle Aubrey
59 min read
Has Summary
--
The article discusses the NVIDIA ALCHEMI Toolkit-Ops, a specialized toolkit designed to accelerate AI-powered atomistic simulations in chemistry and materials science.
The article discusses Autodesk Research's development of the Accelerated Lattice Boltzmann (XLB) library, which enhances computational fluid dynamics (CFD) performance using NVIDIA's Warp and GH200...
The article discusses how NVIDIA's hardware innovations, particularly the Blackwell architecture and NVFP4 precision, along with their open source contributions, are driving advancements in AI.
George Chellapa
8 min read
Has Summary
--
The article discusses the introduction of Wheel Variants, a new Python packaging standard aimed at improving the installation and packaging workflows for CUDA-accelerated Python packages.
The article discusses techniques for optimizing low-latency communication in inference workloads using JAX and XLA, particularly focusing on the decode phase of large language models (LLMs).
NVIDIA cuQuantum is an SDK designed to significantly accelerate quantum computing emulations. The latest update, cuQuantum 25.
Tom Lubowe
4 min read
Includes Code
Has Summary
--
NVIDIA has announced world-record inference performance for the DeepSeek-R1 model using the Blackwell architecture, achieving over 250 tokens per second per user and a maximum throughput of over 30...
Ashraf Eassa
13 min read
Has Summary
--
The article discusses the introduction of Gemma 3, a range of lightweight, multimodal, and multilingual models optimized for performance in AI applications.
New research from the University of Washington demonstrates how deep learning can enhance AI weather models, allowing for more accurate predictions and extending forecast capabilities up to 23 days...
Michelle Horton
3 min read
Has Summary
--
The article discusses how to build a zero-copy AI sensor processing pipeline using OpenCV within the NVIDIA Holoscan SDK.
Meiran Peng
7 min read
Includes Code
Has Summary
--
The article discusses the enhancements made in NVIDIA's cuDNN 9 library, focusing on the acceleration of Transformers through the implementation of Scaled Dot Product Attention (SDPA).
Matthew Nicely
11 min read
Includes Code
Has Summary
--
The article discusses NVIDIA cuQuantum 23.10, an SDK designed to accelerate quantum circuit simulations using NVIDIA Tensor Core GPUs.
The article discusses the latest features of the NVIDIA NeMo framework and the performance enhancements brought by the NVIDIA H200 GPUs, which significantly improve the training of large language m...
The article discusses how NVIDIA Holoscan is being utilized to accelerate ptychography workflows at the Diamond Light Source, a leading synchrotron facility.
This article provides a comprehensive guide on deploying AI models in Python using the PyTriton interface with NVIDIA Triton Inference Server.
Shankar Chandrasekaran
6 min read
Includes Code
Has Summary
--
The article discusses how to efficiently scale large language model (LLM) training across a large GPU cluster using the open-source frameworks Alpa and Ray.
Jiao Dong
14 min read
Includes Code
Has Summary
--
The article discusses RAPIDS RAFT, a library designed to optimize machine learning and data analytics on GPUs by providing reusable computational patterns.
Corey Nolet
11 min read
Includes Code
Has Summary
--
The article discusses the use of NVIDIA BioNeMo Service for building generative AI pipelines aimed at drug discovery.
Vanessa Braunstein
8 min read
Has Summary
--
The article discusses the increasing computational demands for AI processing at the edge and introduces the NVIDIA Holoscan SDK v0.
The article discusses NVIDIA's BioNeMo service, a framework for training and serving biomolecular large language models (LLMs) designed for predicting protein structures and properties.
Vanessa Braunstein
3 min read
Has Summary
--
NVIDIA has announced significant updates to its AI software suite, including JAX, NVIDIA CV-CUDA, and NVIDIA RAPIDS, aimed at accelerating AI research, computer vision, and data science.
Apache, Apache Spark, Computer Vision, Dask, Deep Learning, DGL, Google Cloud, GPT, JAX, Kubernetes, Neural Networks, NumPy, PyTorch, PyTorch Geometric, SQL
Siddharth Sharma
7 min read
Has Summary
--
The article discusses the improved interoperability between NVIDIA Vision Programming Interface (VPI) and PyTorch, focusing on how VPI can enhance object detection and tracking in computer vision a...
The article introduces NVIDIA Warp, a Python framework designed for writing differentiable graphics and physics simulations on the GPU.
Machine Learning Frameworks Interoperability, Part 3: Zero-Copy in Action using an E2E Pipeline
This article discusses the implementation of an end-to-end pipeline utilizing zero-copy techniques for efficient data transfer across various machine learning frameworks.
Christian Hundt
7 min read
Has Summary
--
The article discusses the significance of tensor methods in modern machine learning, particularly their application in NVIDIA's AI algorithms.
Jean Kossaifi
4 min read
Has Summary
--
This article discusses the importance of efficient memory layouts and memory pools in machine learning frameworks to enhance interoperability and performance.
Christian Hundt
9 min read
Includes Code
Has Summary
--
The article discusses cuCIM, a new RAPIDS library designed for accelerated n-dimensional image processing and image I/O on GPUs.
Albumentations, Apache, Dask, Deep Learning, ITK, Java, JAX, Numba, NumPy, OpenCV, Python, PyTorch, scikit-image, SciPy, SimpleITK
Gigon Bae
6 min read
Includes Code
Has Summary
--