# JAX Programming Tutorials & Engineering Articles
135 JAX tutorials, guides, and engineering insights from Google and NVIDIA
JAX Articles & Tutorials
The article discusses the integration of the NVSHMEM communication library into the Accelerated Linear Algebra (XLA) compiler to optimize long-context model training in JAX.
This tutorial demonstrates how to fine-tune FunctionGemma, a small language model for translating natural language into API calls, using Google's Tunix library on TPUs.
Wei Wei
4 min read
Includes Code
Has Summary
--
LiteRT has evolved from its TensorFlow Lite foundation into a universal on-device AI inference framework, now offering production-ready GPU acceleration across six platforms and streamlined NPU int...
Lu Wang, Chintan Parikh, Jingjiang Li, Terry Heo
9 min read
Includes Code
Has Summary
--
This article demonstrates how to fine-tune FunctionGemma, a specialized 270M parameter Gemma 3 model designed for function calling in agentic AI systems.
Juyeong Ji
5 min read
Includes Code
Has Summary
--
The article discusses the NVIDIA Rubin platform, which introduces six new chips designed to create a powerful AI supercomputer.
Kyle Aubrey
59 min read
Has Summary
--
This article serves as a practical guide for developers working with JAX on Cloud TPUs, focusing on essential tools and techniques for debugging and profiling machine learning workflows.
The article discusses the NVIDIA ALCHEMI Toolkit-Ops, a specialized toolkit designed to accelerate AI-powered atomistic simulations in chemistry and materials science.
The article introduces A2UI, an open-source project designed for agent-driven interfaces that allows agents to generate contextually relevant user interfaces.
The article discusses the advancements in on-device AI powered by MediaTek's Neural Processing Unit (NPU) and the introduction of the LiteRT NeuroPilot Accelerator.
Lu Wang, Arian Arfaian, Luke Boyer
10 min read
Includes Code
Has Summary
--
The article discusses optimizing performance on Qualcomm's Neural Processing Unit (NPU) using LiteRT, Google's high-performance on-device ML framework.
The article discusses the JAX AI Stack, a modular framework for building production AI models on Google Cloud TPUs.
Rakesh Iyer, Srikanth Kilaru
6 min read
Includes Code
Has Summary
--
Metrax is a high-performance library of efficient, robust model evaluation metrics for JAX, addressing the need for standardized metrics during the migration from TensorFlow.
Yufeng Guo, Jiwon Shin, Jeff Carpenter
5 min read
Includes Code
Has Summary
--
Agent Garden is a platform designed to facilitate the development and deployment of AI agents, making it accessible to all users, not just those on Google Cloud.
Kanchana Patlolla, Tamás Mágedli
3 min read
Has Summary
--
The article introduces Coral NPU, a full-stack, open-source platform designed to enhance Edge AI capabilities on low-power devices.
Billy Rutledge
8 min read
Has Summary
--
The article discusses building high-performance data pipelines using Grain, a data loading library for JAX, and ArrayRecord, an efficient file format.
Jiyang Kang, Shivaji Dutta, Ihor Indyk, Felix Chern
10 min read
Includes Code
Has Summary
--
The article introduces Tunix, a new open-source, JAX-native library designed for post-training of large language models (LLMs).
Srikanth Kilaru, Tianshu Bao
7 min read
Includes Code
Has Summary
--
The article provides a detailed postmortem of three infrastructure bugs that affected the response quality of Claude between August and early September.
10 min read
Includes Code
Has Summary
--
The article discusses Autodesk Research's development of the Accelerated Lattice Boltzmann (XLB) library, which enhances computational fluid dynamics (CFD) performance using NVIDIA's Warp and GH200...
The article discusses how JAX, a popular framework for AI model development, is being increasingly adopted in scientific computing, particularly for solving complex Partial Differential Equations (...
Srikanth Kilaru, Zekun Shi, Min Lin
6 min read
Has Summary
--
The article discusses how NVIDIA's hardware innovations, particularly the Blackwell architecture and NVFP4 precision, along with their open source contributions, are driving advancements in AI.
George Chellapa
8 min read
Has Summary
--
The article introduces Gemma 3 270M, a compact AI model designed for hyper-efficient task-specific fine-tuning.
Olivier Lacombe, Kathleen Kenealy, Kat Black, Ravin Kumar, Francesco Visin, Jiageng Zhang
5 min read
Has Summary
--
The article discusses the introduction of Wheel Variants, a new Python packaging standard aimed at improving the installation and packaging workflows for CUDA-accelerated Python packages.
This article provides a comprehensive guide on how to train a GPT-2 model using JAX on TPU, highlighting the ease of leveraging Google TPUs for free.
Wei Wei
8 min read
Includes Code
Has Summary
--
The article discusses the increasing adoption of JAX in robotics, highlighting its efficiency in optimal control and simulation. It features insights from Max Muchen Sun, a Robotics Ph.D.
The article discusses techniques for optimizing low-latency communication in inference workloads using JAX and XLA, particularly focusing on the decode phase of large language models (LLMs).
Stanford's Marin project introduces the first fully open foundation model developed using JAX, emphasizing transparency in the scientific process behind AI models.
Srikanth Kilaru, David Hall
8 min read
Includes Code
Has Summary
--
NVIDIA cuQuantum is an SDK designed to accelerate quantum computing emulations significantly. The latest update, cuQuantum 25.
Tom Lubowe
4 min read
Includes Code
Has Summary
--
The article discusses how to use KerasHub for loading model weights from SafeTensors into Keras, enabling flexible end-to-end machine learning workflows across different frameworks like JAX, PyTorc...
Yufeng Guo, Divyashree Sreepathihalli, Monica Song
8 min read
Includes Code
Has Summary
--
The article introduces Keras Recommenders, a new library designed to simplify the creation of state-of-the-art recommendation systems using Keras with JAX, TensorFlow, or PyTorch.
NVIDIA has announced world-record inference performance for the DeepSeek-R1 model using the Blackwell architecture, achieving over 250 tokens per second per user and a maximum throughput of over 30...
Ashraf Eassa
13 min read
Has Summary
--
The article discusses the introduction of Gemma 3, a range of lightweight, multimodal, and multilingual models optimized for performance in AI applications.
Gemma 3 is the latest version of the Gemma open-model family, boasting enhanced capabilities such as multimodality, longer context windows, and improved reasoning.
Omar Sanseviero, Philipp Schmid
5 min read
Includes Code
Has Summary
--
The article discusses the launch of ShieldGemma 2, a safety content classifier model built on Gemma 3, aimed at detecting harmful content in both synthetic and natural images.
Dana Kurniawan, Wenjun Zeng, Ryan Mullins
3 min read
Has Summary
--
PaliGemma 2 mix is an advanced vision-language model designed for multiple tasks, allowing developers to utilize a single model for various applications such as image captioning, object detection, ...
Omar Sanseviero, Andreas Steiner
3 min read
Includes Code
Has Summary
--
PaliGemma 2 is the latest vision-language model from Google, designed to simplify the process of building advanced AI that can interpret visual inputs.
Daniel Keysers, Andreas Steiner
3 min read
Has Summary
--
The article announces that Francois Chollet, the creator of Keras, is leaving Google to pursue new opportunities.
Bill Jia, Xavi Amatriain
2 min read
Has Summary
--
New research from the University of Washington demonstrates how deep learning can enhance AI weather models, allowing for more accurate predictions and extending forecast capabilities up to 23 days...
Michelle Horton
3 min read
Has Summary
--
The article introduces Keras Hub, a unified library for pretrained models that simplifies access to both natural language processing (NLP) and computer vision (CV) architectures.
The article discusses the PaliGemma architecture, a lightweight open vision-language model (VLM) inspired by PaLI-3.
Ju-yeong Ji, Ravin Kumar
6 min read
Includes Code
Has Summary
--
LiteRT, formerly known as TensorFlow Lite, is a high-performance runtime for on-device AI that now supports models from multiple frameworks including PyTorch, JAX, and Keras.
Google AI Edge team
4 min read
Has Summary
--
The article discusses the release of Gemma 2, a new suite of open models that sets a new standard for performance and accessibility in conversational AI.
Ju-yeong Ji, Ravin Kumar
5 min read
Includes Code
Has Summary
--
The article discusses the advancements in responsible AI through the introduction of Gemma 2, which includes models with 27 billion and 9 billion parameters, emphasizing safety and accessibility.
Neel Nanda, Tom Lieberum, Ludovic Peran, Kathleen Kenealy
6 min read
Has Summary
--
The article discusses the release of the Gemma 2 model with 27 billion parameters, highlighting its capabilities in Keras and integration with JAX for efficient model training.
Martin Görner
5 min read
Includes Code
Has Summary
--
Model Explorer is a powerful graph visualization tool designed to simplify the development and optimization of machine learning models for edge devices.
Kristen Wright, Eric Yang
6 min read
Includes Code
Has Summary
--
The article discusses how to build a zero-copy AI sensor processing pipeline using OpenCV within the NVIDIA Holoscan SDK.
Meiran Peng
7 min read
Includes Code
Has Summary
--
The article discusses the enhancements made in NVIDIA's cuDNN 9 library, focusing on the acceleration of Transformers through the implementation of Scaled Dot Product Attention (SDPA).
Matthew Nicely
11 min read
Includes Code
Has Summary
--
The article recaps the Google I/O 2024 event, highlighting advancements in AI technologies aimed at making AI accessible for developers.
Caching, Dart, Firebase, Gemini, Generative AI, Google Cloud, JAX, Keras, Kotlin, Ollama, PostgreSQL, PyTorch, TensorFlow, WebAssembly
Jeanine Banks
8 min read
Has Summary
--
The article introduces PaliGemma, an open vision-language model, along with Gemma 2, the next generation of the Gemma models, and updates to the Responsible AI Toolkit.
Tris Warkentin, Xiaohua Zhai, Ludovic Peran
4 min read
Has Summary
--
This article discusses how to publish Keras models on Kaggle and Hugging Face, highlighting the ease of sharing fine-tuned models with the community.
Martin Görner
4 min read
Includes Code
Has Summary
--
The article reveals the program lineup for Google I/O, highlighting key sessions and events for developers.