JAX Programming Tutorials & Engineering Articles

135 JAX tutorials, guides, and engineering insights from Google and NVIDIA

Companies Using This

Google(102)

NVIDIA(29)

JAX Articles & Tutorials


NVIDIA

Advanced

Accelerating Long-Context Model Training in JAX and XLA

The article discusses the integration of the NVSHMEM communication library into the Accelerated Linear Algebra (XLA) compiler to optimize long-context model training in JAX.

Docker, JAX, Python

Sevin Fide Varoglu

9 min read

Includes Code

Has Summary

Google

Advanced

Easy FunctionGemma finetuning with Tunix on Google TPUs

This tutorial demonstrates how to fine-tune FunctionGemma, a small language model for translating natural language into API calls, using Google's Tunix library on TPUs.

Hugging Face, JAX, Large Language Models

Wei Wei

4 min read

Includes Code

Has Summary

Google

Advanced

LiteRT: The Universal Framework for On-Device AI

LiteRT has evolved from its TensorFlow Lite foundation into a universal on-device AI inference framework, now offering production-ready GPU acceleration across six platforms and streamlined NPU int...

Gemini, Hugging Face, JAX, PyTorch, TensorFlow

Lu Wang, Chintan Parikh, Jingjiang Li, Terry Heo

9 min read

Includes Code

Has Summary

Google

Intermediate

A Guide to Fine-Tuning FunctionGemma

This article demonstrates how to fine-tune FunctionGemma, a specialized 270M parameter Gemma 3 model designed for function calling in agentic AI systems.

Fine-tuning, Hugging Face, JAX, JSON, Shell

Juyeong Ji

5 min read

Includes Code

Has Summary

NVIDIA

Advanced

Inside the NVIDIA Rubin Platform: Six New Chips, One AI Supercomputer

The article discusses the NVIDIA Rubin platform, which introduces six new chips designed to create a powerful AI supercomputer.

Assembly, Hugging Face, JAX, Kubernetes, Less, PyTorch, RLHF, Transformer

Kyle Aubrey

59 min read

Has Summary

Google

Advanced

A Developer's Guide to Debugging JAX on Cloud TPUs: Essential Tools and Techniques

This article serves as a practical guide for developers working with JAX on Cloud TPUs, focusing on essential tools and techniques for debugging and profiling machine learning workflows.

JAX, Shell

Zhenzhen (Jen) Tan, Brian Kang, Ashish Narasimham

5 min read

Includes Code

Has Summary

NVIDIA

Advanced

Accelerating AI-Powered Chemistry and Materials Science Simulations with NVIDIA ALCHEMI Toolkit-Ops

The article discusses the NVIDIA ALCHEMI Toolkit-Ops, a specialized toolkit designed to accelerate AI-powered atomistic simulations in chemistry and materials science.

JAX, Python, PyTorch, Warp

Justin S. Smith

10 min read

Includes Code

Has Summary

Google

Beginner

Introducing A2UI: An open project for agent-driven interfaces

The article introduces A2UI, an open-source project designed for agent-driven interfaces that allows agents to generate contextually relevant user interfaces.

Angular, Apache, Dart, Gemini, Generative AI, HTML, JavaScript, JAX, JSON, Lit, React, Shell, Vercel

Google A2UI Team

13 min read

Includes Code

Has Summary

Google

Intermediate

MediaTek NPU and LiteRT: Powering the next generation of on-device AI

The article discusses the advancements in on-device AI powered by MediaTek's Neural Processing Unit (NPU) and the introduction of the LiteRT NeuroPilot Accelerator.

Gemini, Java, JAX, Kotlin, Retrieval Augmented Generation

Lu Wang, Arian Arfaian, Luke Boyer

10 min read

Includes Code

Has Summary

Google

Intermediate

Unlocking Peak Performance on Qualcomm NPU with LiteRT

The article discusses optimizing performance on Qualcomm's Neural Processing Unit (NPU) using LiteRT, Google's high-performance on-device ML framework.

JAX, Kotlin, Shell

Lu Wang, Weiyi Wang, Andrew Zhang

9 min read

Includes Code

Has Summary

Google

Advanced

Building production AI on Google Cloud TPUs with JAX

The article discusses the JAX AI Stack, a modular framework for building production AI models on Google Cloud TPUs.

Flax, Google Cloud, JAX

Rakesh Iyer, Srikanth Kilaru

6 min read

Includes Code

Has Summary

Google

Advanced

Introducing Metrax: performant, efficient, and robust model evaluation metrics in JAX

Metrax is a high-performance library of efficient and robust model evaluation metrics for JAX, addressing the need for standardized metrics during the migration from TensorFlow.

Flax, JAX, TensorFlow

Yufeng Guo, Jiwon Shin, Jeff Carpenter

5 min read

Includes Code

Has Summary

Google

Beginner

Agent Garden - Samples for learning, discovering and building

Agent Garden is a platform designed to facilitate the development and deployment of AI agents, making it accessible to all users, not just those on Google Cloud.

Firebase, Google Cloud, JAX, Vertex AI

Kanchana Patlolla, Tamás Mágedli

3 min read

Has Summary

Google

Intermediate

Introducing Coral NPU: A full-stack platform for Edge AI

The article introduces Coral NPU, a full-stack, open-source platform designed to enhance Edge AI capabilities on low-power devices.

Generative AI, JAX, PyTorch, TensorFlow

Billy Rutledge

8 min read

Has Summary

Google

Advanced

Building High-Performance Data Pipelines with Grain and ArrayRecord

The article discusses building high-performance data pipelines using Grain, a data loading library for JAX, and ArrayRecord, an efficient file format.

Apache, Google Cloud, Google Cloud Storage, JAX, Shell, TensorFlow

Jiyang Kang, Shivaji Dutta, Ihor Indyk, Felix Chern

10 min read

Includes Code

Has Summary

Google

Advanced

Introducing Tunix: A JAX-Native Library for LLM Post-Training

The article introduces Tunix, a new open-source, JAX-native library designed for post-training of large language models (LLMs).

Flax, Google Cloud, JAX, Reinforcement Learning, RLHF

Srikanth Kilaru, Tianshu Bao

7 min read

Includes Code

Has Summary

Anthropic

Intermediate

A postmortem of three recent issues

The article provides a detailed postmortem of three infrastructure bugs that affected the response quality of Claude between August and early September.

Amazon Bedrock, AWS, Claude, Google Cloud, JAX, Vertex AI

10 min read

Includes Code

Has Summary

NVIDIA

Advanced

Autodesk Research Brings Warp Speed to Computational Fluid Dynamics on NVIDIA GH200

The article discusses Autodesk Research's development of the Accelerated Lattice Boltzmann (XLB) library, which enhances computational fluid dynamics (CFD) performance using NVIDIA's Warp and GH200...

Fortran, JAX, Numba, NumPy, Python, PyTorch, Warp

Mehdi Ataei

7 min read

Has Summary

Google

Advanced

Beyond backpropagation: JAX's symbolic power unlocks new frontiers in scientific computing

The article discusses how JAX, a popular framework for AI model development, is being increasingly adopted in scientific computing, particularly for solving complex Partial Differential Equations (...

Deep Learning, JAX

Srikanth Kilaru, Zekun Shi, Min Lin

6 min read

Has Summary

NVIDIA

Advanced

NVIDIA Hardware Innovations and Open Source Contributions Are Shaping AI

The article discusses how NVIDIA's hardware innovations, particularly the Blackwell architecture and NVFP4 precision, along with their open source contributions, are driving advancements in AI.

GPT, Hugging Face, JAX, Kubernetes, Python, PyTorch, Transformer

George Chellapa

8 min read

Has Summary

Google

Intermediate

Introducing Gemma 3 270M: The compact model for hyper-efficient AI

The article introduces Gemma 3 270M, a compact AI model designed for hyper-efficient task-specific fine-tuning.

Docker, Google Cloud, Hugging Face, JAX, Keras, Ollama, Transformers, Vertex AI

Olivier Lacombe, Kathleen Kenealy, Kat Black, Ravin Kumar, Francesco Visin, Jiageng Zhang

5 min read

Has Summary

NVIDIA

Intermediate

Streamline CUDA-Accelerated Python Install and Packaging Workflows with Wheel Variants

The article discusses the introduction of Wheel Variants, a new Python packaging standard aimed at improving the installation and packaging workflows for CUDA-accelerated Python packages.

Docker, JAX, Python, PyTorch, SciPy

Jonathan Dekhtiar

15 min read

Includes Code

Has Summary

Google

Advanced

Train a GPT2 model with JAX on TPU for free

This article provides a comprehensive guide on how to train a GPT-2 model using JAX on TPU, highlighting the ease of leveraging Google TPUs for free.

Flax, GPT, JAX, Large Language Models, Multi-Head Attention, PyTorch, TensorFlow

Wei Wei

8 min read

Includes Code

Has Summary

Google

Advanced

A roboticist's journey with JAX: Finding efficiency in optimal control and simulation

The article discusses the increasing adoption of JAX in robotics, highlighting its efficiency in optimal control and simulation. It features insights from Max Muchen Sun, a Robotics Ph.D.

Flax, JAX, NumPy, PyTorch

Srikanth Kilaru, Max Muchen Sun

6 min read

Has Summary

NVIDIA

Advanced

Optimizing for Low-Latency Communication in Inference Workloads with JAX and XLA

The article discusses techniques for optimizing low-latency communication in inference workloads using JAX and XLA, particularly focusing on the decode phase of large language models (LLMs).

JAX, Python

Jaya Shankar

6 min read

Includes Code

Has Summary

Google

Advanced

Stanford’s Marin foundation model: The first fully open model developed using JAX

Stanford's Marin project introduces the first fully open foundation model developed using JAX, emphasizing transparency in the scientific process behind AI models.

Apache, Flax, Google Cloud, Hugging Face, JAX

Srikanth Kilaru, David Hall

8 min read

Includes Code

Has Summary

NVIDIA

Advanced

NVIDIA cuQuantum Adds Dynamics Gradients, DMRG, and Simulation Speedup

NVIDIA cuQuantum is an SDK designed to accelerate quantum computing emulations significantly. The latest update, cuQuantum 25.

JAX

Tom Lubowe

4 min read

Includes Code

Has Summary

Google

Intermediate

Using KerasHub for easy end-to-end machine learning workflows with Hugging Face

The article discusses how to use KerasHub for loading model weights from SafeTensors into Keras, enabling flexible end-to-end machine learning workflows across different frameworks like JAX, PyTorc...

Hugging Face, JAX, Keras, Mistral, PyTorch, TensorFlow

Yufeng Guo, Divyashree Sreepathihalli, Monica Song

8 min read

Includes Code

Has Summary

Google

Advanced

Build and train a recommender system in 10 minutes using Keras and JAX

The article introduces Keras Recommenders, a new library designed to simplify the creation of state-of-the-art recommendation systems using Keras with JAX, TensorFlow, or PyTorch.

Embedding, GRU, JAX, Keras, PyTorch, TensorFlow

Yufeng Guo, Monica Song

3 min read

Includes Code

Has Summary

NVIDIA

Advanced

NVIDIA Blackwell Delivers World-Record DeepSeek-R1 Inference Performance

NVIDIA has announced world-record inference performance for the DeepSeek-R1 model using the Blackwell architecture, achieving over 250 tokens per second per user and a maximum throughput of over 30...

CLIP, Hugging Face, JAX, Ollama, Python, PyTorch, T5, TensorFlow, Transformer

Ashraf Eassa

13 min read

Has Summary

NVIDIA

Intermediate

Lightweight, Multimodal, Multilingual Gemma 3 Models Are Streamlined for Performance

The article discusses the introduction of Gemma 3, a range of lightweight, multimodal, and multilingual models optimized for performance in AI applications.

JAX, LangChain, Python

Anu Srivastava

3 min read

Includes Code

Has Summary

Google

Intermediate

Introducing Gemma 3: The Developer Guide

Gemma 3 is the latest version of the Gemma open-model family, boasting enhanced capabilities such as multimodality, longer context windows, and improved reasoning.

Hugging Face, JAX, Ollama, Reinforcement Learning, RLHF, Transformers, Vertex AI

Omar Sanseviero, Philipp Schmid

5 min read

Includes Code

Has Summary

Google

Beginner

Safer and Multimodal: Responsible AI with Gemma

The article discusses the launch of ShieldGemma 2, a safety content classifier model built on Gemma 3, aimed at detecting harmful content in both synthetic and natural images.

Hugging Face, JAX, Keras, Ollama, Transformers

Dana Kurniawan, Wenjun Zeng, Ryan Mullins

3 min read

Has Summary

Google

Beginner

Introducing PaliGemma 2 mix: A vision-language model for multiple tasks

PaliGemma 2 mix is an advanced vision-language model designed for multiple tasks, allowing developers to utilize a single model for various applications such as image captioning, object detection, ...

Hugging Face, JAX, Keras, PyTorch, Transformers

Omar Sanseviero, Andreas Steiner

3 min read

Includes Code

Has Summary

Google

Intermediate

Introducing PaliGemma 2: Powerful Vision-Language Models, Simple Fine-Tuning

PaliGemma 2 is the latest vision-language model from Google, designed to simplify the process of building advanced AI that can interpret visual inputs.

Hugging Face, JAX, Keras, PyTorch, Transformers

Daniel Keysers, Andreas Steiner

3 min read

Has Summary

Google

Intermediate

Farewell and thank you for the continued partnership, Francois Chollet!

The article announces that Francois Chollet, the creator of Keras, is leaving Google to pursue new opportunities.

JAX, Keras, PyTorch, TensorFlow

Bill Jia, Xavi Amatriain

2 min read

Has Summary

NVIDIA

Intermediate

AI Accurately Forecasts Extreme Weather Up to 23 Days Ahead

New research from the University of Washington demonstrates how deep learning can enhance AI weather models, allowing for more accurate predictions and extending forecast capabilities up to 23 days...

JAX

Michelle Horton

3 min read

Has Summary

Google

Intermediate

Introducing Keras Hub: Your one-stop shop for pretrained models

The article introduces Keras Hub, a unified library for pretrained models that simplifies access to both natural language processing (NLP) and computer vision (CV) architectures.

BERT, Gemini, JAX, Keras, PIL, PyTorch, Shell, Stable Diffusion

Divyashree Sreepathihalli, Luciano Martins

7 min read

Includes Code

Has Summary

Google

Intermediate

Gemma explained: PaliGemma architecture

The article discusses the PaliGemma architecture, a lightweight open vision-language model (VLM) inspired by PaLI-3.

Embedding, Fine-tuning, JAX, Keras

Ju-yeong Ji, Ravin Kumar

6 min read

Includes Code

Has Summary

Google

Intermediate

TensorFlow Lite is now LiteRT

LiteRT, formerly known as TensorFlow Lite, is a high-performance runtime for on-device AI that now supports models from multiple frameworks including PyTorch, JAX, and Keras.

JAX, Keras, PyTorch, TensorFlow

Google AI Edge team

4 min read

Has Summary

Google

Intermediate

Gemma explained: What’s new in Gemma 2

The article discusses the release of Gemma 2, a new suite of open models that sets a new standard for performance and accessibility in conversational AI.

Embedding, Fine-tuning, Google Cloud, GPT, Hugging Face, JAX, Keras

Ju-yeong Ji, Ravin Kumar

5 min read

Includes Code

Has Summary

Google

Intermediate

Smaller, Safer, More Transparent: Advancing Responsible AI with Gemma

The article discusses the advancements in responsible AI through the introduction of Gemma 2, which includes models with 27 billion and 9 billion parameters, emphasizing safety and accessibility.

Generative AI, Google Cloud, GPT, Hugging Face, JAX, Keras, Kubernetes, Ollama, Vertex AI

Neel Nanda, Tom Lieberum, Ludovic Peran, Kathleen Kenealy

6 min read

Has Summary

Google

Intermediate

Fine-tuning Gemma 2 with Keras - and an update from Hugging Face

The article discusses the release of the Gemma 2 model with 27 billion parameters, highlighting its capabilities in Keras and integration with JAX for efficient model training.

Fine-tuning, Hugging Face, JAX, Keras, PyTorch, TensorFlow, Transformer, Transformers

Martin Görner

5 min read

Includes Code

Has Summary

Google

Intermediate

Model Explorer: Simplifying ML models for Edge devices

Model Explorer is a powerful graph visualization tool designed to simplify the development and optimization of machine learning models for edge devices.

Chi, Gemini, JAX, PyTorch, TensorFlow, torchvision

Kristen Wright, Eric Yang

6 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Build a Zero-Copy AI Sensor Processing Pipeline with OpenCV in NVIDIA Holoscan SDK

The article discusses how to build a zero-copy AI sensor processing pipeline using OpenCV within the NVIDIA Holoscan SDK.

Computer Vision, Deep Learning, JAX, Numba, NumPy, OpenCV, Python, PyTorch, TensorFlow

Meiran Peng

7 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Accelerating Transformers with NVIDIA cuDNN 9

The article discusses the enhancements made in NVIDIA's cuDNN 9 library, focusing on the acceleration of Transformers through the implementation of Scaled Dot Product Attention (SDPA).

JAX, Python, PyTorch, TensorFlow, Transformer, Transformers

Matthew Nicely

11 min read

Includes Code

Has Summary

Google

Intermediate

Google I/O 2024 recap: Making AI accessible and helpful for every developer

The article recaps the Google I/O 2024 event, highlighting advancements in AI technologies aimed at making AI accessible for developers.

Caching, Dart, Firebase, Gemini, Generative AI, Google Cloud, JAX, Keras, Kotlin, Ollama, PostgreSQL, PyTorch, TensorFlow, WebAssembly

Jeanine Banks

8 min read

Has Summary

Google

Intermediate

Introducing PaliGemma, Gemma 2, and an Upgraded Responsible AI Toolkit

The article introduces PaliGemma, an open vision-language model, along with Gemma 2, the next generation of the Gemma models, and updates to the Responsible AI Toolkit.

Gemini, Generative AI, Google Cloud, Hugging Face, JAX, Keras, Transformers, Vertex AI

Tris Warkentin, Xiaohua Zhai, Ludovic Peran

4 min read

Has Summary

Google

Intermediate

Publish your Keras models on Kaggle and Hugging Face

This article discusses how to publish Keras models on Kaggle and Hugging Face, highlighting the ease of sharing fine-tuned models with the community.

Diffusers, Hugging Face, JAX, Keras, Large Language Models, PyTorch, TensorFlow, Transformers

Martin Görner

4 min read

Includes Code