# TensorFlow Programming Tutorials & Engineering Articles

606 TensorFlow tutorials, guides, and engineering insights from NVIDIA, Google, Uber, and more

TensorFlow Articles & Tutorials

Advanced

LiteRT: The Universal Framework for On-Device AI

LiteRT has evolved from its TensorFlow Lite foundation into a universal on-device AI inference framework, now offering production-ready GPU acceleration across six platforms and streamlined NPU int...

Gemini, Hugging Face, JAX, PyTorch, TensorFlow

Lu Wang, Chintan Parikh, Jingjiang Li, Terry Heo

9 min read

Includes Code

Has Summary

--

Intermediate

Tangle: An open-source ML experimentation platform built at Shopify scale

Shopify open-sources Tangle, an ML experimentation platform built to solve six common failure modes in machine learning development.

Docker, Java, JavaScript, Ruby, Rust, Shell, SQLite, TensorFlow, XGBoost, YAML

Shopify Engineering

12 min read

Has Summary

--

Advanced

Introducing Metrax: performant, efficient, and robust model evaluation metrics in JAX

Metrax is a high-performance library designed for efficient and robust model evaluation metrics in JAX, addressing the need for standardized metrics during the migration from TensorFlow.

Flax, JAX, TensorFlow

Yufeng Guo, Jiwon Shin, Jeff Carpenter

5 min read

Includes Code

Has Summary

--

Advanced

A Decade of AI Platform at Pinterest

The article reflects on a decade of AI platform development at Pinterest, detailing the evolution from fragmented machine learning stacks to a unified AI platform that supports various models.

AutoML, Docker, Embedding, Generative AI, Java, Kubernetes, LightGBM, PySpark, Python, PyTorch, Seed, SQL, TensorFlow, Thrift, Transformer

Pinterest Engineering

22 min read

Has Summary

--

Advanced

Enabling Deep Model Explainability with Integrated Gradients at Uber

This article discusses how Uber has integrated explainability into its machine learning platform, Michelangelo, using Integrated Gradients (IG) to provide interpretable attributions for deep learni...

Embedding, Keras, LIME, Machine Learning, PyTorch, SHAP, TensorFlow, XGBoost, YAML

Hugh Chen, Eric Wang, Gaoyuan Huang, Howard Yu, Jia Li, Sally Lee

14 min read

Has Summary

--

Intermediate

Introducing Coral NPU: A full-stack platform for Edge AI

The article introduces Coral NPU, a full-stack, open-source platform designed to enhance Edge AI capabilities on low-power devices.

Generative AI, JAX, PyTorch, TensorFlow

Billy Rutledge

8 min read

Has Summary

--

Advanced

Building High-Performance Data Pipelines with Grain and ArrayRecord

The article discusses building high-performance data pipelines using Grain, a data loading library for JAX, and ArrayRecord, an efficient file format.

Apache, Google Cloud, Google Cloud Storage, JAX, Shell, TensorFlow

Jiyang Kang, Shivaji Dutta, Ihor Indyk, Felix Chern

10 min read

Includes Code

Has Summary

--

Intermediate

Why CVEs Belong in Frameworks and Apps, Not AI Models

The article discusses the relevance of the Common Vulnerabilities and Exposures (CVE) system in relation to AI models, arguing that CVEs should be focused on the frameworks and applications that ut...

Rich Harang

7 min read

Has Summary

--

Intermediate

How we built AI face cropping for Images

The article discusses the development of AI face cropping technology by Cloudflare, which automatically crops images around detected faces.

PyTorch, Rust, TensorFlow, YOLO

Deanna Lam

14 min read

Includes Code

Has Summary

--

Advanced

Train a GPT2 model with JAX on TPU for free

This article provides a comprehensive guide on how to train a GPT-2 model using JAX on TPU, highlighting the ease of leveraging Google TPUs for free.

Flax, GPT, JAX, Large Language Models, Multi-Head Attention, PyTorch, TensorFlow

Wei Wei

8 min read

Includes Code

Has Summary

--

Intermediate

Driving AI-Powered Robotics Development with NVIDIA Isaac for Healthcare

The article discusses the impending shortage of healthcare workers and how AI-enabled robotic systems, powered by NVIDIA Isaac for Healthcare, can address these challenges.

PyTorch, TensorFlow

Ansley Dunn

6 min read

Includes Code

Has Summary

--

Advanced

Delivering the Missing Building Blocks for NVIDIA CUDA Kernel Fusion in Python

The article discusses the introduction of cuda-cccl, a Python library that provides high-level building blocks for NVIDIA CUDA kernel fusion, enabling developers to write efficient algorithms witho...

Less, Python, PyTorch, TensorFlow, XGBoost

Ashwin Srinath

5 min read

Includes Code

Has Summary

--

Intermediate

Using KerasHub for easy end-to-end machine learning workflows with Hugging Face

The article discusses how to use KerasHub for loading model weights from SafeTensors into Keras, enabling flexible end-to-end machine learning workflows across different frameworks like JAX, PyTorc...

Hugging Face, JAX, Keras, Mistral, PyTorch, TensorFlow

Yufeng Guo, Divyashree Sreepathihalli, Monica Song

8 min read

Includes Code

Has Summary

--

Intermediate

The Future of React Native Graphics: WebGPU, Skia, and Beyond

The article discusses the advancements in React Native graphics through the integration of WebGPU and Skia, highlighting how these technologies enhance performance and enable new capabilities for d...

Fiber, JavaScript, Machine Learning, Node.js, React, TensorFlow

William Candillon

8 min read

Has Summary

--

Advanced

Build and train a recommender system in 10 minutes using Keras and JAX

The article introduces Keras Recommenders, a new library designed to simplify the creation of state-of-the-art recommendation systems using Keras with JAX, TensorFlow, or PyTorch.

Embedding, GRU, JAX, Keras, PyTorch, TensorFlow

Yufeng Guo, Monica Song

3 min read

Includes Code

Has Summary

--

Advanced

Accelerate Deep Learning and LLM Inference with Apache Spark in the Cloud

The article discusses how to accelerate Deep Learning (DL) and Large Language Model (LLM) inference using Apache Spark in cloud environments.

Apache, Apache Spark, AWS, Azure, Deep Learning, Docker, JSON, NumPy, Python, PyTorch, Semantic Search, TensorFlow, Transformers

Rishi Chandra

9 min read

Includes Code

Has Summary

--

Advanced

Optimizing Transformer-Based Diffusion Models for Video Generation with NVIDIA TensorRT

The article discusses optimizing transformer-based diffusion models for video generation using NVIDIA TensorRT, highlighting significant reductions in latency and total cost of ownership (TCO) achi...

AWS, Deep Learning, Diffusion Models, PyTorch, TensorFlow, Transformer

Maximilian Müller

7 min read

Has Summary

--

Intermediate

AI Advances Parkinson’s Detection Using Standard MRI Scans

A new AI-powered tool developed by researchers at the University of Florida and medical centers aims to improve the diagnosis of Parkinson's disease using standard MRI scans.

Michelle Horton

3 min read

Has Summary

--

Advanced

NVIDIA Open Sources Run:ai Scheduler to Foster Community Collaboration

NVIDIA has open-sourced the KAI Scheduler, a Kubernetes-native GPU scheduling solution under the Apache 2.0 license, originally developed for the Run:ai platform.

Apache, Kubernetes, PyTorch, TensorFlow

Ronen Dar

9 min read

Has Summary

--

Advanced

NVIDIA Dynamo, A Low-Latency Distributed Inference Framework for Scaling Reasoning AI Models

NVIDIA Dynamo is a newly released low-latency distributed inference framework designed to enhance the deployment of generative AI and reasoning models in large-scale environments.

Oracle, PyTorch, TensorFlow

Amr Elmeleegy

12 min read

Has Summary

--

Advanced

NVIDIA Blackwell Delivers World-Record DeepSeek-R1 Inference Performance

NVIDIA has announced world-record inference performance for the DeepSeek-R1 model using the Blackwell architecture, achieving over 250 tokens per second per user and a maximum throughput of over 30...

CLIP, Hugging Face, JAX, Ollama, Python, PyTorch, T5, TensorFlow, Transformer

Ashraf Eassa

13 min read

Has Summary

--

Advanced

Spinal Health Diagnostics Gets Deep Learning Automation

The article discusses an advanced deep-learning model designed to automate X-ray analysis for spinal health diagnostics, enhancing speed and accuracy in assessing conditions like scoliosis and kyph...

Deep Learning, TensorFlow, U-Net

Michelle Horton

4 min read

Has Summary

--

Advanced

Five Takeaways from NVIDIA 6G Developer Day 2024

The article discusses key insights from the NVIDIA 6G Developer Day 2024, highlighting the integration of AI into 6G infrastructure and the significance of AI-RAN.

Python, PyTorch, TensorFlow

Emeka Obiodu

10 min read

Has Summary

--

Advanced

Liger-Kernel: Empowering an open source ecosystem of Triton Kernels for Efficient LLM Training

The article discusses Liger-Kernel, an open-source library designed to enhance GPU efficiency for training large language models (LLMs).

Embedding, Hugging Face, Kubernetes, LLaMA, Python, PyTorch, TensorFlow

Pin-Lun (Byron) Hsu

10 min read

Includes Code

Has Summary

--

Intermediate

Farewell and thank you for the continued partnership, Francois Chollet!

The article announces that Francois Chollet, the creator of Keras, is leaving Google to pursue new opportunities.

JAX, Keras, PyTorch, TensorFlow

Bill Jia, Xavi Amatriain

2 min read

Has Summary

--

Advanced

Web AI Summit 2024 Recap: Client-Side AI for Developers

The Web AI Summit 2024, hosted by Google on October 18, 2024, focused on client-side AI for developers, showcasing how machine learning models can operate offline in web browsers.

Hugging Face, JavaScript, JSON, LangChain, Machine Learning, TensorFlow, Transformers, WebAssembly

Jason Mayes

10 min read

Has Summary

--

Intermediate

Enhanced Security and Streamlined Deployment of AI Agents with NVIDIA AI Enterprise

The article discusses the advancements in AI agents facilitated by NVIDIA AI Enterprise, emphasizing enhanced security, streamlined deployment, and management of AI pipelines.

DGL, Embedding, Google Cloud, Helm, Kubernetes, Mistral, Python, PyTorch, TensorFlow

Charu Chaubal

5 min read

Has Summary

--

Advanced

Scaling LLMs with NVIDIA Triton and NVIDIA TensorRT-LLM Using Kubernetes

The article discusses how to scale Large Language Models (LLMs) using NVIDIA Triton and NVIDIA TensorRT-LLM in a Kubernetes environment.

AWS, Azure, Docker, Generative AI, GPT, Grafana, Helm, Hugging Face, Kubernetes, NGINX, Prometheus, Python, PyTorch, TensorFlow, Traefik

Maggie Zhang

16 min read

Includes Code

Has Summary

--

Advanced

Treating Brain Disease with Brain-Machine Interactive Neuromodulation and NVIDIA Jetson

The article discusses the Brain-Machine Interactive Neuromodulation Research Tool (BMINT), which utilizes closed-loop neuromodulation techniques to treat brain diseases like epilepsy and Parkinson'...

Shouyan Wang

4 min read

Has Summary

--

Advanced

Ray Batch Inference at Pinterest (Part 3)

This article discusses the implementation of Ray Batch Inference at Pinterest, highlighting its advantages over previous solutions like Apache Spark and Torch Dataloader.

Apache, Apache Spark, AWS, Hugging Face, Large Language Models, LLaMA, PyTorch, Ray Tune, TensorFlow

Pinterest Engineering

11 min read

Includes Code

Has Summary

--

Advanced

Real-Time Neural Receivers Drive AI-RAN Innovation

The article discusses the development and deployment of real-time neural receivers (NRX) in 5G New Radio (5G NR) systems, highlighting their potential to enhance wireless communication through AI-d...

Artificial Intelligence, Deep Learning, TensorFlow

Sebastian Cammerer

10 min read

Has Summary

--

Intermediate

TensorFlow Lite is now LiteRT

LiteRT, formerly known as TensorFlow Lite, is a high-performance runtime for on-device AI that now supports models from multiple frameworks including PyTorch, JAX, and Keras.

JAX, Keras, PyTorch, TensorFlow

Google AI Edge team

4 min read

Has Summary

--

Intermediate

New Foundational Models and Training Capabilities with NVIDIA TAO 5.5

The article discusses the release of NVIDIA TAO 5.5, a framework that simplifies AI model development and deployment.

AutoML, BERT, CLIP, Modal, PyTorch, ResNet, TensorFlow, Transformer, Transformers

Monika Jhuria

12 min read

Includes Code

Has Summary

--

Intermediate

NVIDIA Triton Inference Server Achieves Outstanding Performance in MLPerf Inference 4.1 Benchmarks

The article discusses the impressive performance of the NVIDIA Triton Inference Server in the MLPerf Inference v4.1 benchmarks.

AWS, AWS SageMaker, Azure, Deep Learning, gRPC, Kubernetes, Oracle, Python, PyTorch, SNAP, TensorFlow, Vertex AI, YAML

Amr Elmeleegy

8 min read

Includes Code

Has Summary

--

Intermediate

Streamlining LLM Inference at the Edge with TFLite

The article discusses optimizing Large Language Model (LLM) inference at the edge using TensorFlow Lite (TFLite) and XNNPack.

Stable Diffusion, TensorFlow

Quentin Khan, Linkun Chen

6 min read

Includes Code

Has Summary

--

Advanced

Building Spatial Intelligence from Real-World 3D Data Using Deep-Learning Framework fVDB

The article discusses NVIDIA's fVDB, a deep-learning framework designed to build spatial intelligence from real-world 3D data.

Generative AI, PyTorch, TensorFlow, Warp

Ken Museth

6 min read

Has Summary

--

Intermediate

Fine-tuning Gemma 2 with Keras - and an update from Hugging Face

The article discusses the release of the Gemma 2 model with 27 billion parameters, highlighting its capabilities in Keras and integration with JAX for efficient model training.

Fine-tuning, Hugging Face, JAX, Keras, PyTorch, TensorFlow, Transformer, Transformers

Martin Görner

5 min read

Includes Code

Has Summary

--

Intermediate

Model Explorer: Simplifying ML models for Edge devices

Model Explorer is a powerful graph visualization tool designed to simplify the development and optimization of machine learning models for edge devices.

Chi, Gemini, JAX, PyTorch, TensorFlow, torchvision

Kristen Wright, Eric Yang

6 min read

Includes Code

Has Summary

--

Advanced

Real-Time Vision AI From Digital Twins to Cloud-Native Deployment with NVIDIA Metropolis

The article discusses NVIDIA Metropolis, a platform for real-time vision AI that streamlines deployment through microservices and workflows.

AWS, Azure, Docker, Elasticsearch, Microservices, PyTorch, ResNet, TensorFlow, Transformer, WebRTC

Monika Jhuria

11 min read

Has Summary

--

Intermediate

Build a Zero-Copy AI Sensor Processing Pipeline with OpenCV in NVIDIA Holoscan SDK

The article discusses how to build a zero-copy AI sensor processing pipeline using OpenCV within the NVIDIA Holoscan SDK.

Computer Vision, Deep Learning, JAX, Numba, NumPy, OpenCV, Python, PyTorch, TensorFlow

Meiran Peng

7 min read

Includes Code

Has Summary

--

Advanced

AI Edge Torch Generative API for Custom LLMs on Device

The article introduces the AI Edge Torch Generative API, designed to enable developers to create high-performance LLMs in PyTorch for deployment on edge devices using the TensorFlow Lite runtime.

Embedding, PyTorch, TensorFlow

Cormac Brick, Haoliang Zhang

10 min read

Includes Code

Has Summary

--

Intermediate

Accelerating Transformers with NVIDIA cuDNN 9

The article discusses the enhancements made in NVIDIA's cuDNN 9 library, focusing on the acceleration of Transformers through the implementation of Scaled Dot Product Attention (SDPA).

JAX, Python, PyTorch, TensorFlow, Transformer, Transformers

Matthew Nicely

11 min read

Includes Code

Has Summary

--

Intermediate

Enhancing the Apparel Shopping Experience with AI, Emoji-Aware OCR, and Snapchat’s Screenshop

The article discusses how Snap's ML engineering team enhanced the apparel shopping experience using AI, specifically through the Screenshop service integrated into Snapchat.

Docker, Embedding, Kubernetes, Prometheus, Python, PyTorch, TensorFlow

Amr Elmeleegy

7 min read

Has Summary

--

Beginner

AI Edge Torch: High Performance Inference of PyTorch Models on Mobile Devices

Google AI Edge Torch provides a seamless integration from PyTorch to TensorFlow Lite (TFLite), enhancing model coverage and CPU performance for mobile devices.

Keras, PyTorch, TensorFlow, torchvision

Cormac Brick, Advait Jain, Haoliang Zhang

5 min read

Includes Code

Has Summary

--

Intermediate

Google I/O 2024 recap: Making AI accessible and helpful for every developer

The article recaps the Google I/O 2024 event, highlighting advancements in AI technologies aimed at making AI accessible for developers.

Caching, Dart, Firebase, Gemini, Generative AI, Google Cloud, JAX, Keras, Kotlin, Ollama, PostgreSQL, PyTorch, TensorFlow, WebAssembly

Jeanine Banks

8 min read

Has Summary

--

Advanced

Tips for Building a RAG Pipeline with NVIDIA AI LangChain AI Endpoints

The article provides a comprehensive guide on building a Retrieval-Augmented Generation (RAG) pipeline using NVIDIA AI LangChain AI Endpoints.

Embedding, Generative AI, gRPC, HTML, Java, LangChain, Python, PyTorch, Retrieval Augmented Generation, TensorFlow

Amit Bleiweiss

13 min read

Includes Code

Has Summary

--

Intermediate

Publish your Keras models on Kaggle and Hugging Face

This article discusses how to publish Keras models on Kaggle and Hugging Face, highlighting the ease of sharing fine-tuned models with the community.

Diffusers, Hugging Face, JAX, Keras, Large Language Models, PyTorch, TensorFlow, Transformers

Martin Görner

4 min read

Includes Code

Has Summary

--

Advanced

From Predictive to Generative – How Michelangelo Accelerates Uber’s AI Journey

The article discusses Uber's evolution in machine learning (ML) through its centralized platform, Michelangelo, highlighting its transition from predictive to generative AI.

Apache, Apache Spark, AutoML, Deep Learning, Docker, Generative AI, Hugging Face, Keras, Kubernetes, PaLM, Prompt Engineering, PyTorch, TensorFlow, XGBoost

Kai Wang, Min Cai, Joseph Wang, Eric Chen

28 min read

Has Summary

--

Intermediate

Powering Mission-Critical AI at the Edge with NVIDIA AI Enterprise IGX

The article discusses NVIDIA AI Enterprise IGX, a software solution designed for mission-critical AI applications at the edge.

PyTorch, TensorFlow

Suhas Hariharapura Sheshadri

5 min read

Has Summary

--

Intermediate

Accelerating the Future of Wireless Communication with the NVIDIA 6G Developer Program

The article discusses the NVIDIA 6G Developer Program, which aims to accelerate the development of 6G technology by providing access to AI/ML tools, simulation environments, and software-defined ne...

Python, PyTorch, TensorFlow

Kuntal Chowdhury

9 min read

Has Summary

--