# Transformer Programming Tutorials & Engineering Articles

271 Transformer tutorials, guides, and engineering insights from NVIDIA, OpenAI, Uber, and more

Transformer Articles & Tutorials

NVIDIA
Advanced
The article discusses how NVIDIA's hardware-software co-design significantly enhanced the inference performance of Sarvam AI's Sovereign 30B model, achieving a 4x speedup on NVIDIA Blackwell archit...
Utkarsh Uppal
14 min read
Has Summary
--
NVIDIA
Advanced
The article discusses how NVFP4, a low-precision floating-point format developed by NVIDIA, enhances AI training and inference performance.
Ashraf Eassa
6 min read
Has Summary
--
Pinterest
Intermediate
The article discusses how Pinterest enhances its ad candidate generation process using behavioral sequence modeling.
Pinterest Engineering
10 min read
Has Summary
--
NVIDIA
Intermediate
This article introduces Dynamic Context Parallelism (Dynamic-CP), a scheduling approach in NVIDIA Megatron Core designed to optimize training for variable-length sequences in large-scale models.
Kunlun Li
11 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
The article discusses the limitations of current large language models (LLMs) in handling long contexts and introduces Test-Time Training with an end-to-end formulation (TTT-E2E) as a solution.
--
NVIDIA
Advanced
The article discusses the NVIDIA Rubin platform, which introduces six new chips designed to create a powerful AI supercomputer.
--
NVIDIA
Advanced
This article provides a comprehensive tutorial on building a voice agent using NVIDIA's Nemotron models, focusing on retrieval-augmented generation (RAG) and safety guardrails.
Chris Alexiuk
8 min read
Includes Code
Has Summary
--
NVIDIA
Intermediate
The article discusses NVIDIA Nemotron 3, a family of open models designed for agentic AI systems, emphasizing its efficiency and accuracy through innovative architectures and techniques.
--
OpenAI
Advanced
The article discusses Mirakl's vision for agentic commerce, emphasizing the integration of AI across the company to enhance workflows and product offerings.
OpenAI Team
4 min read
Has Summary
--
NVIDIA
Advanced
The article discusses model quantization, a technique essential for deploying complex AI models on resource-constrained hardware.
Ruixiang Wang
11 min read
Has Summary
--
NVIDIA
Intermediate
The NVIDIA Blackwell architecture has achieved the fastest training times across all MLPerf Training v5.1 benchmarks, showcasing significant advancements in AI training performance.
--
NVIDIA
Advanced
The article discusses how NVIDIA's NeMo Automodel simplifies the training of large-scale mixture-of-experts (MoE) models in PyTorch, making it accessible to a broader audience.
Hemil Desai
7 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
The article discusses how to scale biology transformer models using PyTorch and NVIDIA BioNeMo Recipes, focusing on advanced parallel computing techniques and the integration of the NVIDIA Transfor...
Kyle Tretina
6 min read
Includes Code
Has Summary
--
Pinterest
Advanced
The article reflects on a decade of AI platform development at Pinterest, detailing the evolution from fragmented machine learning stacks to a unified AI platform that supports various models.
--
NVIDIA
Advanced
The article introduces CodonFM, a new state-of-the-art RNA foundation model developed by NVIDIA as part of the Clara open model family.
Kyle Gion
10 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
The article discusses the optimization of large language models (LLMs) through pruning and knowledge distillation using NVIDIA TensorRT Model Optimizer.
Max Xu
10 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
The article discusses how id Software integrated RTX neural rendering and path tracing into DOOM: The Dark Ages, highlighting the advancements in real-time graphics and the technical challenges ove...
Phillip Singh
6 min read
Has Summary
--
Google
Intermediate
The article provides an in-depth exploration of the EmbeddingGemma architecture, detailing its origins, embedding generation process, and the comprehensive training methodology.
Henrique Schechter Vera, Juyeong Ji, Sahil Dua
7 min read
Includes Code
Has Summary
--
NVIDIA
Intermediate
The article discusses three neural innovations from NVIDIA Research that are enhancing robot learning capabilities, specifically focusing on bridging the gap between controlled simulations and real...
Rishabh Chadha
8 min read
Has Summary
--
Google
Advanced
The article discusses the deployment of on-device generative AI (GenAI) using LiteRT-LM in Chrome, Chromebook Plus, and Pixel Watch.
Yu-hui Chen, Ram Iyengar
9 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
This article discusses the advantages of using FP8 precision for faster training throughput in large-scale deep learning models with NVIDIA NeMo.
Karin Sevegnani
11 min read
Has Summary
--
NVIDIA
Advanced
The article discusses ReaSyn, a generative model developed by NVIDIA to predict molecular synthesis pathways, addressing the challenges of synthesizability in molecular design.
Seul Lee
6 min read
Has Summary
--
NVIDIA
Beginner
The article discusses how the NVIDIA HGX B200 significantly reduces embodied carbon emissions intensity compared to its predecessor, the HGX H100, while enhancing performance and energy efficiency.
Zoe Kessler
4 min read
Has Summary
--
NVIDIA
Advanced
The article introduces speculative decoding as a technique to reduce latency in AI inference, particularly for large language models (LLMs).
Jamie Li
10 min read
Includes Code
Has Summary
--
NVIDIA
Intermediate
The article discusses the release of two new open-source models, Qwen3-Next 80B-A3B-Thinking and Qwen3-Next 80B-A3B-Instruct, which utilize a hybrid Mixture of Experts (MoE) architecture to enhance...
Anu Srivastava
4 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
The article discusses fine-tuning the gpt-oss model for improved accuracy and performance through Quantization Aware Training (QAT) and Supervised Fine-Tuning (SFT).
Eduardo Alvarez
7 min read
Includes Code
Has Summary
--
NVIDIA
Intermediate
The article discusses NVIDIA's NVFP4, a new 4-bit precision format for training large language models (LLMs) that enhances efficiency and scalability while maintaining accuracy.
Kirthi Devleker
9 min read
Has Summary
--
NVIDIA
Advanced
The article introduces the NVIDIA Jetson Thor, a powerful platform designed for physical AI and humanoid robotics.
Shashank Maheshwari
13 min read
Has Summary
--
NVIDIA
Advanced
The article discusses the NVIDIA Blackwell Ultra GPU, a significant advancement in the Blackwell architecture designed to enhance AI training and reasoning capabilities.
Kyle Aubrey
13 min read
Has Summary
--
NVIDIA
Advanced
The article discusses how NVIDIA's hardware innovations, particularly the Blackwell architecture and NVFP4 precision, along with their open source contributions, are driving advancements in AI.
George Chellapa
8 min read
Has Summary
--
Uber
Advanced
This article discusses the development and implementation of forecasting models aimed at improving driver availability at airports, which are critical to Uber's ridesharing ecosystem.
Bob Zheng, Dhruv Ghulati, Manoj Panikkar, Michael (Yichuan) Cai
15 min read
Has Summary
--
NVIDIA
Intermediate
The article discusses the evolving landscape of AI security, focusing on how hackers exploit the problem-solving instincts of multimodal AI systems through cognitive challenges.
Daniel Teixeira
9 min read
Includes Code
Has Summary
--
NVIDIA
Intermediate
NVIDIA has optimized OpenAI's gpt-oss models for accelerated inference performance on the NVIDIA GB200 NVL72 system, achieving up to 1.5 million tokens per second (TPS).
Anu Srivastava
6 min read
Includes Code
Has Summary
--
OpenAI
Advanced
The article introduces gpt-oss, two state-of-the-art open-weight language models, gpt-oss-120b and gpt-oss-20b, which excel in reasoning tasks and are optimized for deployment on consumer hardware.
--
NVIDIA
Intermediate
The article discusses the advancements in multilingual human-like speech synthesis and voice cloning using NVIDIA Riva TTS.
Maggie Zhang
9 min read
Has Summary
--
Google
Intermediate
The article introduces T5Gemma, a new collection of encoder-decoder models derived from pretrained decoder-only models.
Biao Zhang, Paul Suganthan, Ben Hora
5 min read
Has Summary
--
NVIDIA
Advanced
The article discusses the optimization of the FLUX.1 Kontext model for image editing through low-precision quantization techniques.
Sandro Cavallari
9 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
This article discusses FP8 scaling strategies, including per-tensor and per-block scaling, essential for maintaining numerical stability and accuracy during low-precision training.
Karin Sevegnani
9 min read
Includes Code
Has Summary
--
Google
Intermediate
The article introduces Gemma 3n, a mobile-first architecture designed for on-device AI, highlighting its multimodal capabilities and architectural innovations.
Omar Sanseviero, Ian Ballantyne
9 min read
Includes Code
Has Summary
--
NVIDIA
Intermediate
The article discusses advancements in AI-based 3D robot perception and mapping, focusing on NVIDIA's research efforts to create a unified 3D perception stack.
Raffaello Bonghi
12 min read
Has Summary
--
NVIDIA
Advanced
The article discusses NVIDIA's advancements in molecular AI modeling through the introduction of cuEquivariance and NIM microservices, which enhance the speed and efficiency of training and inferen...
Neha Tadimeti
8 min read
Has Summary
--
NVIDIA
Advanced
The article discusses the advancements in AI autonomy through NVIDIA's Nemotron open reasoning models, which enhance AI agents' decision-making capabilities in complex environments.
Nirmal Kumar Juluru
6 min read
Has Summary
--
Pinterest
Intermediate
This article discusses how Pinterest enhances its recommendation system through the TransActV2 model, which leverages over 16,000 lifelong user actions to improve personalization.
Pinterest Engineering
8 min read
Has Summary
--
NVIDIA
Advanced
The article introduces the Nemotron-H Reasoning Model Family developed by NVIDIA, which addresses the challenges of reasoning-intensive tasks in large language models by significantly improving thr...
Adi Renduchintala
7 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
The article discusses the performance improvements delivered by NVIDIA's Blackwell architecture in MLPerf Training v5.0, showcasing up to 2...
Sukru Burc Eryilmaz
12 min read
Has Summary
--
NVIDIA
Intermediate
The article discusses the advancements in AI training through the introduction of floating-point 8 (FP8) precision, emphasizing its benefits in computational efficiency and memory usage.
Karin Sevegnani
10 min read
Has Summary
--
NVIDIA
Advanced
The article discusses advancements in large language models (LLMs), focusing on the importance of extended context lengths for processing and generating text.
Amit Bleiweiss
7 min read
Has Summary
--
NVIDIA
Advanced
This article discusses advanced optimization strategies for training large language models (LLMs) on the NVIDIA Grace Hopper Superchip.
Karin Sevegnani
9 min read
Includes Code
Has Summary
--
LinkedIn
Advanced
The article discusses JUDE, LinkedIn's platform for generating high-quality embeddings for job recommendations using fine-tuned Large Language Models (LLMs).
--
Netflix
Intermediate
FM-Intent is a novel recommendation model developed by Netflix that enhances user session intent prediction through hierarchical multi-task learning.
Netflix Technology Blog
9 min read
Has Summary
--