T5 Programming Tutorials & Engineering Articles

Measuring AI’s capability to accelerate biological research in the wet lab

Advanced

The article discusses how AI, specifically GPT-5, can enhance biological research in wet labs by optimizing molecular cloning protocols, achieving a 79-fold increase in efficiency.

Assembly, GPT, Rails, T5

OpenAI Team

16 min read

Has Summary

Intermediate

LLM-Powered Relevance Assessment for Pinterest Search

The article discusses the implementation of LLM-powered relevance assessment at Pinterest Search, focusing on how fine-tuned large language models (LLMs) can enhance search relevance measurement wh...

BERT, BLIP, Machine Learning, RoBERTa, T5

Pinterest Engineering

9 min read

Has Summary

Google

Intermediate

T5Gemma: A new collection of encoder-decoder Gemma models

The article introduces T5Gemma, a new collection of encoder-decoder models derived from pretrained decoder-only models.

Hugging Face, T5, Transformer, Vertex AI

Biao Zhang, Paul Suganthan, Ben Hora

5 min read

Has Summary

Optimizing FLUX.1 Kontext for Image Editing with Low-Precision Quantization

Advanced

The article discusses the optimization of the FLUX.1 Kontext model for image editing through low-precision quantization techniques.

CLIP, T5, Transformer

Sandro Cavallari

9 min read

Includes Code

Has Summary

NVIDIA TensorRT for RTX Introduces an Optimized Inference AI Library on Windows 11

Advanced

NVIDIA TensorRT for RTX is a newly announced optimized inference AI library designed for Windows 11, enhancing performance for AI applications on NVIDIA RTX GPUs.

PyTorch, ResNet, T5

Gunjan Mehta

8 min read

Has Summary

NVIDIA TensorRT Unlocks FP4 Image Generation for NVIDIA Blackwell GeForce RTX 50 Series GPUs

Intermediate

The article discusses the advancements brought by NVIDIA's TensorRT in enabling FP4 image generation for the Blackwell GeForce RTX 50 Series GPUs.

CLIP, PyTorch, T5, Transformer

Gunjan Mehta

10 min read

Has Summary

Advancing Invoice Document Processing at Uber using GenAI

Intermediate

The article discusses how Uber has advanced its invoice document processing by implementing a GenAI-powered automation system.

GPT, GPT-4, T5

Rohit Subudhi, Rakesh Vagvala, Sushil Kumar Jain Devichand, Indrani Bose, Balaram Baral

13 min read

Has Summary

Advanced

Improving Pinterest Search Relevance Using Large Language Models

The article discusses the implementation of a Large Language Model (LLM)-based relevance system for Pinterest Search, detailing its technical design, model architecture, and the results from both o...

BERT, BLIP, Hugging Face, Large Language Models, Machine Learning, RoBERTa, Supervised Learning, T5

Pinterest Engineering

7 min read

Has Summary

NVIDIA Blackwell Delivers World-Record DeepSeek-R1 Inference Performance

Advanced

NVIDIA has announced world-record inference performance for the DeepSeek-R1 model using the Blackwell architecture, achieving over 250 tokens per second per user and a maximum throughput of over 30...

CLIP, Hugging Face, JAX, Ollama, Python, PyTorch, T5, TensorFlow, Transformer

Ashraf Eassa

13 min read

Has Summary

NVIDIA TensorRT-LLM Now Accelerates Encoder-Decoder Models with In-Flight Batching

Advanced

NVIDIA TensorRT-LLM has expanded its capabilities to accelerate encoder-decoder model architectures, enhancing inference performance for various generative AI applications on NVIDIA GPUs.

Prometheus, T5

Anjali Shah

4 min read

Has Summary

Google

Intermediate

Gemma explained: An overview of Gemma model family architectures

The article provides an overview of the Gemma model family architectures, detailing its lightweight, state-of-the-art open models derived from Gemini research.

BERT, Embedding, Gemini, GPT, Hugging Face, Keras, T5, Transformer, Transformers

Ju-yeong Ji, Ravin Kumar

9 min read

Includes Code

Has Summary

Train Generative AI Models More Efficiently with New NVIDIA Megatron-Core Functionalities

Advanced

The article discusses the new functionalities of NVIDIA Megatron-Core, an open-source library designed to enhance the efficiency of training generative AI models.

AWS, Azure, BERT, CLIP, Gemini, Generative AI, GPT, Hugging Face, Mistral, Python, PyTorch, T5

Erin Ho

10 min read

Includes Code

Has Summary

Addressing Hallucinations in Speech Synthesis LLMs with the NVIDIA NeMo T5-TTS Model

Intermediate

The article discusses the NVIDIA NeMo T5-TTS model, a significant advancement in text-to-speech (TTS) technology that addresses hallucinations in speech synthesis using large language models (LLMs).

Subhankar Ghosh

4 min read

Has Summary

DragonCrawl: Generative AI for High-Quality Mobile Testing

Advanced

The article discusses DragonCrawl, a generative AI system developed by Uber to enhance mobile testing by mimicking human-like interactions with applications.

Embedding, Generative AI, GPT, Large Language Models, Machine Learning, RoBERTa, T5, Transformer

Juan Marcano, Mengdie Zhang, Ali Zamani, Anam Hira

18 min read

Has Summary

Train Generative AI Models for Drug Discovery with NVIDIA BioNeMo Framework

Intermediate

The NVIDIA BioNeMo Framework is a newly released platform that enables researchers to build and deploy generative AI models for drug discovery.

BERT, Generative AI, T5

Harry Clifford

6 min read

Has Summary

Streamline Generative AI Development with NVIDIA NeMo on GPU-Accelerated Google Cloud

Advanced

The article discusses how NVIDIA NeMo can streamline the development of generative AI applications on GPU-accelerated Google Cloud.

BERT, Dask, Fine-tuning, Generative AI, Google Cloud, GPT, Hugging Face, Python, Redis, Reinforcement Learning, T5, Transformer

Chintan Patel

9 min read

Has Summary

Announcing Cadence 1.0: The Powerful Workflow Platform Built for Scale and Reliability

Advanced

Cadence 1.0 is a powerful open-source workflow orchestration platform designed for building and managing stateful services at scale.

Apache, Cassandra, Elasticsearch, Grafana, gRPC, Java, MySQL, Oracle, T5

Ender Demirkaya

10 min read

Has Summary

Efficiently Scale LLM Training Across a Large GPU Cluster with Alpa and Ray

Advanced

The article discusses how to efficiently scale large language model (LLM) training across a large GPU cluster using the open-source frameworks Alpa and Ray.

AWS, BERT, ChatGPT, DALL-E, Generative AI, GPT, JAX, Python, RoBERTa, Stable Diffusion, T5, TensorFlow

Jiao Dong

14 min read

Includes Code

Has Summary

Increasing Inference Acceleration of KoGPT with NVIDIA FasterTransformer

Intermediate

The article discusses the optimization of Kakao Brain's KoGPT large language model using NVIDIA FasterTransformer, highlighting the significant improvements in inference speed and performance.

BERT, GPT, PyTorch, T5, TensorFlow, Transformer, Transformers, V

Daemyung Jang

5 min read

Has Summary

NVIDIA Announces Generative AI Services for Language, Visual Content, and Biology Applications

Intermediate

NVIDIA has introduced generative AI services aimed at enhancing language, visual content, and biology applications.

BERT, CLIP, Deep Learning, Generative AI, GPT, Large Language Models, Natural Language Processing, RLHF, Stable Diffusion, T5

Annamalai Chockalingam

5 min read

Has Summary

Airbnb

Advanced

How AI Text Generation Models Are Reshaping Customer Support at Airbnb

The article discusses how Airbnb leverages AI text generation models to enhance customer support, focusing on their capabilities, benefits, and specific use cases like content recommendation, real-...

Artificial Intelligence, T5, Transformers, Unsupervised Learning

Gavin Li

12 min read

Has Summary

New on NGC: SDKs for Large Language Models, Digital Twins, Digital Biology, and More

Intermediate

The article discusses the latest SDKs available in the NGC catalog, focusing on tools for Large Language Models (LLMs), digital twins, and digital biology.

Azure, GPT, Large Language Models, Oracle, PyTorch, T5, TensorFlow, Transformer

Chintan Patel

5 min read

Has Summary

Solving AI Inference Challenges with NVIDIA Triton

Advanced

The article discusses the challenges of deploying AI models in production and how NVIDIA Triton Inference Server addresses these challenges.

AWS, BERT, GPT, Kubernetes, LightGBM, Python, PyTorch, scikit-learn, SHAP, T5, TensorFlow, Transformer, XGBoost

Shankar Chandrasekaran

11 min read

Includes Code

Has Summary

Simplifying Access to Large Language Models with the NVIDIA NeMo Framework and Services

Intermediate

The article discusses NVIDIA's efforts to simplify access to large language models (LLMs) through the NeMo framework and associated services, including NeMo LLM and BioNeMo.

Azure, GPT, Large Language Models, Oracle, T5

Annamalai Chockalingam

4 min read

Has Summary

Accelerated Inference for Large Transformer Models Using NVIDIA Triton Inference Server

Advanced

The article discusses the NVIDIA Triton Inference Server and its FasterTransformer library, which enables accelerated inference for large transformer models.

BERT, GPT, JSON, PyTorch, T5, TensorFlow, Transformer, Transformers

Denis Timonin

9 min read

Includes Code

Has Summary

Deploying GPT-J and T5 with NVIDIA Triton Inference Server

Advanced

This article provides a comprehensive guide on deploying large transformer models like GPT-J and T5 using NVIDIA's Triton Inference Server and FasterTransformer library.

BERT, Docker, GPT, Hugging Face, Neural Networks, Python, PyTorch, T5, TensorFlow, Transformer

Denis Timonin

15 min read

Includes Code

Has Summary

Build Speech AI in Multiple Languages and Train Large Language Models with the Latest from Riva and NeMo

Intermediate

The article discusses major updates to NVIDIA's Riva SDK for building speech AI applications and the NeMo framework for training large language models.

Azure, Large Language Models, Spring, T5

Siddharth Sharma

3 min read

Has Summary

Major Updates to NVIDIA AI Software Advancing Speech, Recommenders, Inference, and More Announced at NVIDIA

Intermediate

At GTC 2022, NVIDIA unveiled significant updates to its AI software suite, focusing on advancements in speech AI, recommenders, and inference optimization. The updates include the launch of Riva 2.

AWS, Azure, Deep Learning, Kubernetes, Large Language Models, T5

Siddharth Sharma

5 min read

Has Summary

NVIDIA Announces TensorRT 8.2 and Integrations with PyTorch and TensorFlow

Advanced

NVIDIA has released TensorRT 8.2, which includes optimizations for billion-parameter Natural Language Understanding (NLU) models like T5 and GPT-2, enabling real-time applications.

Deep Learning, GPT, Python, PyTorch, T5, TensorFlow

Jay Rodge

2 min read

Has Summary

Optimizing T5 and GPT-2 for Real-Time Inference with NVIDIA TensorRT

Advanced

This article discusses the optimization of T5 and GPT-2 models for real-time inference using NVIDIA TensorRT.

BERT, Docker, GPT, Hugging Face, Keras, MATLAB, PyTorch, T5, TensorFlow, Transfer Learning, Transformer, V

Vinh Nguyen

8 min read

Includes Code

Has Summary

ICYMI: New AI Tools and Technologies Announced at NVIDIA GTC Keynote

Intermediate

At NVIDIA GTC, new AI tools and technologies were announced, including NVIDIA Riva for speech applications, TensorRT 8.2 for deep learning inference, and NVIDIA Triton Inference Server 2.

Azure, Deep Learning, Google Cloud, GPT, Hugging Face, Kubernetes, Python, PyTorch, T5, TensorFlow, Transformer, Transformers

Siddharth Sharma

5 min read

Has Summary

TruthfulQA: Measuring how models mimic human falsehoods

Intermediate

The article discusses the TruthfulQA benchmark, which evaluates the truthfulness of language models in generating answers to questions.

GPT, T5

Stephanie Lin

2 min read

Has Summary

Applying Natural Language Processing Across the World’s Languages

Advanced

The article discusses the advancements and challenges in applying Natural Language Processing (NLP) across various languages, emphasizing the need for large-scale models and the engineering efforts...

BERT, Deep Learning, GPT, Natural Language Processing, Neural Networks, T5, Transformers, YAML

Adam Grzywaczewski

14 min read

Has Summary

Ludwig v0.3 Introduces Hyperparameter Optimization, Transformers and TensorFlow 2 support

Intermediate

Ludwig version 0.3 introduces significant enhancements, including hyperparameter optimization, support for Transformers, and integration with TensorFlow 2.

Apache, AutoML, BERT, Fiber, GPT, Hugging Face, JSON, Pandas, T5, TensorFlow, Transformer, Transformers

Kerri Brown, Piero Molino, Yaroslav Dudin

10 min read

Has Summary

Learning to summarize with human feedback

Advanced

The article discusses the application of reinforcement learning from human feedback to enhance the summarization capabilities of language models.

Fine-tuning, GPT, T5, Transformers

Nisan Stiennon

16 min read

Has Summary