How NVIDIA Uses Embedding

76 engineering articles about Embedding from NVIDIA's engineering team

Other NVIDIA Technologies

Python (740), PyTorch (566), Deep Learning (505), TensorFlow (444), Docker (292), Kubernetes (251)

Articles

NVIDIA

Advanced

Build with Kimi K2.5 Multimodal VLM Using NVIDIA GPU-Accelerated Endpoints

Kimi K2.5 is an advanced multimodal vision language model (VLM) developed by Kimi, optimized for various AI tasks.

Embedding, Fine-tuning, Hugging Face, PyTorch

Anu Srivastava

4 min read

Includes Code

Has Summary

NVIDIA

Advanced

How to Build a Document Processing Pipeline for RAG with Nemotron

The article provides a comprehensive guide on building a document processing pipeline using NVIDIA Nemotron RAG, focusing on the extraction of structured data from complex documents like PDFs.

Docker, Embedding, Hugging Face, JSON, Python, Redis, torchvision

Chia-Chih Chen

9 min read

Includes Code

Has Summary

NVIDIA

Advanced

Scaling NVFP4 Inference for FLUX.2 on NVIDIA Blackwell Data Center GPUs

The article discusses the collaboration between NVIDIA and Black Forest Labs to optimize the FLUX.2 text-to-image model for NVIDIA Blackwell Data Center GPUs.

Caching, Embedding, Mistral

Sandro Cavallari

8 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Multi-Agent Warehouse AI Command Layer Enables Operational Excellence and Supply Chain Intelligence

The article discusses the NVIDIA Multi-Agent Intelligent Warehouse (MAIW), an AI command layer designed to enhance operational efficiency and supply chain intelligence in automated warehouses.

Docker, Embedding, FastAPI, Grafana, Helm, JSON, JWT, Optuna, PostgreSQL, Prometheus, React, Redis, SQL, TimescaleDB

Tarik Hammadou

10 min read

Includes Code

Has Summary

NVIDIA

Advanced

How to Build a Voice Agent with RAG and Safety Guardrails

This article provides a comprehensive tutorial on building a voice agent using NVIDIA's Nemotron models, focusing on retrieval-augmented generation (RAG) and safety guardrails.

Embedding, Hugging Face, Python, Transformer, Transformers

Chris Alexiuk

8 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Building Scalable AI on Enterprise Data with NVIDIA Nemotron RAG and Microsoft SQL Server 2025

The article discusses the integration of NVIDIA Nemotron RAG with Microsoft SQL Server 2025, showcasing how this collaboration enables the development of scalable AI applications on enterprise data.

Azure, Docker, Embedding, HTTPS, SQL, SQL Server

Uttara Kumar

10 min read

Includes Code

Has Summary

NVIDIA

Advanced

Build a Log Analysis Multi-Agent Self-Corrective RAG System with NVIDIA Nemotron

The article discusses the development of an AI-powered log analysis solution using NVIDIA's Generative AI reference workflows.

Embedding, Fine-tuning, Generative AI, Hugging Face

Prashant Bhende

5 min read

Includes Code

Has Summary

NVIDIA

Advanced

Pruning and Distilling LLMs Using NVIDIA TensorRT Model Optimizer

The article discusses the optimization of large language models (LLMs) through pruning and knowledge distillation using NVIDIA TensorRT Model Optimizer.

Embedding, Hugging Face, Transformer

Max Xu

10 min read

Includes Code

Has Summary

NVIDIA

Advanced

Build a Retrieval-Augmented Generation (RAG) Agent with NVIDIA Nemotron

The article provides a comprehensive guide on building a Retrieval-Augmented Generation (RAG) agent using NVIDIA Nemotron, emphasizing the integration of external information to enhance text genera...

Docker, Embedding, Hugging Face, LangChain, Python, Streamlit, Vector Database

Edward Li

16 min read

Includes Code

Has Summary

NVIDIA

Advanced

NVIDIA RAPIDS 25.08 Adds New Profiler for cuML, Updates to the Polars GPU Engine, Additional Algorithm Support,

The NVIDIA RAPIDS 25.

Embedding, Polars, Python, scikit-learn

Brian Tepera

8 min read

Includes Code

Has Summary

NVIDIA

Advanced

Securing Agentic AI: How Semantic Prompt Injections Bypass AI Guardrails

The article discusses the emerging threat of semantic prompt injections in multimodal AI systems, highlighting how adversaries can exploit visual inputs to bypass traditional security measures.

Deep Learning, Embedding, Gemini, Machine Learning

Daniel Teixeira

7 min read

Has Summary

NVIDIA

Advanced

Serverless Distributed Data Processing with Apache Spark and NVIDIA AI on Azure

The article discusses the deployment of a serverless, distributed data processing architecture using Apache Spark and NVIDIA AI on Azure.

Apache, Apache Spark, Azure, Docker, Embedding, HTTPS, Hugging Face, Python, REST API, Serverless, SQL, SQL Server

Alexander Spiridonov

9 min read

Includes Code

Has Summary

NVIDIA

Advanced

Best-in-Class Multimodal RAG: How the Llama 3.2 NeMo Retriever Embedding Model Boosts Pipeline

The article discusses the advancements in multimodal retrieval-augmented generation (RAG) systems, particularly focusing on the Llama 3.2 NeMo Retriever Multimodal Embedding model.

Embedding, OpenAI API

Benedikt Schifferer

7 min read

Includes Code

Has Summary

NVIDIA

Advanced

Boost Embedding Model Accuracy for Custom Information Retrieval

The article discusses the importance of customizing embedding models for effective information retrieval, particularly in domain-specific contexts.

Embedding

Nirmal Kumar Juluru

7 min read

Has Summary

NVIDIA

Advanced

Finding the Best Chunking Strategy for Accurate AI Responses

This article discusses the importance of chunking strategies in AI retrieval systems, particularly in retrieval-augmented generation (RAG) systems.

Embedding

Steve Han

13 min read

Has Summary

NVIDIA

Advanced

Accelerating Embedding Lookups with cuEmbed

NVIDIA's cuEmbed is a high-performance, header-only CUDA library designed to accelerate embedding lookups on NVIDIA GPUs, particularly beneficial for recommendation systems.

Embedding, Python, PyTorch

Michael Anderson

7 min read

Includes Code

Has Summary

NVIDIA

Advanced

Spotlight: Qodo Innovates Efficient Code Search with NVIDIA DGX

The article discusses how Qodo leverages NVIDIA DGX to innovate efficient code search through AI-powered agents.

Embedding, GitLab, Hugging Face

Amit Bleiweiss

7 min read

Has Summary

NVIDIA

Intermediate

Developing an AI-Powered Tool for Automatic Citation Validation Using NVIDIA NIM

The article discusses the development of an AI-powered tool for automatic citation validation using NVIDIA NIM, aimed at improving the accuracy of citations in academic and AI-generated content.

Embedding, Generative AI, GPT, LangChain, Streamlit

Sebastian Haan

8 min read

Has Summary

NVIDIA

Advanced

Evaluating and Enhancing RAG Pipeline Performance Using Synthetic Data

The article discusses the evaluation and enhancement of Retrieval-Augmented Generation (RAG) pipeline performance using synthetic data.

Embedding, Generative AI, Seed

Vinay Raman

11 min read

Includes Code

Has Summary

NVIDIA

Advanced

NVIDIA NeMo Retriever Delivers Accurate Multimodal PDF Data Extraction 15x Faster

The article discusses the advancements in NVIDIA's NeMo Retriever, which enables accurate multimodal PDF data extraction at a speed 15 times faster than traditional methods.

AWS, AWS SageMaker, Azure, Embedding, Google Cloud

Ruchika Kharwar

10 min read

Has Summary

NVIDIA

Intermediate

Understanding PTX, the Assembly Language of CUDA GPU Computing

This article provides an in-depth understanding of Parallel Thread Execution (PTX), the assembly language for NVIDIA's CUDA GPU computing platform.

Assembly, Embedding

Tony Scudiero

13 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Bring NVIDIA ACE AI Characters to Games with the New In-Game Inferencing SDK

The article discusses the integration of NVIDIA ACE AI characters into games using the new In-Game Inferencing SDK (NVIGI).

Embedding, GPT, Mistral, Whisper

Allyson Vasquez

11 min read

Includes Code

Has Summary

NVIDIA

Advanced

Mastering LLM Techniques: Evaluation

The article discusses the complexities of evaluating large language models (LLMs) and retrieval-augmented generation (RAG) systems, highlighting the inadequacy of traditional evaluation metrics.

Embedding

Amit Bleiweiss

12 min read

Has Summary

NVIDIA

Advanced

Evaluating GenMol as a Generalist Foundation Model for Molecular Generation

The article evaluates GenMol, a generalist foundation model for molecular generation, comparing it with SAFE-GPT.

BERT, Embedding, GPT, Oracle

Kyle Tretina

7 min read

Includes Code

Has Summary

NVIDIA

Intermediate

A Guide to Retrieval-Augmented Generation for AEC

This article provides an in-depth exploration of Retrieval-Augmented Generation (RAG) and its transformative potential for the Architecture, Engineering, and Construction (AEC) industry.

Embedding, Generative AI, GPT, Helm

Sama Bali

12 min read

Has Summary

NVIDIA

Intermediate

An Easy Introduction to Multimodal Retrieval-Augmented Generation for Video and Audio

This article provides an introduction to building a multimodal retrieval-augmented generation (RAG) system for video and audio content.

CLIP, Embedding, Fine-tuning

Tanay Varshney

11 min read

Has Summary

NVIDIA

Advanced

Three Building Blocks for Creating AI Virtual Assistants for Customer Service with an NVIDIA AI Blueprint

The article discusses the essential components for developing AI virtual assistants for customer service using NVIDIA's AI Blueprint.

Embedding, Generative AI

Isabel Hulseman

10 min read

Has Summary

NVIDIA

Advanced

Content Moderation and Safety Checks with NVIDIA NeMo Guardrails

The article discusses the importance of content moderation in retrieval-augmented generation (RAG) applications powered by generative AI, highlighting NVIDIA NeMo Guardrails as a toolkit for integr...

Embedding, Hugging Face

Aditi Bodhankar

10 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Hymba Hybrid-Head Architecture Boosts Small Language Model Performance

The article discusses NVIDIA's Hymba hybrid-head architecture, which combines transformer attention mechanisms with state space models to enhance the performance and efficiency of small language mo...

Embedding, Hugging Face, PyTorch, Transformer, Transformers

Xin Dong

11 min read

Has Summary

NVIDIA

Intermediate

Advancing Neuroscience Research with Visual Question Answering and Multimodal Retrieval

The article discusses how the IIT Madras Brain Centre is leveraging generative AI, specifically visual question answering (VQA) and multimodal retrieval, to enhance neuroscience research.

Embedding, Fine-tuning, Helm, Vector Database

Pralaypati Ta

7 min read

Has Summary

NVIDIA

Advanced

Boost Large-Scale Recommendation System Training Embedding Using EMBark

The article discusses the challenges of training large-scale deep learning recommendation models (DLRMs) and introduces EMBark, a new solution designed to optimize embedding training and reduce com...

Embedding

Shijie Liu

5 min read

Has Summary

NVIDIA

Advanced

Processing High-Quality Vietnamese Language Data with NVIDIA NeMo Curator

This article discusses the use of NVIDIA NeMo Curator for processing high-quality Vietnamese language data, highlighting the challenges faced by large language models (LLMs) in non-English language...

Dask, Embedding, Hugging Face, Python, YAML

Hoang Nguyen

16 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Build Multimodal Visual AI Agents Powered by NVIDIA NIM

The article discusses the development of multimodal visual AI agents using NVIDIA NIM microservices, highlighting the importance of vision-language models (VLMs) in processing and analyzing diverse...

CLIP, Embedding, FastAPI, Generative AI, JSON, OpenCV, Python, REST API, WebSocket

Samuel Ochoa

10 min read

Has Summary

NVIDIA

Intermediate

Enhanced Security and Streamlined Deployment of AI Agents with NVIDIA AI Enterprise

The article discusses the advancements in AI agents facilitated by NVIDIA AI Enterprise, emphasizing enhanced security, streamlined deployment, and management of AI pipelines.

DGL, Embedding, Google Cloud, Helm, Kubernetes, Mistral, Python, PyTorch, TensorFlow

Charu Chaubal

5 min read

Has Summary

NVIDIA

Intermediate

Accelerating Oracle Database Generative AI Workloads with NVIDIA NIM and NVIDIA cuVS

The article discusses how NVIDIA and Oracle are enhancing generative AI workloads through the integration of NVIDIA's accelerated computing platform with Oracle Cloud Infrastructure.

Docker, Embedding, Generative AI, Helm, Kubernetes, Machine Learning, Oracle, Python

Richard Wang

6 min read

Has Summary

NVIDIA

Advanced

A Deep Dive into the Latest AI Models Optimized with NVIDIA NIM

The article discusses NVIDIA NIM microservices, which are optimized containers designed to accelerate AI application development across various domains.

Embedding, Mistral, Python

Amanda Saunders

8 min read

Has Summary

NVIDIA

Advanced

Measuring Generative AI Model Performance Using NVIDIA GenAI-Perf and an OpenAI-Compatible API

The article discusses how to measure the performance of generative AI models using NVIDIA's GenAI-Perf and an OpenAI-compatible API.

Embedding, Generative AI, JSON, Mistral

David Yastremsky

6 min read

Includes Code

Has Summary

NVIDIA

Advanced

Power Text-Generation Applications with Mistral NeMo 12B Running on a Single GPU

The article discusses the Mistral NeMo 12B model, a next-generation language model developed by NVIDIA and Mistral, designed for high performance on a single GPU.

Apache, Artificial Intelligence, Embedding, Mistral, PyTorch, RLHF, Transformer

Anjali Shah

6 min read

Includes Code

Has Summary

NVIDIA

Advanced

Develop Production-Grade Text Retrieval Pipelines for RAG with NVIDIA NeMo Retriever

The article discusses the development of production-grade text retrieval pipelines using NVIDIA NeMo Retriever, focusing on the integration of embedding and reranking models for enhanced efficiency...

Embedding, Helm, Mistral

Tanay Varshney

6 min read

Has Summary

NVIDIA

Intermediate

Transforming Telco Network Operations Centers with NVIDIA NeMo Retriever and NVIDIA NIM

The article discusses how Infosys leverages NVIDIA NIM and NeMo Retriever to enhance network operations centers (NOCs) for telecom companies.

Embedding, LangChain, Mistral, Ollama, React, Vertex AI

Balamurugan Natarajan

7 min read

Has Summary

NVIDIA

Advanced

Automating Telco Network Design using NVIDIA NIM and NVIDIA NeMo

The article discusses how Infosys has automated the generation of TOSCA templates for telecom network design using NVIDIA NIM and NVIDIA NeMo.

Azure, Embedding, GPT, GPT-4, Hugging Face, LangChain, Mistral, React, YAML

Balamurugan Natarajan

6 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Enhance Multi-Camera Tracking Accuracy by Fine-Tuning AI Models with Synthetic Data

The article discusses the importance of fine-tuning AI models with synthetic data to enhance multi-camera tracking accuracy. It highlights the use of NVIDIA Isaac Sim and the Omni.Replicator.

Embedding, Fine-tuning, Microservices, ResNet, Supervised Learning, Transformer

Sameer Satish Pusegaonkar

13 min read

Includes Code

Has Summary

NVIDIA

Advanced

NVIDIA Text Embedding Model Tops MTEB Leaderboard

NVIDIA's latest embedding model, NV-Embed, achieves a record accuracy score of 69.32 on the Massive Text Embedding Benchmark (MTEB), which encompasses 56 different embedding tasks.

BERT, Embedding, Mistral

Tanay Varshney

6 min read

Has Summary

NVIDIA

Intermediate

Enhancing the Apparel Shopping Experience with AI, Emoji-Aware OCR, and Snapchat’s Screenshop

The article discusses how Snap's ML engineering team enhanced the apparel shopping experience using AI, specifically through the Screenshop service integrated into Snapchat.

Docker, Embedding, Kubernetes, Prometheus, Python, PyTorch, TensorFlow

Amr Elmeleegy

7 min read

Has Summary

NVIDIA

Advanced

Advanced AI and Retrieval-Augmented Generation for Code Development in High-Performance Computing

The article discusses the integration of advanced AI and Retrieval-Augmented Generation (RAG) techniques in high-performance computing (HPC) code development.

Copilot, Embedding, V

Harry Petty

8 min read

Includes Code

Has Summary

NVIDIA

Advanced

Tips for Building a RAG Pipeline with NVIDIA AI LangChain AI Endpoints

The article provides a comprehensive guide on building a Retrieval-Augmented Generation (RAG) pipeline using NVIDIA AI LangChain AI Endpoints.

Embedding, Generative AI, gRPC, HTML, Java, LangChain, Python, PyTorch, Retrieval Augmented Generation, TensorFlow

Amit Bleiweiss

13 min read

Includes Code

Has Summary

NVIDIA

Advanced

Optimizing Memory and Retrieval for Graph Neural Networks with WholeGraph, Part 2

This article explores the optimization of memory and retrieval processes for large-scale Graph Neural Networks (GNNs) using WholeGraph, a feature of the RAPIDS cuGraph library.

Embedding, Graph Neural Networks, Neural Networks

Dongxu Yang

5 min read

Has Summary

NVIDIA

Advanced

An Easy Introduction to Multimodal Retrieval-Augmented Generation

This article provides an introduction to Multimodal Retrieval-Augmented Generation (RAG), emphasizing the importance of handling various data types such as text and images.

CLIP, Embedding, U-Net

Annie Surla

11 min read

Has Summary

NVIDIA

Intermediate

Optimizing Memory and Retrieval for Graph Neural Networks with WholeGraph, Part 1

The article discusses WholeGraph, a feature in the RAPIDS cuGraph library designed to optimize memory and retrieval for Graph Neural Networks (GNNs).

DGL, Embedding, Graph Neural Networks, Neural Networks, NumPy, Python, PyTorch

Dongxu Yang

9 min read

Includes Code

Has Summary

NVIDIA

Intermediate

Evaluating Retriever for Enterprise-Grade RAG

The article discusses the evaluation of Retrieval-Augmented Generation (RAG) systems, emphasizing the importance of embedding models and systematic evaluation processes.

Apache, Embedding, Hugging Face, Large Language Models

Benedikt Schifferer

14 min read

Has Summary