NVIDIA logo

How NVIDIA Uses Embedding

76 engineering articles about Embedding from NVIDIA's engineering team

Articles

NVIDIA
Advanced
Kimi K2.5 is an advanced multimodal vision language model (VLM) developed by Kimi, optimized for various AI tasks.
Anu Srivastava
4 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
The article provides a comprehensive guide on building a document processing pipeline using NVIDIA Nemotron RAG, focusing on the extraction of structured data from complex documents like PDFs.
Chia-Chih Chen
9 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
The article discusses the collaboration between NVIDIA and Black Forest Labs to optimize the FLUX.2 text-to-image model for NVIDIA Blackwell Data Center GPUs.
Sandro Cavallari
8 min read
Includes Code
Has Summary
--
NVIDIA
Intermediate
The article discusses the NVIDIA Multi-Agent Intelligent Warehouse (MAIW), an AI command layer designed to enhance operational efficiency and supply chain intelligence in automated warehouses.
NVIDIA
Advanced
This article provides a comprehensive tutorial on building a voice agent using NVIDIA's Nemotron models, focusing on retrieval-augmented generation (RAG) and safety guardrails.
Chris Alexiuk
8 min read
Includes Code
Has Summary
--
NVIDIA
Intermediate
The article discusses the integration of NVIDIA Nemotron RAG with Microsoft SQL Server 2025, showcasing how this collaboration enables the development of scalable AI applications on enterprise data.
Uttara Kumar
10 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
The article discusses the development of an AI-powered log analysis solution using NVIDIA's Generative AI reference workflows.
Prashant Bhende
5 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
The article discusses the optimization of large language models (LLMs) through pruning and knowledge distillation using NVIDIA TensorRT Model Optimizer.
Max Xu
10 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
The article provides a comprehensive guide on building a Retrieval-Augmented Generation (RAG) agent using NVIDIA Nemotron, emphasizing the integration of external information to enhance text genera...
Edward Li
16 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
The article discusses the emerging threat of semantic prompt injections in multimodal AI systems, highlighting how adversaries can exploit visual inputs to bypass traditional security measures.
Daniel Teixeira
7 min read
Has Summary
--
NVIDIA
Advanced
The article discusses the deployment of a serverless, distributed data processing architecture using Apache Spark and NVIDIA AI on Azure.
NVIDIA
Advanced
The article discusses the advancements in multimodal retrieval-augmented generation (RAG) systems, particularly focusing on the Llama 3.2 NeMo Retriever Multimodal Embedding model.
Benedikt Schifferer
7 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
The article discusses the importance of customizing embedding models for effective information retrieval, particularly in domain-specific contexts.
Nirmal Kumar Juluru
7 min read
Has Summary
--
NVIDIA
Advanced
This article discusses the importance of chunking strategies in AI retrieval systems, particularly in retrieval-augmented generation (RAG) systems.
Steve Han
13 min read
Has Summary
--
NVIDIA
Advanced
NVIDIA's cuEmbed is a high-performance, header-only CUDA library designed to accelerate embedding lookups on NVIDIA GPUs, particularly beneficial for recommendation systems.
Michael Anderson
7 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
The article discusses how Qodo leverages NVIDIA DGX to innovate efficient code search through AI-powered agents.
Amit Bleiweiss
7 min read
Has Summary
--
NVIDIA
Intermediate
The article discusses the development of an AI-powered tool for automatic citation validation using NVIDIA NIM, aimed at improving the accuracy of citations in academic and AI-generated content.
Sebastian Haan
8 min read
Has Summary
--
NVIDIA
Advanced
The article discusses the evaluation and enhancement of Retrieval-Augmented Generation (RAG) pipeline performance using synthetic data.
Vinay Raman
11 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
The article discusses advancements in NVIDIA's NeMo Retriever, which enables accurate multimodal PDF data extraction 15 times faster than traditional methods.
Ruchika Kharwar
10 min read
Has Summary
--
NVIDIA
Intermediate
This article provides an in-depth understanding of Parallel Thread Execution (PTX), the assembly language for NVIDIA's CUDA GPU computing platform.
Tony Scudiero
13 min read
Includes Code
Has Summary
--
NVIDIA
Intermediate
The article discusses the integration of NVIDIA ACE AI characters into games using the new In-Game Inferencing SDK (NVIGI).
Allyson Vasquez
11 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
The article discusses the complexities of evaluating large language models (LLMs) and retrieval-augmented generation (RAG) systems, highlighting the inadequacy of traditional evaluation metrics.
Amit Bleiweiss
12 min read
Has Summary
--
NVIDIA
Advanced
The article evaluates GenMol, a generalist foundation model for molecular generation, comparing it with SAFE-GPT.
Kyle Tretina
7 min read
Includes Code
Has Summary
--
NVIDIA
Intermediate
This article provides an in-depth exploration of Retrieval-Augmented Generation (RAG) and its transformative potential for the Architecture, Engineering, and Construction (AEC) industry.
Sama Bali
12 min read
Has Summary
--
NVIDIA
Intermediate
This article provides an introduction to building a multimodal retrieval-augmented generation (RAG) system for video and audio content.
Tanay Varshney
11 min read
Has Summary
--
NVIDIA
Advanced
The article discusses the essential components for developing AI virtual assistants for customer service using NVIDIA's AI Blueprint.
Isabel Hulseman
10 min read
Has Summary
--
NVIDIA
Advanced
The article discusses the importance of content moderation in retrieval-augmented generation (RAG) applications powered by generative AI, highlighting NVIDIA NeMo Guardrails as a toolkit for integr...
Aditi Bodhankar
10 min read
Includes Code
Has Summary
--
NVIDIA
Intermediate
The article discusses NVIDIA's Hymba hybrid-head architecture, which combines transformer attention mechanisms with state space models to enhance the performance and efficiency of small language mo...
NVIDIA
Intermediate
The article discusses how the IIT Madras Brain Centre is leveraging generative AI, specifically visual question answering (VQA) and multimodal retrieval, to enhance neuroscience research.
Pralaypati Ta
7 min read
Has Summary
--
NVIDIA
Advanced
The article discusses the challenges of training large-scale deep learning recommendation models (DLRMs) and introduces EMBark, a new solution designed to optimize embedding training and reduce com...
Shijie Liu
5 min read
Has Summary
--
NVIDIA
Advanced
This article discusses the use of NVIDIA NeMo Curator for processing high-quality Vietnamese language data, highlighting the challenges faced by large language models (LLMs) in non-English language...
Hoang Nguyen
16 min read
Includes Code
Has Summary
--
NVIDIA
Intermediate
The article discusses the development of multimodal visual AI agents using NVIDIA NIM microservices, highlighting the importance of vision-language models (VLMs) in processing and analyzing diverse...
NVIDIA
Intermediate
The article discusses the advancements in AI agents facilitated by NVIDIA AI Enterprise, emphasizing enhanced security, streamlined deployment, and management of AI pipelines.
NVIDIA
Intermediate
The article discusses how NVIDIA and Oracle are enhancing generative AI workloads through the integration of NVIDIA's accelerated computing platform with Oracle Cloud Infrastructure.
NVIDIA
Advanced
The article discusses NVIDIA NIM microservices, which are optimized containers designed to accelerate AI application development across various domains.
Amanda Saunders
8 min read
Has Summary
--
NVIDIA
Advanced
The article discusses how to measure the performance of generative AI models using NVIDIA's GenAI-Perf and an OpenAI-compatible API.
David Yastremsky
6 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
The article discusses the Mistral NeMo 12B model, a next-generation language model developed by NVIDIA and Mistral, designed for high performance on a single GPU.
Anjali Shah
6 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
The article discusses the development of production-grade text retrieval pipelines using NVIDIA NeMo Retriever, focusing on the integration of embedding and reranking models for enhanced efficiency...
Tanay Varshney
6 min read
Has Summary
--
NVIDIA
Intermediate
The article discusses how Infosys leverages NVIDIA NIM and NeMo Retriever to enhance network operations centers (NOCs) for telecom companies.
Balamurugan Natarajan
7 min read
Has Summary
--
NVIDIA
Advanced
The article discusses how Infosys has automated the generation of TOSCA templates for telecom network design using NVIDIA NIM and NVIDIA NeMo.
Balamurugan Natarajan
6 min read
Includes Code
Has Summary
--
NVIDIA
Intermediate
The article discusses the importance of fine-tuning AI models with synthetic data to enhance multi-camera tracking accuracy. It highlights the use of NVIDIA Isaac Sim and the Omni.Replicator.
Sameer Satish Pusegaonkar
13 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
NVIDIA's latest embedding model, NV-Embed, achieves a record accuracy score of 69.32 on the Massive Text Embedding Benchmark (MTEB), which encompasses 56 different embedding tasks.
Tanay Varshney
6 min read
Has Summary
--
NVIDIA
Intermediate
The article discusses how Snap's ML engineering team enhanced the apparel shopping experience using AI, specifically through the Screenshop service integrated into Snapchat.
NVIDIA
Advanced
The article discusses the integration of advanced AI and Retrieval-Augmented Generation (RAG) techniques in high-performance computing (HPC) code development.
Harry Petty
8 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
The article provides a comprehensive guide on building a Retrieval-Augmented Generation (RAG) pipeline using NVIDIA AI LangChain AI Endpoints.
NVIDIA
Advanced
This article explores the optimization of memory and retrieval processes for large-scale Graph Neural Networks (GNNs) using WholeGraph, a feature of the RAPIDS cuGraph library.
Dongxu Yang
5 min read
Has Summary
--
NVIDIA
Advanced
This article provides an introduction to Multimodal Retrieval-Augmented Generation (RAG), emphasizing the importance of handling various data types such as text and images.
Annie Surla
11 min read
Has Summary
--
NVIDIA
Intermediate
The article discusses WholeGraph, a feature in the RAPIDS cuGraph library designed to optimize memory and retrieval for Graph Neural Networks (GNNs).
Dongxu Yang
9 min read
Includes Code
Has Summary
--
NVIDIA
Intermediate
The article discusses the evaluation of Retrieval-Augmented Generation (RAG) systems, emphasizing the importance of embedding models and systematic evaluation processes.
Benedikt Schifferer
14 min read
Has Summary
--