# Embedding Programming Tutorials & Engineering Articles

173 Embedding tutorials, guides, and engineering insights from NVIDIA, Pinterest, Google, and more

Embedding Articles & Tutorials

Uber
Advanced
Uber’s Rate Limiting System details the evolution of Uber's approach to managing service overload through a unified rate-limiting architecture.
Chien-Chih Liao, Rahul Gutal, Smit Sheth, Ying Jiang
14 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
Kimi K2.5 is an advanced multimodal vision language model (VLM) developed by Kimi, optimized for various AI tasks.
Anu Srivastava
4 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
The article provides a comprehensive guide on building a document processing pipeline using NVIDIA Nemotron RAG, focusing on the extraction of structured data from complex documents like PDFs.
Chia-Chih Chen
9 min read
Includes Code
Has Summary
--
OpenAI
Intermediate
PVH Corp., the parent company of Calvin Klein and Tommy Hilfiger, announced its adoption of ChatGPT Enterprise to transform its global fashion operations.
OpenAI
3 min read
Has Summary
--
NVIDIA
Advanced
The article discusses the collaboration between NVIDIA and Black Forest Labs to optimize the FLUX.2 text-to-image model for NVIDIA Blackwell Data Center GPUs.
Sandro Cavallari
8 min read
Includes Code
Has Summary
--
NVIDIA
Intermediate
The article discusses the NVIDIA Multi-Agent Intelligent Warehouse (MAIW), an AI command layer designed to enhance operational efficiency and supply chain intelligence in automated warehouses.
--
NVIDIA
Advanced
This article provides a comprehensive tutorial on building a voice agent using NVIDIA's Nemotron models, focusing on retrieval-augmented generation (RAG) and safety guardrails.
Chris Alexiuk
8 min read
Includes Code
Has Summary
--
Uber
Advanced
The article discusses Uber's transition from traditional keyword-based search using Apache Lucene to implementing semantic vector search with Amazon OpenSearch.
Hao Sun, Jiasen Xu, Smit Patel, Anand Kotriwal, Xu Zhang
11 min read
Has Summary
--
Uber
Advanced
The article discusses the evolution and scaling of Uber's Delivery Search Platform, emphasizing the transition from traditional lexical search to a semantic search model that enhances user experien...
Divya Nagar, Zheng Liu, Jiasen Xu, Bo Ling, Haoyang Chen
11 min read
Has Summary
--
NVIDIA
Intermediate
The article discusses the integration of NVIDIA Nemotron RAG with Microsoft SQL Server 2025, showcasing how this collaboration enables the development of scalable AI applications on enterprise data.
Uttara Kumar
10 min read
Includes Code
Has Summary
--
Pinterest
Advanced
The article reflects on a decade of AI platform development at Pinterest, detailing the evolution from fragmented machine learning stacks to a unified AI platform that supports various models.
--
Uber
Advanced
This article discusses how Uber has integrated explainability into its machine learning platform, Michelangelo, using Integrated Gradients (IG) to provide interpretable attributions for deep learni...
Hugh Chen, Eric Wang, Gaoyuan Huang, Howard Yu, Jia Li, Sally Lee
14 min read
Has Summary
--
NVIDIA
Advanced
The article discusses the development of an AI-powered log analysis solution using NVIDIA's Generative AI reference workflows.
Prashant Bhende
5 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
The article discusses the optimization of large language models (LLMs) through pruning and knowledge distillation using NVIDIA TensorRT Model Optimizer.
Max Xu
10 min read
Includes Code
Has Summary
--
Google
Intermediate
The article provides an in-depth exploration of the EmbeddingGemma architecture, detailing its origins, embedding generation process, and the comprehensive training methodology.
Henrique Schechter Vera, Juyeong Ji, Sahil Dua
7 min read
Includes Code
Has Summary
--
Cloudflare
Advanced
The article discusses the concept of AI sovereignty, emphasizing the importance of choice for nations in controlling AI technologies and data.
Carly Ramsey
9 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
The article provides a comprehensive guide on building a Retrieval-Augmented Generation (RAG) agent using NVIDIA Nemotron, emphasizing the integration of external information to enhance text genera...
Edward Li
16 min read
Includes Code
Has Summary
--
Google
Beginner
The article discusses the recent enhancements to the Gemini Batch API, which now includes support for the Gemini Embedding model and compatibility with the OpenAI SDK.
Lucia Loher, Patrick Löber
2 min read
Includes Code
Has Summary
--
Google
Intermediate
This article discusses the integration of Google's EmbeddingGemma model with Google Cloud's Dataflow to create a scalable embedding pipeline for AI applications.
Danny McCormick, Ian Ballantyne, Olivier Lacombe
5 min read
Includes Code
Has Summary
--
Google
Intermediate
EmbeddingGemma is an innovative open embedding model designed for on-device AI applications, featuring 308 million parameters for efficient performance.
--
Netflix
Advanced
The article discusses the evolution of data engineering at Netflix, focusing on the introduction of Media ML Data Engineering, which aims to enhance the handling of complex media data for machine l...
Netflix Technology Blog
7 min read
Has Summary
--
LinkedIn
Advanced
The article discusses the evolution of LinkedIn's edge-building system, focusing on how it leverages AI-powered recommendations to enhance user interactions.
Yi-Wen Liu
13 min read
Has Summary
--
Ramp
Intermediate
This article explores the rise of Forward Deployed Engineering (FDE) as a strategic role in B2B tech companies, tracing its origins from Palantir to its current adoption across companies like OpenA...
Leo Mehr
13 min read
Has Summary
--
OpenAI
Advanced
The article introduces gpt-oss, two state-of-the-art open-weight language models, gpt-oss-120b and gpt-oss-20b, which excel in reasoning tasks and are optimized for deployment on consumer hardware.
--
NVIDIA
Advanced
The article discusses the emerging threat of semantic prompt injections in multimodal AI systems, highlighting how adversaries can exploit visual inputs to bypass traditional security measures.
Daniel Teixeira
7 min read
Has Summary
--
Google
Intermediate
The article discusses the Gemini Embedding text model and its applications in various industries, highlighting its effectiveness in enhancing AI applications through context engineering and retriev...
Vishal Dharmadhikari, Janie Zhang
4 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
The article discusses the deployment of a serverless, distributed data processing architecture using Apache Spark and NVIDIA AI on Azure.
--
Google
Intermediate
The article announces the general availability of the Gemini Embedding text model, gemini-embedding-001, in the Gemini API and Vertex AI.
Min Choi, Janie Zhang
3 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
The article discusses the advancements in multimodal retrieval-augmented generation (RAG) systems, particularly focusing on the Llama 3.2 NeMo Retriever Multimodal Embedding model.
Benedikt Schifferer
7 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
The article discusses the importance of customizing embedding models for effective information retrieval, particularly in domain-specific contexts.
Nirmal Kumar Juluru
7 min read
Has Summary
--
NVIDIA
Advanced
This article discusses the importance of chunking strategies in AI retrieval systems, particularly in retrieval-augmented generation (RAG) systems.
Steve Han
13 min read
Has Summary
--
Pinterest
Advanced
The article discusses the implementation of Offline Approximate Nearest Neighbors (ANN) at Pinterest to improve ad retrieval efficiency.
Pinterest Engineering
7 min read
Has Summary
--
Google
Intermediate
The article discusses the latest updates to the Gemini API, highlighting new models and functionalities that enhance developers' ability to create applications using generative AI.
Shrestha Basu Mallick, Logan Kilpatrick, Alisa Fortin, Ivan Solovyev
7 min read
Includes Code
Has Summary
--
LinkedIn
Advanced
The article discusses JUDE, LinkedIn's platform for generating high-quality embeddings for job recommendations using fine-tuned Large Language Models (LLMs).
--
NVIDIA
Advanced
NVIDIA's cuEmbed is a high-performance, header-only CUDA library designed to accelerate embedding lookups on NVIDIA GPUs, particularly beneficial for recommendation systems.
Michael Anderson
7 min read
Includes Code
Has Summary
--
Google
Advanced
The article introduces Keras Recommenders, a new library designed to simplify the creation of state-of-the-art recommendation systems using Keras with JAX, TensorFlow, or PyTorch.
Yufeng Guo, Monica Song
3 min read
Includes Code
Has Summary
--
Google
Intermediate
The article discusses the new features and improvements in Gemma 3, highlighting its vision-language capabilities, architectural changes for memory efficiency, and enhanced multilingual support.
Ju-yeong Ji, Ravin Kumar
9 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
The article discusses how Qodo leverages NVIDIA DGX to innovate efficient code search through AI-powered agents.
Amit Bleiweiss
7 min read
Has Summary
--
NVIDIA
Intermediate
The article discusses the development of an AI-powered tool for automatic citation validation using NVIDIA NIM, aimed at improving the accuracy of citations in academic and AI-generated content.
Sebastian Haan
8 min read
Has Summary
--
Cloudflare
Intermediate
The article introduces AutoRAG, a fully managed Retrieval-Augmented Generation (RAG) pipeline available in open beta on Cloudflare.
Anni Wang
11 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
The article discusses the evaluation and enhancement of Retrieval-Augmented Generation (RAG) pipeline performance using synthetic data.
Vinay Raman
11 min read
Includes Code
Has Summary
--
Uber
Intermediate
This article discusses how Uber enhances personalized CRM communication using contextual bandit strategies, particularly focusing on the application of AI/ML techniques to optimize email content.
LJ (Lin) He, Yifeng Wu, Gaurav Jindal
13 min read
Has Summary
--
Cloudflare
Advanced
The article explores the emerging field of cryptographic watermarking for AI-generated content, discussing its importance in identifying the origins of digital artifacts.
Teresa Brooks-Mejia
24 min read
Includes Code
Has Summary
--
NVIDIA
Advanced
The article discusses the advancements in NVIDIA's NeMo Retriever, which enables accurate multimodal PDF data extraction at a speed 15 times faster than traditional methods.
Ruchika Kharwar
10 min read
Has Summary
--
Airbnb
Advanced
The article discusses the development of Airbnb's first Embedding-Based Retrieval (EBR) search system, which aims to improve the relevance of search results for users by narrowing down the pool of ...
Huiji Gao
7 min read
Has Summary
--
NVIDIA
Intermediate
This article provides an in-depth understanding of Parallel Thread Execution (PTX), the assembly language for NVIDIA's CUDA GPU computing platform.
Tony Scudiero
13 min read
Includes Code
Has Summary
--
Google
Beginner
The article discusses the introduction of the Gemini Embedding text model (gemini-embedding-exp-03-07) available through the Gemini API.
Logan Kilpatrick, Zach Gleicher, Parashar Shah
3 min read
Includes Code
Has Summary
--
NVIDIA
Intermediate
The article discusses the integration of NVIDIA ACE AI characters into games using the new In-Game Inferencing SDK (NVIGI).
Allyson Vasquez
11 min read
Includes Code
Has Summary
--
Pinterest
Intermediate
The article discusses advancements in embedding-based retrieval at Pinterest's Homefeed, focusing on improvements such as feature crossing, ID embeddings, and serving corpus upgrades.
Pinterest Engineering
8 min read
Has Summary
--