How to Build AI Systems In House with Outerbounds and DGX Cloud Lepton

It’s easy to underestimate how many moving parts a real-world, production-grade AI system involves. Whether you’re building an agent that combines internal data…

Ville Tuulos
10 min read | intermediate

Overview

The article discusses how to build in-house AI systems using Outerbounds and NVIDIA DGX Cloud Lepton, emphasizing the importance of orchestrating multiple models and dynamic data. It provides a detailed use case of a Reddit post stylizer and subreddit recommender, showcasing the infrastructure requirements and the benefits of operating AI stacks in-house.

What You'll Learn

1. How to leverage NVIDIA DGX Cloud Lepton for GPU access in AI systems
2. Why operating AI components in-house can enhance security and compliance
3. How to build a Reddit post stylizer and subreddit recommender using Outerbounds

Prerequisites & Requirements

  • Understanding of AI system architecture and components
  • Familiarity with NVIDIA DGX Cloud and the Outerbounds platform (optional)

Key Questions Answered

What are the benefits of using Outerbounds for AI product development?
Outerbounds provides a secure, cloud-native platform that simplifies the development and operation of AI systems. It offers powerful, composable APIs for building, orchestrating, and continuously improving AI products at scale, addressing operational costs and complexities associated with in-sourcing AI components.
How does the Reddit Agent utilize embeddings and vector indices?
The Reddit Agent converts each user prompt into an embedding using the nv-embedqa-e5-v5 model and matches that embedding against a GPU-accelerated FAISS vector index. The matching subreddit-specific samples are retrieved and passed to an LLM, which reformats the post in the style of that subreddit.
What infrastructure is required for building AI systems with DGX Cloud Lepton?
Building AI systems with DGX Cloud Lepton requires access to a deep pool of GPU resources, efficient orchestration of models, and a sophisticated software stack. DGX Cloud Lepton integrates with various cloud partners and operates alongside existing infrastructure, so no migration is needed.
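
The retrieval step described in the Reddit Agent answer above can be sketched in a few lines. This is only an illustration under stated assumptions, not the article's implementation: `toy_embed` is a hypothetical hashed bag-of-words stand-in for the nv-embedqa-e5-v5 model, `top_k` is a brute-force inner-product search standing in for a FAISS index lookup, and `build_prompt` and its prompt wording are invented for the example.

```python
import hashlib
import math


def toy_embed(text, dim=64):
    """Hypothetical stand-in for an embedding model: hashed bag of words, L2-normalised."""
    vec = [0.0] * dim
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def top_k(query, index, k):
    """Brute-force inner-product search; stands in for a vector index lookup."""
    def score(item):
        return sum(q * x for q, x in zip(query, item[1]))
    ranked = sorted(index, key=score, reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(user_post, subreddit, indices, k=2):
    """Retrieve style samples for one subreddit and assemble a prompt for an LLM."""
    samples = top_k(toy_embed(user_post), indices[subreddit], k)
    examples = "\n".join(f"- {s}" for s in samples)
    return (
        f"Rewrite the post in the style of r/{subreddit}.\n"
        f"Style examples:\n{examples}\n"
        f"Post: {user_post}"
    )


# Hypothetical sample data: one small index per subreddit.
samples_by_subreddit = {
    "python": ["How do I profile a slow pandas groupby?", "TIL about functools.cache"],
    "cooking": ["My sourdough finally rose!", "Best way to season cast iron?"],
}
indices = {
    name: [(text, toy_embed(text)) for text in texts]
    for name, texts in samples_by_subreddit.items()
}

prompt = build_prompt("How do I profile a slow pandas groupby?", "python", indices, k=1)
print(prompt)
```

In a real system the prompt would be sent to the LLM; here it is only printed so the data flow from prompt to embedding to retrieved samples stays visible.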

Key Statistics & Figures

Number of subreddit-specific vector databases created
30,000
The system constructs a separate vector database for each subreddit to match samples specific to community styles.
Time to index 10 million embeddings
80 seconds
Using the NVIDIA cuVS-accelerated FAISS library on an NVIDIA H100 GPU, the system achieves rapid indexing performance.
Performance improvement of GPU over CPU
2.5x faster
The GPU-accelerated version running on a single H100 is over 2x faster, and cheaper, than a large CPU instance using up to 60 CPU cores.
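
As a quick sanity check on these figures, the indexing throughput follows directly from the numbers quoted above. The CPU estimate in this sketch is an assumption: it holds only if the 2.5x figure refers to the same 10-million-embedding workload, which this summary does not state.

```python
EMBEDDINGS_INDEXED = 10_000_000   # from the figures above
GPU_SECONDS = 80                  # from the figures above
GPU_SPEEDUP_OVER_CPU = 2.5        # from the figures above

# Throughput on the H100: embeddings indexed per second.
gpu_throughput = EMBEDDINGS_INDEXED / GPU_SECONDS

# Assumption: the 2.5x speedup applies to this same workload.
assumed_cpu_seconds = GPU_SECONDS * GPU_SPEEDUP_OVER_CPU

print(f"GPU throughput: {gpu_throughput:,.0f} embeddings/s")
print(f"Assumed CPU time for the same job: {assumed_cpu_seconds:.0f} s")
```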

Technologies & Tools

Cloud Infrastructure
NVIDIA DGX Cloud Lepton
Provides flexible GPU access for AI system development.
Platform
Outerbounds
A cloud-native platform for developing and operating AI systems.
Database
FAISS
A GPU-accelerated vector similarity-search library, used here as the vector index for embedding matching.
Container Technology
NVIDIA NIM
Used for deploying models and managing AI workflows.

Key Actionable Insights

1. Consider building your AI systems in-house to enhance control over data privacy and compliance.
As companies increasingly rely on proprietary data and models, owning key components can mitigate risks associated with external APIs and improve security.
2. Utilize the integration of Outerbounds with DGX Cloud Lepton to streamline access to GPU resources.
This integration simplifies the process of scaling AI applications, allowing developers to focus on building differentiated products without the complexity of managing multiple cloud environments.
3. Leverage the power of embeddings and vector indices for personalized user experiences in applications like the Reddit Agent.
By creating subreddit-specific vector databases, you can ensure that the content generated is relevant and tailored to the interests of users, enhancing engagement.
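
The idea of one vector database per subreddit can be sketched as a simple grouping step. This is a minimal illustration, not the article's pipeline: `letter_frequency_embed` is a hypothetical toy embedding, `build_indices` is an invented helper, and plain Python lists stand in for the FAISS indices used by the real system.

```python
from collections import defaultdict


def letter_frequency_embed(text):
    """Hypothetical toy embedding: relative frequency of each letter a-z."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    total = sum(counts) or 1.0
    return [c / total for c in counts]


def build_indices(posts, embed):
    """Group (subreddit, text) pairs into one list of (text, vector) per subreddit."""
    indices = defaultdict(list)
    for subreddit, text in posts:
        indices[subreddit].append((text, embed(text)))
    return dict(indices)


# Hypothetical sample posts.
posts = [
    ("python", "Why is my list comprehension slow?"),
    ("cooking", "How long should I rest a steak?"),
    ("python", "Type hints changed how I write code"),
]

indices = build_indices(posts, letter_frequency_embed)
for subreddit, entries in indices.items():
    print(subreddit, len(entries))
```

Keeping the indices separate means a query for one community only ever searches that community's samples, which is what makes the retrieved style examples subreddit-specific.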

Common Pitfalls

1. Underestimating the complexity of orchestrating multiple AI components.
Many developers may overlook the need for a sophisticated software stack and the operational costs associated with in-sourcing AI components, which can lead to inefficiencies and increased overhead.

Related Concepts

AI System Architecture
NVIDIA GPU Technologies
Data Privacy in AI
Orchestration of AI Workflows