How Tolan builds voice-first AI with GPT-5.1

With GPT‑5.1, Tolan built a voice app optimized for low latency, accurate context, and stable personalities as conversations evolve.

OpenAI Team
6 min readadvanced
--
View Original

Overview

The article discusses how Tolan utilizes GPT-5.1 to create a voice-first AI application that emphasizes low latency, accurate context management, and stable character personalities during conversations. It highlights the architectural choices and technological advancements that enable Tolan to provide a seamless and engaging user experience.

What You'll Learn

1

How to design a voice-first AI application that manages context effectively

2

Why low latency is critical for natural voice interactions

3

How to implement a memory system that retains user preferences and emotional cues

4

When to use real-time context reconstruction in voice applications

Key Questions Answered

How does Tolan ensure low latency in voice interactions?
Tolan reduces speech initiation time by over 0.7 seconds using OpenAI GPT-5.1 and the Responses API, which significantly enhances conversational flow and user experience. This improvement allows for more natural and responsive interactions, crucial for voice-based applications.
What architectural choices does Tolan make for context management?
Tolan rebuilds its context window from scratch each turn, utilizing a summary of recent messages, persona cards, vector-retrieved memories, and tone guidance. This approach allows for real-time adaptation to topic shifts, ensuring conversations feel seamless and natural.
How does Tolan maintain personality consistency over time?
Tolan employs a memory system that not only retains facts and preferences but also emotional signals, allowing the AI to adjust its responses based on user interactions. This system is supported by nightly compression jobs to enhance memory quality.
What are Tolan's core principles for building voice agents?
Tolan's principles include designing for conversational volatility, treating latency as part of the product experience, building memory as a retrieval system, and regenerating context each turn. These principles guide the development of their voice architecture.

Key Statistics & Figures

Monthly active users
200,000
Tolan has achieved this user base since its launch in February 2025.
App Store rating
4.8 stars
This rating reflects user satisfaction with Tolan's performance and interaction quality.
Next-day user retention increase
20%
This improvement was observed after the introduction of GPT-5.1 powered personas.
Memory recall misses reduction
30%
This decrease was based on in-product frustration signals, indicating improved memory management.

Technologies & Tools

AI/ML
Gpt-5.1
Used for generating responsive and contextually aware voice interactions.
Database
Turbopuffer
A high-speed vector database used for memory storage and retrieval.

Key Actionable Insights

1
Implement real-time context reconstruction to improve user experience in voice applications.
This technique allows the AI to adapt to changing topics mid-conversation, making interactions feel more natural and engaging for users.
2
Focus on reducing latency to enhance conversational flow.
By minimizing response times, voice agents can maintain a more human-like interaction, which is essential for user satisfaction.
3
Develop a robust memory system that captures emotional cues and user preferences.
This enables the AI to provide personalized responses that resonate with users, enhancing the overall interaction quality.
4
Design voice agents with a clear personality framework.
A well-defined character scaffold allows for consistent and relatable interactions, which can significantly improve user engagement.

Common Pitfalls

1
Relying on cached prompts can lead to disjointed conversations.
Cached prompts may not adapt well to topic shifts, resulting in a mechanical feel. Tolan's approach of rebuilding context each turn prevents this issue.

Related Concepts

Voice AI
Context Management In AI
User Experience Design In Conversational Agents