How Microsoft Engineers Build AI: Learn about scalable RAG-enabled AI Apps

Curious about building scalable RAG-enabled AI apps? Learn how Microsoft engineers built Ask Learn in Copilot for Azure. Watch now for expert insights.

Krezzia
3 min read, intermediate

Overview

Microsoft introduces a new video series, 'How Microsoft Engineers Build AI', starting with how its engineering teams built the Ask Learn RAG plugin for Copilot for Azure. The article covers practical insights on implementing retrieval-augmented generation (RAG) at scale, including content selection, preprocessing challenges, and performance evaluation techniques.

What You'll Learn

1. How RAG differs from other model improvement techniques like fine-tuning
2. How to apply RAG in production applications like Microsoft Copilot and Dynamics 365
3. How to handle content selection and preprocessing challenges when building RAG systems
4. How to evaluate RAG performance and ensure accurate, up-to-date responses

Prerequisites & Requirements

  • Basic understanding of large language models (LLMs) and how they generate responses
  • Familiarity with AI application development concepts
  • Some experience building applications that interact with APIs or cloud services (optional)

Key Questions Answered

What is RAG and how does it differ from fine-tuning for improving AI models?
RAG (retrieval-augmented generation) is a technique that surfaces accurate, contextually relevant information by leveraging proprietary data combined with large language models. Unlike fine-tuning, which modifies the model itself, RAG augments the model's responses at query time by retrieving relevant data from external sources, making it easier to keep responses current without retraining.
How did Microsoft build the Ask Learn RAG plugin for Copilot in Azure?
Microsoft's engineering team, including Sr. Product Managers Brian Steggeman and Eric Imasogie, and Principal Software Engineer Manager Tianqi Zhang, built the Ask Learn plugin by implementing RAG to surface accurate, contextually relevant information from Microsoft Learn documentation. They addressed challenges in content selection, preprocessing, and performance evaluation to deliver reliable answers to Azure developers directly in their workflow.
What are the main challenges when building RAG applications at scale?
The key challenges include content selection (choosing which data sources to index), preprocessing content into formats suitable for retrieval, evaluating RAG performance to ensure accuracy, and maintaining up-to-date responses as source content changes. Microsoft's team found that with the right guidance and best practices these challenges become manageable, but they require careful planning and iterative testing.
Which Microsoft products use RAG-based AI features?
Microsoft applies RAG across several products including Copilot in Azure (via the Ask Learn plugin), Microsoft Security Copilot, and Dynamics 365 Business Central. Each product leverages RAG to provide contextually relevant answers by combining proprietary data sources with LLM capabilities, tailored to their specific domain and user needs.
How does the Ask Learn plugin help Azure developers in their workflow?
The Ask Learn plugin enables Azure developers to get answers in seconds directly within their workflow by querying Microsoft Learn documentation through a RAG-based knowledge service. Instead of leaving their development environment to search documentation manually, developers can ask questions and receive accurate, contextually relevant responses powered by retrieval-augmented generation.
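The retrieve-then-generate flow described in the first answer above can be sketched in a few lines. This is a minimal illustration, not the Ask Learn implementation: the `DOCS` corpus, the bag-of-words cosine scoring, and the `retrieve` and `build_prompt` helpers are all assumptions made for the example. A production system would more likely use embeddings and a vector index, and would send the resulting prompt to an LLM.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus standing in for indexed documentation.
DOCS = {
    "storage": "Azure Blob Storage stores unstructured data such as images and logs.",
    "functions": "Azure Functions runs event-driven code without managing servers.",
    "vnet": "A virtual network lets Azure resources communicate privately.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k=1):
    """Return the ids of the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        docs,
        key=lambda doc_id: cosine(q, Counter(tokenize(docs[doc_id]))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, docs, k=1):
    """Augment the question with retrieved context at query time.

    The model itself is untouched; only the prompt changes.
    """
    context = "\n".join(docs[doc_id] for doc_id in retrieve(query, docs, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    print(build_prompt("How do I run event-driven code?", DOCS))
```

The contrast with fine-tuning shows up in `build_prompt`: changing an entry in `DOCS` changes what the model sees on the next query, with no retraining step.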

Key Statistics & Figures

Companies planning AI investment over next three years
92%
According to McKinsey, 92% of companies plan on investing in AI to achieve business outcomes like enhancing productivity and delivering better customer service

Technologies & Tools

AI Technique
RAG (Retrieval-Augmented Generation)
Core technique for building the Ask Learn plugin to surface accurate, contextually relevant information
Cloud Platform
Azure
Cloud platform where the Copilot and Ask Learn plugin are deployed
AI Assistant
Copilot for Azure
Microsoft's AI assistant that hosts the Ask Learn RAG plugin
AI Assistant
Microsoft Security Copilot
Security-focused AI product that uses RAG techniques
Business Application
Dynamics 365 Business Central
Business application that incorporates RAG-based AI features
AI/ML
LLMs (Large Language Models)
Foundation models combined with RAG retrieval to generate contextually relevant responses
IDE
Visual Studio
Recommended IDE for building RAG applications, now includes GitHub Copilot
AI Coding Assistant
GitHub Copilot
AI coding assistant integrated into Visual Studio for development assistance

Key Actionable Insights

1
Start with prototyping when building RAG applications rather than trying to build a production-ready system immediately. Microsoft's team emphasizes the importance of prototyping to validate your RAG approach before scaling, as content selection and preprocessing decisions made early have cascading effects on system quality.
The Ask Learn team shares their experiences iterating through prototypes to refine their RAG pipeline before deploying to production across Copilot for Azure.
2
Pay careful attention to content selection and preprocessing as these are critical success factors for RAG systems. The quality of your retrieved content directly determines the quality of your AI-generated responses, so invest time in curating and preparing your data sources before focusing on model optimization.
Microsoft's team highlights content selection and preprocessing as key challenges they faced when building the Ask Learn plugin, suggesting these are common pitfalls for RAG developers.
3
Implement robust RAG performance evaluation mechanisms to ensure your system delivers accurate and up-to-date responses. Without proper evaluation, RAG systems can degrade over time as source data changes or retrieval quality drifts, leading to incorrect or outdated answers.
The Ask Learn team developed innovative solutions to ensure their plugin delivers accurate responses, emphasizing that evaluation is an ongoing process rather than a one-time setup.
4
Consider RAG over fine-tuning when you need to leverage proprietary or frequently changing data. RAG allows you to keep responses current by updating the retrieval index rather than retraining the entire model, making it more practical for enterprise scenarios where data freshness matters.
The article distinguishes RAG from fine-tuning as a model improvement technique, with Microsoft choosing RAG specifically for its ability to surface accurate, contextually relevant information from their documentation.
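One way to make the evaluation advice in insight 3 concrete is to score the retriever against a small labeled query set. The sketch below computes hit rate at k, the fraction of queries for which the expected document appears in the top k results. The `INDEX`, `EVAL_SET`, and `keyword_retriever` names are hypothetical and exist only for this example; the article does not describe the metrics the Ask Learn team actually uses.

```python
import re

# Hypothetical index: document id -> text.
INDEX = {
    "doc-storage": "Create a storage account and upload blobs.",
    "doc-functions": "Deploy a function app and configure triggers.",
    "doc-network": "Configure a virtual network and subnets.",
}

# Hypothetical labeled set: (query, id of the document that should be retrieved).
EVAL_SET = [
    ("how do I upload blobs", "doc-storage"),
    ("configure triggers for a function app", "doc-functions"),
    ("set up subnets", "doc-network"),
    # Paraphrase with no shared keywords, so a keyword retriever cannot rank it.
    ("private connectivity between resources", "doc-network"),
]


def words(text):
    """Return the set of lowercase alphabetic tokens in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def keyword_retriever(query, k):
    """Toy retriever: rank documents by the number of words shared with the query."""
    q = words(query)
    ranked = sorted(
        INDEX,
        key=lambda doc_id: len(q & words(INDEX[doc_id])),
        reverse=True,
    )
    return ranked[:k]


def hit_rate_at_k(retriever, labeled_queries, k):
    """Fraction of queries whose expected document is in the top-k results."""
    if not labeled_queries:
        return 0.0
    hits = sum(
        1 for query, expected in labeled_queries if expected in retriever(query, k)
    )
    return hits / len(labeled_queries)


if __name__ == "__main__":
    for k in (1, 3):
        print(f"hit rate @ {k}: {hit_rate_at_k(keyword_retriever, EVAL_SET, k):.2f}")
```

Re-running a check like this whenever the source content or the retriever changes is one simple way to notice drift. The paraphrased query is included on purpose: it shows the kind of miss that a demo with hand-picked questions tends to hide.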

Common Pitfalls

1
Poor content selection can undermine the entire RAG system. Choosing the wrong data sources or including too much irrelevant content in your retrieval index leads to noisy, inaccurate responses that erode user trust.
Microsoft's team specifically highlights content selection as one of the key challenges they faced, suggesting developers should carefully curate which documents and data sources are indexed.
2
Inadequate preprocessing of content before indexing results in poor retrieval quality. Raw content often contains formatting artifacts, irrelevant sections, or inconsistent structures that confuse the retrieval system and degrade response quality.
The Ask Learn team encountered preprocessing challenges that required innovative solutions, indicating this is a common stumbling block for RAG implementations.
3
Neglecting RAG performance evaluation leads to a system that may appear to work in demos but fails in production scenarios. Without systematic evaluation, it's difficult to identify when the system is returning stale, incorrect, or irrelevant information.
The article emphasizes that building RAG reliably at scale requires proper evaluation mechanisms, and the Microsoft team developed specific approaches to measure and maintain response quality.
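The preprocessing pitfall above (formatting artifacts and irrelevant sections reaching the index) can be illustrated with a small cleaning and chunking step. This is a sketch under stated assumptions: the set of skipped tags, the `clean_html` and `chunk_words` helpers, and the chunk sizes are illustrative choices, not the approach the Ask Learn team describes.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, ignoring content inside non-content tags."""

    # Assumed set of tags whose content is noise for retrieval.
    SKIP = {"script", "style", "nav", "footer"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def clean_html(raw):
    """Strip markup and skipped sections, then collapse whitespace."""
    parser = _TextExtractor()
    parser.feed(raw)
    parser.close()
    return re.sub(r"\s+", " ", " ".join(parser.parts)).strip()


def chunk_words(text, size=50, overlap=10):
    """Split text into chunks of `size` words that share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    tokens = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + size]))
        if start + size >= len(tokens):
            break  # the last chunk already reaches the end of the text
    return chunks


if __name__ == "__main__":
    page = "<nav>Home</nav><p>Azure  Functions\n runs code.</p><script>x()</script>"
    print(chunk_words(clean_html(page), size=3, overlap=1))
```

Overlapping chunks are a common way to avoid splitting a sentence's context across a chunk boundary; the right size and overlap depend on the content and the retriever, so they are worth tuning during prototyping.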

Related Concepts

Retrieval-Augmented Generation
Fine-tuning
Large Language Models
AI Application Development
Data Preprocessing
Content Indexing
Vector Search
Enterprise AI
AI-powered Documentation
Knowledge Retrieval Systems