Discovering and Classifying In-app Message Intent at Airbnb

Michelle (Guqian) Du

Conversational AI is inspiring us to rethink the customer experience on our platform.

Airbnb


BERT, Machine Learning

Overview

The article describes how Airbnb uses machine learning to improve messaging between guests and hosts by classifying the intent of in-app messages. It details a two-phase framework: unsupervised learning to discover the intents that exist in the message corpus, followed by supervised learning to classify messages into those intents, with the goal of making communication more efficient and reducing response times.

What You'll Learn

1. How to implement a machine learning framework for message intent classification
2. Why using Latent Dirichlet Allocation (LDA) is effective for intent discovery
3. How to improve labeling quality through iterative processes

Prerequisites & Requirements

Understanding of machine learning concepts and natural language processing
Familiarity with Python and machine learning libraries (optional)

Key Questions Answered

How does Airbnb classify in-app message intents?

Airbnb classifies in-app message intents using a two-phase machine learning framework. The first phase employs Latent Dirichlet Allocation (LDA) to discover potential topics from messages, while the second phase utilizes supervised learning techniques with a Convolutional Neural Network (CNN) to classify messages based on the identified intents.
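
To make the second phase more concrete, here is a minimal sketch of the forward pass of a text CNN: word vectors are scanned by convolution filters, each filter's strongest response is kept (max-over-time pooling), and a softmax layer turns the pooled features into intent probabilities. The embeddings, filters, output weights, and intent names below are hand-set toy values chosen for illustration. They are assumptions, not Airbnb's trained model or architecture, which the summary does not describe in detail.

```python
import math

def conv1d_max_pool(embedded, filt):
    """Slide one filter over consecutive word vectors, apply ReLU, keep the max."""
    width = len(filt)
    best = 0.0  # ReLU floor; also the result when the message is shorter than the filter
    for start in range(len(embedded) - width + 1):
        window = embedded[start:start + width]
        activation = sum(
            w * x
            for filt_row, vec in zip(filt, window)
            for w, x in zip(filt_row, vec)
        )
        best = max(best, activation)
    return best

def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(tokens, embeddings, filters, output_weights, intents):
    """Embed tokens, extract one pooled feature per filter, and score each intent."""
    dim = len(next(iter(embeddings.values())))
    embedded = [embeddings.get(t, [0.0] * dim) for t in tokens]  # unknown words map to zeros
    features = [conv1d_max_pool(embedded, f) for f in filters]
    scores = [sum(w * x for w, x in zip(row, features)) for row in output_weights]
    return dict(zip(intents, softmax(scores)))

# Hypothetical toy parameters (a real model would learn these from labeled messages).
TOY_EMBEDDINGS = {
    "check": [1.0, 0.0], "in": [1.0, 0.0],
    "parking": [0.0, 1.0], "available": [0.0, 1.0],
}
TOY_FILTERS = [
    [[1.0, 0.0], [1.0, 0.0]],  # width-2 filter that responds to check-in style bigrams
    [[0.0, 1.0], [0.0, 1.0]],  # width-2 filter that responds to parking style bigrams
]
TOY_OUTPUT_WEIGHTS = [[2.0, 0.0], [0.0, 2.0]]
TOY_INTENTS = ["check-in", "parking"]

probs = classify(
    "is early check in possible".split(),
    TOY_EMBEDDINGS, TOY_FILTERS, TOY_OUTPUT_WEIGHTS, TOY_INTENTS,
)
print(probs)
```

With these toy values the first filter fires on the "check in" bigram, so the "check-in" intent receives most of the probability mass. In a trained model the same mechanics apply, but the filters and weights come from supervised learning on the labeled intents.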

What role does LDA play in the intent discovery process?

LDA is used to identify existing topics within the messaging corpus without prior knowledge. It allows for the probabilistic modeling of topics in messages, accommodating the presence of multiple intents within a single message, which is common in Airbnb's data.
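
The idea that a single message can be a mixture of topics can be sketched with a small LDA implementation. The code below uses collapsed Gibbs sampling, which is one common way to fit LDA; the summary does not say which inference method or library Airbnb used, so treat the sampler, the hyperparameters (`alpha`, `beta`, `iters`, `seed`), and the toy messages as assumptions made for illustration.

```python
import random
from collections import defaultdict

def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA with collapsed Gibbs sampling on tokenized documents.

    Returns one topic-probability list per document; each list sums to 1.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]                 # tokens in doc d assigned to topic k
    topic_word = [defaultdict(int) for _ in range(num_topics)]   # times word w is assigned to topic k
    topic_total = [0] * num_topics                               # tokens assigned to topic k overall
    assignments = []

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample: weight is proportional to
                # (doc-topic count + alpha) * (topic-word count + beta) / (topic total + V * beta).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed per-document topic mixture.
    mixtures = []
    for d, doc in enumerate(docs):
        denom = len(doc) + num_topics * alpha
        mixtures.append([(doc_topic[d][t] + alpha) / denom for t in range(num_topics)])
    return mixtures

# Toy messages invented for this example; the last one mixes two inquiries.
messages = [
    "what time is check in".split(),
    "is early check in possible".split(),
    "is parking available near the house".split(),
    "where can i find parking and what time is check in".split(),
]
mixtures = lda_gibbs(messages, num_topics=2)
for message, mixture in zip(messages, mixtures):
    print(" ".join(message), [round(p, 2) for p in mixture])
```

Each message gets a probability for every topic rather than a single hard label, which is the property that lets LDA accommodate messages carrying more than one intent. On a corpus this small the discovered topics are not meaningful; the point is the shape of the output.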

What challenges are faced in labeling message intents?

Labeling message intents presents challenges such as ensuring high-quality labels and addressing multi-intent messages. The process involves iterative refinement based on product feedback and inter-rater agreement to improve accuracy and reduce human error.
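
Inter-rater agreement can be quantified in several ways. The sketch below uses Cohen's kappa for two labelers, which corrects raw agreement for the agreement expected by chance. The summary does not say which statistic Airbnb used, so the choice of kappa and the example labels are assumptions.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)
    # Fraction of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected if each rater labeled at random with their own label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label for everything
    return (observed - expected) / (1 - expected)

# Hypothetical labels from two raters on six messages.
rater_a = ["checkin", "checkin", "parking", "parking", "payment", "payment"]
rater_b = ["checkin", "checkin", "parking", "payment", "payment", "payment"]
kappa = cohen_kappa(rater_a, rater_b)
print(round(kappa, 2))  # 0.75
```

Here the raters agree on 5 of 6 messages (observed agreement 0.83) while chance agreement is 0.33, giving a kappa of 0.75. Tracking a number like this across labeling rounds is one way to check whether refined label definitions are actually reducing disagreement.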

What applications are planned for the intent classification framework?

The intent classification framework is set to enhance various applications, including predicting customer support issues, guiding users through cancellation and payment processes, improving booking experiences, and providing instant smart responses based on guest and host needs.

Key Statistics & Figures

Overall accuracy of the Phase-1&2 solution

around 70%

The combined Phase-1&2 solution outperforms the Phase-1-only solution by 50–100%.

Percentage of target messages with multi-intent

about 13%

This indicates the complexity of user inquiries and the need for nuanced classification.

Technologies & Tools

Machine Learning

Latent Dirichlet Allocation

Used for discovering potential topics in the messaging corpus.

Machine Learning

Convolutional Neural Network

Employed for classifying messages based on identified intents.

ML Infrastructure

Bighead

Utilized for productionizing the machine learning framework.

Key Actionable Insights

1. Implementing a two-phase machine learning framework can significantly enhance message intent classification accuracy.
By combining unsupervised learning for intent discovery with supervised learning for classification, organizations can better understand and respond to user inquiries, leading to improved customer satisfaction.

2. Investing in high-quality labeling processes is crucial for model performance.
Ensuring that labels are defined clearly and consistently can reduce human error and improve the accuracy of machine learning models, which is vital for applications relying on precise intent classification.

3. Utilizing LDA for topic discovery can help identify complex message intents effectively.
LDA's probabilistic approach allows for the detection of multiple intents within a single message, which is essential for platforms like Airbnb where users often convey several inquiries at once.

Common Pitfalls

1. Misclassifications can arise from human errors in labeling, where labelers may misinterpret the intent of messages.
To avoid this, it is essential to provide clear definitions and training for labelers, ensuring they understand the nuances of different message types.

2. Label ambiguity can lead to confusion in categorizing messages, especially when a single message contains multiple intents.
Establishing a robust process for identifying and categorizing multi-intent messages can help mitigate this issue and improve classification accuracy.
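
One possible way to surface multi-intent messages is to keep every intent whose predicted probability clears a threshold instead of only the top one. The sketch below illustrates that idea; it is not described in the summary as Airbnb's method, and the function name `select_intents`, the threshold of 0.3, and the example probabilities are all hypothetical.

```python
def select_intents(probabilities, threshold=0.3):
    """Return intents at or above the threshold, highest probability first.

    Falls back to the single most likely intent when none clears the threshold.
    """
    if not probabilities:
        raise ValueError("probabilities must contain at least one intent")
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    chosen = [intent for intent, p in ranked if p >= threshold]
    return chosen or [ranked[0][0]]

# A message asking about both check-in time and parking.
multi = select_intents({"check-in": 0.48, "parking": 0.41, "payment": 0.11})
print(multi)  # ['check-in', 'parking']

# No intent clears the threshold, so only the top one is returned.
single = select_intents({"check-in": 0.20, "parking": 0.25, "payment": 0.15})
print(single)  # ['parking']
```

Messages that come back with more than one intent can then be routed to a dedicated review or labeling step, which keeps ambiguous cases from being forced into a single category.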

Related Concepts

Natural Language Processing

Machine Learning

Intent Classification

Text Preprocessing