Machine Learning-Powered Search Ranking of Airbnb Experiences

Mihajlo Grbovic

How we built and iterated on a machine learning Search Ranking platform for a new two-sided marketplace and how we helped it grow.

Airbnb


Apache, Java, JSON, Machine Learning

Overview

The article describes how Airbnb built and iterated on a machine learning-powered search ranking platform for Airbnb Experiences, tracing its evolution from a simple model trained on a small dataset to a more complex one that incorporates personalization and online scoring. It highlights the importance of adapting the ranking model to each stage of marketplace growth and reports the impact of various features on booking rates.

What You'll Learn

1. How to implement a machine learning ranking model using Gradient Boosted Decision Trees

2. Why personalization is crucial for improving search ranking in diverse marketplaces

3. When to transition from offline to online scoring for real-time ranking

4. How to incorporate business rules into a ranking model to promote quality

Prerequisites & Requirements

Understanding of machine learning concepts and ranking algorithms
Familiarity with Airflow for managing data pipelines (optional)

Key Questions Answered

How did Airbnb improve its search ranking for Experiences?

Airbnb improved its search ranking for Experiences by developing a machine learning model that adapts to different growth stages of the marketplace. Initially, they used a simple random re-ranking method, then transitioned to a model based on user interactions, and later incorporated personalization and online scoring to enhance the relevance of search results.
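The difference between offline and online scoring can be sketched roughly as follows. This is an illustration under stated assumptions, not Airbnb's implementation: `model_score`, its weights, and the feature names `ctr` and `query_match` are hypothetical stand-ins for a trained model and its inputs.

```python
def model_score(features):
    """Stand-in for a trained ranking model (weights are made up)."""
    return 0.7 * features.get("ctr", 0.0) + 0.3 * features.get("query_match", 0.0)

def build_offline_scores(experiences):
    """Batch job: score every Experience once, for example daily."""
    return {exp_id: model_score(f) for exp_id, f in experiences.items()}

def rank_offline(score_table, candidate_ids):
    """Request time: only a table lookup and a sort, no model call."""
    return sorted(candidate_ids, key=lambda e: score_table.get(e, 0.0), reverse=True)

def rank_online(experiences, candidate_ids, query_features):
    """Request time: merge query-dependent features, then call the model."""
    def score(exp_id):
        merged = {**experiences[exp_id], **query_features.get(exp_id, {})}
        return model_score(merged)
    return sorted(candidate_ids, key=score, reverse=True)
```

Offline scoring keeps request handling cheap but can only use features known at batch time. Online scoring pays a model call per request in exchange for being able to use features that depend on the query itself.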

What features were used to rank Airbnb Experiences?

The ranking model utilized various features including experience duration, price, category, reviews, number of bookings, and click-through rates. These features were critical in determining the ranking of experiences based on user interactions and preferences.
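As a rough sketch of how features like these might be assembled into a model input, assuming a hypothetical `Experience` record (the field names and the category vocabulary are illustrative, not Airbnb's schema):

```python
from dataclasses import dataclass
from typing import List

# Hypothetical record; field names are illustrative assumptions.
@dataclass
class Experience:
    duration_hours: float
    price_usd: float
    category: str
    num_reviews: int
    num_bookings: int
    impressions: int
    clicks: int

CATEGORIES = ["food", "music", "sports"]  # assumed category vocabulary

def to_features(e: Experience) -> List[float]:
    """Flatten one Experience into a numeric feature vector."""
    # Click-through rate, guarded against zero impressions.
    ctr = e.clicks / e.impressions if e.impressions else 0.0
    # One-hot encode the category so a tree model can split on it.
    one_hot = [1.0 if e.category == c else 0.0 for c in CATEGORIES]
    return [e.duration_hours, e.price_usd, float(e.num_reviews),
            float(e.num_bookings), ctr] + one_hot
```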

What was the impact of personalization on booking rates?

The introduction of personalization in the ranking model led to a 7.9% increase in bookings compared to the previous model. This highlights the effectiveness of tailoring search results to individual user preferences and behaviors.

How does the ranking model handle business rules?

The ranking model incorporates business rules by promoting quality experiences based on user feedback and ratings. This approach ensures that high-quality experiences are prioritized, leading to better user retention and satisfaction.
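One way a quality rule like this could be folded into training is through per-example weights. The sketch below assumes that approach; the function name, rating thresholds, and multipliers are all hypothetical and chosen only for illustration.

```python
def training_weight(label, avg_rating, high=4.8, low=3.5):
    """Return a sample weight for one training example.

    Thresholds and multipliers are illustrative assumptions.
    """
    if label == 1 and avg_rating >= high:
        return 2.0  # up-weight bookings of highly rated Experiences
    if label == 1 and avg_rating < low:
        return 0.5  # down-weight bookings of poorly rated Experiences
    return 1.0      # everything else keeps the default weight
```

Weights of this kind would typically be passed to the training routine alongside the labels, so the model is rewarded more for ranking well-reviewed bookings highly.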

Key Statistics & Figures

Increase in bookings from Stage 1 ML model

+13%

This improvement was observed during an A/B test comparing the Stage 1 ML model to random re-ranking.

Training dataset size for Stage 1 model

50,000 examples

This dataset was built from logged user interactions, specifically the clicks of users whose searches ended in a booking.
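One plausible way to turn such logs into training labels is sketched below, under the assumption that a booked Experience counts as a positive example and a clicked-but-not-booked one as a negative. The log format and the function name are hypothetical.

```python
def label_examples(search_log):
    """Build binary labels from a search log.

    search_log: list of (user_id, experience_id, event) tuples,
    where event is "click" or "book". Returns {(user, exp): label}.
    """
    # Keep only users who eventually booked something.
    booked_users = {u for u, _, ev in search_log if ev == "book"}
    labels = {}
    for user, exp, event in search_log:
        if user not in booked_users:
            continue
        key = (user, exp)
        if event == "book":
            labels[key] = 1            # booked: positive
        else:
            labels.setdefault(key, 0)  # clicked only: negative
    return labels
```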

Increase in bookings from personalization features

+7.9%

This increase was noted when comparing the new personalization model to the previous Stage 1 model.

Growth in Experiences from launch to end of 2018

20,000+ active Experiences

Airbnb Experiences grew from 500 Experiences at launch to over 20,000 active Experiences by the end of 2018, in just two years.

Technologies & Tools


Machine Learning

Gradient Boosted Decision Tree

Used for training the ranking model in Stage 1 and beyond.
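To make the technique concrete, here is a toy gradient boosting loop in plain Python. It is a minimal sketch, not the production model: it boosts depth-1 trees (stumps) on log-loss for a binary booked/not-booked label, and all function names and the learning rate are assumptions made for illustration. A real system would use an established library rather than this code.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(X, residuals):
    """Find the one-feature split that minimises squared error on residuals."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue  # a split must put rows on both sides
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    return None if best is None else best[1:]

def raw_score(stumps, row):
    """Sum of leaf values over all stumps (the model's log-odds)."""
    return sum(lv if row[j] <= t else rv for j, t, lv, rv in stumps)

def train_gbdt(X, y, n_rounds=20, lr=0.5):
    stumps = []
    for _ in range(n_rounds):
        probs = [sigmoid(raw_score(stumps, row)) for row in X]
        # Negative gradient of log-loss with respect to the raw score.
        residuals = [yi - p for yi, p in zip(y, probs)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break  # no valid split left
        j, t, lv, rv = stump
        stumps.append((j, t, lr * lv, lr * rv))  # shrink by learning rate
    return stumps

def predict(stumps, row):
    """Predicted booking probability, usable as a ranking score."""
    return sigmoid(raw_score(stumps, row))
```

Candidates can then be ranked by sorting on `predict(stumps, features)` in descending order.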

Data Pipeline Management

Airflow

Utilized for managing the ranking pipeline and scheduling daily training and scoring tasks.

Data Visualization

Apache Superset

Used to create dashboards for monitoring and explaining ranking changes.

Key Actionable Insights

1. Implementing a machine learning ranking model can significantly enhance the discoverability of products in a marketplace.
As demonstrated by Airbnb, using a model that adapts to user interactions and preferences can lead to improved booking rates and user satisfaction.

2. Personalization features should be carefully engineered to avoid data leakage during model training.
Computing these features only from interactions that happened before the booking being predicted keeps information about the label out of the model's inputs and makes offline results more trustworthy.

3. Regularly monitor and explain ranking changes to maintain transparency with hosts.
Providing insights into how rankings are determined can help hosts understand the factors affecting their visibility and encourage them to improve their offerings.
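The second insight above can be illustrated with a small sketch. It computes one hypothetical personalization feature, the share of a user's clicks that fall in a given category, using only clicks that happened before the booking timestamp. The function name and the `(timestamp, category)` click format are assumptions.

```python
from collections import Counter

def category_click_share(clicks, category, booking_ts):
    """Share of a user's clicks on `category`, using only clicks
    that happened strictly before the booking timestamp.

    clicks: list of (timestamp, category) tuples.
    """
    # Drop anything at or after the booking to avoid leaking the label.
    prior = [c for ts, c in clicks if ts < booking_ts]
    if not prior:
        return 0.0
    return Counter(prior)[category] / len(prior)
```

Filtering on the timestamp matters because clicks made at or after booking time tend to concentrate on the booked Experience, which would let the model see the outcome it is supposed to predict.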

Common Pitfalls

1. Overfitting the model to small datasets can lead to poor generalization.

Using complex models with limited data often results in a model that performs well on training data but fails in real-world scenarios. It's crucial to match model complexity to the available data.

2. Ignoring the importance of feature engineering can diminish model performance.

Without carefully selecting and engineering features based on user behavior and preferences, the model may not capture the necessary signals to improve ranking effectively.

Related Concepts

Machine Learning In E-commerce

Personalization Techniques In Search Engines

Data-driven Decision Making In Marketplaces