Airbnb at KDD 2023

KDD (Knowledge Discovery and Data Mining) is a flagship conference in data science research. Hosted annually by a special interest group of the…

Alex Deng
11 min read · advanced

Overview

This article summarizes Airbnb's contributions to the KDD 2023 conference, covering research in deep learning, online experimentation, and causal inference. It describes two accepted papers, several presentations, and new methods aimed at improving search ranking and marketing strategy.

What You'll Learn

1. How to optimize search ranking using multi-task deep learning models
2. Why variance reduction techniques are crucial for effective online experimentation
3. How to apply causal inference to improve marketing channel effectiveness
4. How to implement session-based logging for enhanced user journey tracking

Key Questions Answered

What is the Journey Ranker model and how does it improve search ranking?
The Journey Ranker is a multi-task deep learning model designed to optimize the user journey in long-term searches. It incorporates modules that help guests achieve positive milestones and avoid negative ones, utilizing a shared feature representation to enhance search ranking effectiveness.
What methods were presented for variance reduction in online experimentation?
The article discusses two methods for variance reduction: a model-based leading indicator metric that estimates progress towards delayed outcomes and a counterfactual treatment exposure index that quantifies user impact. Both methods achieved a variance reduction of 50% or more, significantly improving experimentation efficiency.
How does Airbnb address multicollinearity in marketing analysis?
Airbnb's approach involves hierarchically clustering Designated Marketing Areas (DMAs) based on marketing impressions. This method reduced cross-channel correlation by up to 43%, allowing for clearer causal inference and improved interpretability of marketing channel effects.
What are the key features of Airbnb's Onebrain data science infrastructure?
Onebrain is a data science reproducibility and reuse solution that uses YAML for project configuration. It abstracts CI/CD processes and allows for version-controlled project management. Within a year it was adopted at Airbnb by over 500 users across more than 200 distinct projects.
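The DMA clustering described in the answers above can be illustrated with a small agglomerative clustering sketch. The article does not say which linkage or distance Airbnb used, so this example assumes average linkage over Euclidean distance. The DMA names and impression shares are invented for illustration only.

```python
from itertools import combinations
from math import dist


def average_linkage(cluster_a, cluster_b, points):
    """Mean pairwise Euclidean distance between members of two clusters."""
    total = sum(dist(points[i], points[j]) for i in cluster_a for j in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def agglomerative_clusters(points, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[name] for name in sorted(points)]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(
                clusters[pair[0]], clusters[pair[1]], points
            ),
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Hypothetical DMAs with made-up impression shares across three channels.
impression_share = {
    "dma_a": (0.70, 0.20, 0.10),
    "dma_b": (0.68, 0.22, 0.10),
    "dma_c": (0.10, 0.30, 0.60),
    "dma_d": (0.12, 0.28, 0.60),
    "dma_e": (0.33, 0.34, 0.33),
}

clusters = agglomerative_clusters(impression_share, n_clusters=3)
print(clusters)
```

DMAs with a similar channel mix end up in the same group. How those groups are then used to reduce cross-channel correlation is not covered in this summary.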

Key Statistics & Figures

Variance reduction achieved
50% or more
This statistic applies to the two new methods presented for online experimentation at KDD 2023.
Reduction in cross-channel correlation
up to 43%
This reduction was achieved through hierarchical clustering of Designated Marketing Areas in marketing analysis.
Distinct projects using Onebrain
over 200
This reflects the rapid adoption of the Onebrain infrastructure within Airbnb's data science teams.
Users of Onebrain
over 500
This number was reached within a year of Onebrain's adoption at Airbnb.
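To put the variance reduction figure above in practical terms, the standard two-sample approximation for required sample size is a useful guide. This is a textbook approximation, not a formula from the article. The symbols are assumptions introduced here: \sigma^2 is the metric variance, \delta the minimum detectable effect, \alpha and \beta the error rates, and r the fraction of variance removed.

```latex
n \approx \frac{2\,(z_{1-\alpha/2} + z_{1-\beta})^{2}\,\sigma^{2}}{\delta^{2}},
\qquad
n' = (1 - r)\,n,
\qquad
r = 0.5 \;\Rightarrow\; n' = \tfrac{1}{2}\,n
```

Because the required sample size scales linearly with variance, a 50% variance reduction roughly halves the traffic needed to detect the same effect at the same power.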

Technologies & Tools

Data Science Infrastructure
Onebrain
Used for reproducibility and reuse of data science projects at Airbnb.

Key Actionable Insights

1. Implementing the Journey Ranker model can significantly enhance user experience by optimizing search outcomes based on user milestones.
   This model is particularly effective in long-term search scenarios, making it ideal for platforms that require nuanced search ranking adjustments.
2. Utilizing variance reduction techniques in online experiments can lead to more reliable data-driven decisions.
   By applying these methods, organizations can better understand the impact of changes in user experience, especially in environments with infrequent bookings.
3. Hierarchical clustering can be a powerful tool to address multicollinearity in marketing data.
   This technique not only clarifies the impact of individual marketing channels but also enhances the overall interpretability of the analysis.
4. Adopting session-based logging can improve tracking of user interactions and behaviors across platforms.
   This method allows for a more comprehensive understanding of user journeys, which is essential for optimizing user experience and engagement.
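The article does not describe how Airbnb's session-based logging works, so the following is only a generic sketch of one common approach: grouping a user's events into sessions separated by an inactivity gap. The 30-minute timeout, the function name, and the event shape are all assumptions made for illustration.

```python
SESSION_GAP_SECONDS = 30 * 60  # assumed inactivity timeout, not from the article


def assign_sessions(events, gap=SESSION_GAP_SECONDS):
    """Label each (user_id, timestamp_seconds) event with a per-user session index.

    A new session starts when the gap since the user's previous event
    exceeds `gap` seconds.
    """
    last_seen = {}
    session_index = {}
    labeled = []
    for user_id, ts in sorted(events):
        if user_id not in last_seen:
            session_index[user_id] = 0
        elif ts - last_seen[user_id] > gap:
            session_index[user_id] += 1
        last_seen[user_id] = ts
        labeled.append((user_id, ts, session_index[user_id]))
    return labeled


# Hypothetical events: two users, timestamps in seconds.
events = [("u1", 600), ("u2", 100), ("u1", 0), ("u1", 4000)]
sessions = assign_sessions(events)
print(sessions)
```

Here the third event for `u1` arrives more than 30 minutes after the second, so it opens a new session for that user.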

Common Pitfalls

1. Failing to account for interference bias in A/B testing can lead to misleading results.
   Interference occurs when different test groups influence each other's outcomes. It can be mitigated through clustering or switchbacks.
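As a rough illustration of the switchback idea mentioned above, the sketch below assigns a whole time window to one arm, so that everyone active in that window shares the same experience. The window length, the salt, and the function name are hypothetical. The article does not describe Airbnb's actual assignment scheme.

```python
import hashlib


def switchback_arm(timestamp_seconds, window_seconds=3600, salt="exp-001"):
    """Return the arm for the time window containing the given timestamp.

    The arm depends only on the window index and the salt, so every request
    in the same window gets the same arm, and the assignment is reproducible.
    """
    window = timestamp_seconds // window_seconds
    digest = hashlib.sha256(f"{salt}:{window}".encode()).digest()
    return "treatment" if digest[0] % 2 else "control"


# All timestamps within the first hour map to the same arm.
first_hour_arms = {switchback_arm(t) for t in (0, 10, 1800, 3599)}
print(first_hour_arms)
```

Hashing the window index rather than drawing a fresh random number keeps the schedule deterministic, which makes it easy to recompute assignments at analysis time.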

Related Concepts

Deep Learning in Search Ranking
Online Experimentation Methodologies
Causal Inference Techniques in Marketing
Data Science Infrastructure and Reproducibility