Airbnb at KDD 2023

Alex Deng

KDD (Knowledge Discovery and Data Mining) is a flagship conference in data science research. Hosted annually by a special interest group of the…

Machine Learning, YAML

Overview

The article summarizes Airbnb's contributions to the KDD 2023 conference, covering research in deep learning, online experimentation, and causal inference. It describes two accepted papers, multiple presentations, and new methodologies aimed at improving search ranking and marketing strategies.

What You'll Learn

1. How to optimize search ranking using multi-task deep learning models
2. Why variance reduction techniques are crucial for effective online experimentation
3. How to apply causal inference to improve marketing channel effectiveness
4. How to implement session-based logging for enhanced user journey tracking

Key Questions Answered

What is the Journey Ranker model and how does it improve search ranking?

The Journey Ranker is a multi-task deep learning model designed to optimize the user journey in long-term searches. It incorporates modules that help guests achieve positive milestones and avoid negative ones, utilizing a shared feature representation to enhance search ranking effectiveness.
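The idea of a shared feature representation feeding several milestone-specific modules can be sketched as follows. This is a minimal illustration, not the Journey Ranker architecture itself: the milestone names (`click`, `booking`, `cancellation`), the layer sizes, the untrained random weights, and the rule for combining head outputs into one score are all assumptions made for the example.

```python
import math
import random


def relu(values):
    return [max(0.0, v) for v in values]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(weights, bias, values):
    # weights holds one row per output unit
    return [sum(w * v for w, v in zip(row, values)) + b
            for row, b in zip(weights, bias)]


class MultiTaskRanker:
    """Toy scorer: one shared hidden layer, one sigmoid head per milestone."""

    def __init__(self, n_features, n_hidden, positive_tasks, negative_tasks, seed=0):
        rng = random.Random(seed)

        def rand_row(n):
            return [rng.uniform(-0.5, 0.5) for _ in range(n)]

        # Shared representation used by every head (weights are untrained).
        self.shared_w = [rand_row(n_features) for _ in range(n_hidden)]
        self.shared_b = [0.0] * n_hidden
        self.positive_tasks = list(positive_tasks)
        self.negative_tasks = list(negative_tasks)
        self.heads = {task: (rand_row(n_hidden), 0.0)
                      for task in self.positive_tasks + self.negative_tasks}

    def predict(self, features):
        hidden = relu(dense(self.shared_w, self.shared_b, features))
        return {task: sigmoid(sum(w * h for w, h in zip(ws, hidden)) + b)
                for task, (ws, b) in self.heads.items()}

    def score(self, features):
        # Assumed combination rule: reward positive milestones, penalize negative ones.
        probs = self.predict(features)
        return (sum(probs[t] for t in self.positive_tasks)
                - sum(probs[t] for t in self.negative_tasks))


ranker = MultiTaskRanker(n_features=4, n_hidden=8,
                         positive_tasks=["click", "booking"],
                         negative_tasks=["cancellation"])

# Hypothetical listings with made-up feature vectors.
listings = {
    "listing_1": [0.2, 0.5, 0.1, 0.9],
    "listing_2": [0.8, 0.1, 0.4, 0.3],
    "listing_3": [0.5, 0.5, 0.5, 0.5],
}
ranked = sorted(listings, key=lambda name: ranker.score(listings[name]), reverse=True)
```

Because every head reads the same hidden layer, a signal learned for one milestone is available to the others, which is the usual motivation for sharing a representation in multi-task models.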

What methods were presented for variance reduction in online experimentation?

The article discusses two methods for variance reduction: a model-based leading indicator metric that estimates progress towards delayed outcomes and a counterfactual treatment exposure index that quantifies user impact. Both methods achieved a variance reduction of 50% or more, significantly improving experimentation efficiency.
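To see why a model-based indicator can be less noisy than the delayed outcome it stands in for, consider the simulation below. It is a sketch under strong assumptions, not the method from the article: each simulated user has a true conversion probability `p`, the outcome is a coin flip with that probability, and the "model" is assumed to be perfectly calibrated, so the indicator equals `p`. The Beta(1, 19) distribution and the user count are arbitrary choices for illustration.

```python
import random
import statistics


def simulate(n_users=20000, seed=7):
    rng = random.Random(seed)
    outcomes, indicators = [], []
    for _ in range(n_users):
        p = rng.betavariate(1, 19)  # rare conversions, mean 0.05 (assumed)
        outcomes.append(1.0 if rng.random() < p else 0.0)  # delayed binary outcome
        indicators.append(p)  # idealized model prediction of that outcome
    return outcomes, indicators


outcomes, indicators = simulate()
var_outcome = statistics.pvariance(outcomes)
var_indicator = statistics.pvariance(indicators)

# Fraction of the outcome's variance removed by switching to the indicator.
reduction = 1.0 - var_indicator / var_outcome
```

By the law of total variance, Var(Y) = Var(p) + E[p(1 - p)], so the indicator's variance can never exceed the outcome's in this idealized setup. A real model is imperfect, so the achievable reduction depends on how well its predictions track the eventual outcome.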

How does Airbnb address multicollinearity in marketing analysis?

Airbnb's approach involves hierarchically clustering Designated Marketing Areas (DMAs) based on marketing impressions. This method reduced cross-channel correlation by up to 43%, allowing for clearer causal inference and improved interpretability of marketing channel effects.
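The clustering step can be illustrated with a small agglomerative (bottom-up hierarchical) routine. This is a sketch only: the DMA names, the impression-share vectors, the Euclidean distance, the average-linkage rule, and the target of three clusters are all assumptions, since the summary does not say which choices Airbnb made.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(cluster_a, cluster_b, points):
    # Mean pairwise distance between members of the two clusters.
    dists = [euclidean(points[i], points[j]) for i in cluster_a for j in cluster_b]
    return sum(dists) / len(dists)


def agglomerative_clusters(points, n_clusters):
    # Start with one cluster per point and merge the closest pair repeatedly.
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], points)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Hypothetical share of impressions per channel (three channels) for each DMA.
impressions = {
    "dma_a": [0.70, 0.20, 0.10],
    "dma_b": [0.68, 0.22, 0.10],
    "dma_c": [0.10, 0.75, 0.15],
    "dma_d": [0.12, 0.70, 0.18],
    "dma_e": [0.20, 0.15, 0.65],
}
names = list(impressions)
points = [impressions[name] for name in names]
groups = [sorted(names[i] for i in cluster)
          for cluster in agglomerative_clusters(points, n_clusters=3)]
```

With these made-up inputs, DMAs that receive a similar mix of impressions end up in the same group, so each group can be analyzed as one unit.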

What are the key features of Airbnb's Onebrain data science infrastructure?

Onebrain is a data science reproducibility and reuse solution that uses YAML for project configuration. It abstracts CI/CD processes and allows for version-controlled project management. Within a year at Airbnb, it supported over 200 distinct projects and over 500 users.

Key Statistics & Figures

Variance reduction achieved: 50% or more
This statistic applies to the two new methods presented for online experimentation at KDD 2023.

Reduction in cross-channel correlation: up to 43%
This reduction was achieved through hierarchical clustering of Designated Marketing Areas in marketing analysis.

Distinct projects using Onebrain: over 200
This reflects the rapid adoption of the Onebrain infrastructure within Airbnb's data science teams.

Users of Onebrain: over 500
This number indicates the growing engagement with the Onebrain data science infrastructure within a year.

Technologies & Tools

Data Science Infrastructure

Onebrain

Used for reproducibility and reuse of data science projects at Airbnb.

Key Actionable Insights

1. Implementing the Journey Ranker model can significantly enhance user experience by optimizing search outcomes based on user milestones. This model is particularly effective in long-term search scenarios, making it ideal for platforms that require nuanced search ranking adjustments.

2. Utilizing variance reduction techniques in online experiments can lead to more reliable data-driven decisions. By applying these methods, organizations can better understand the impact of changes in user experience, especially in environments with infrequent bookings.

3. Hierarchical clustering can be a powerful tool to address multicollinearity in marketing data. This technique not only clarifies the impact of individual marketing channels but also enhances the overall interpretability of the analysis.

4. Adopting session-based logging can improve tracking of user interactions and behaviors across platforms. This method allows for a more comprehensive understanding of user journeys, which is essential for optimizing user experience and engagement.
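One common way to attach session information to logged events is to start a new session whenever a user has been inactive for longer than some threshold. The sketch below assumes that approach; the 30-minute gap, the event tuple layout, and the function name `assign_sessions` are hypothetical and are not taken from the article.

```python
from datetime import datetime, timedelta


def assign_sessions(events, gap=timedelta(minutes=30)):
    """Tag each (user_id, timestamp, action) event with a per-user session index."""
    last_seen = {}
    session_index = {}
    tagged = []
    for user_id, ts, action in sorted(events, key=lambda e: (e[0], e[1])):
        if user_id not in last_seen:
            session_index[user_id] = 0  # first event from this user
        elif ts - last_seen[user_id] > gap:
            session_index[user_id] += 1  # long pause: open a new session
        last_seen[user_id] = ts
        tagged.append((user_id, session_index[user_id], ts, action))
    return tagged


# Made-up events for two users.
events = [
    ("u1", datetime(2023, 8, 6, 10, 0), "search"),
    ("u1", datetime(2023, 8, 6, 10, 10), "view_listing"),
    ("u2", datetime(2023, 8, 6, 10, 5), "search"),
    ("u1", datetime(2023, 8, 6, 11, 30), "search"),
]
logged = assign_sessions(events)
```

Grouping events this way makes it possible to follow one user's journey within a visit and to compare behavior across visits.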

Common Pitfalls

1. Failing to account for interference bias in A/B testing can lead to misleading results. This occurs when different test groups influence each other's outcomes, which can be mitigated through clustering or switchbacks.
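Cluster-level randomization can be sketched as follows: every unit in the same cluster receives the same arm, so units that are likely to influence each other are not split across treatment and control. The hashing scheme, the experiment name, and the cluster labels below are assumptions for illustration, not a description of Airbnb's experimentation platform.

```python
import hashlib


def assign_arm(cluster_id, experiment, treatment_share=0.5):
    # Hash the experiment and cluster together so assignment is stable
    # across calls and independent between experiments.
    digest = hashlib.sha256(f"{experiment}:{cluster_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # uniform value in [0, 1)
    return "treatment" if bucket < treatment_share else "control"


def assign_users(user_to_cluster, experiment):
    # Each user inherits the arm of their cluster.
    return {user: assign_arm(cluster, experiment)
            for user, cluster in user_to_cluster.items()}


# Hypothetical mapping of users to clusters.
user_to_cluster = {
    "u1": "cluster_north",
    "u2": "cluster_north",
    "u3": "cluster_south",
    "u4": "cluster_south",
}
arms = assign_users(user_to_cluster, experiment="ranking_test")
```

The trade-off is fewer independent randomization units, so cluster-randomized tests generally need more traffic or longer run times than user-level tests to reach the same precision.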

Related Concepts

Deep Learning In Search Ranking

Online Experimentation Methodologies

Causal Inference Techniques In Marketing

Data Science Infrastructure And Reproducibility