Scaling Contextual Conversation Suggestions Over 500 Million Members

Haiyang Liu
13 min read · intermediate

Overview

The article discusses the engineering challenges and solutions involved in scaling contextual conversation suggestions for LinkedIn's messaging platform, which serves over 500 million members. It details the use of the Economic Graph to generate recommendations and the iterative approach taken to optimize performance, liquidity, and latency.

What You'll Learn

1. How to frame conversation suggestions as a graph search problem in a messaging system
2. Why optimizing for latency is critical in user-facing applications
3. When to use hybrid (offline plus online) solutions for online recommendation systems

Prerequisites & Requirements

  • Understanding of graph theory and recommendation systems
  • Experience with large-scale data processing frameworks like Hadoop (optional)

Key Questions Answered

How does LinkedIn generate contextual conversation suggestions?
LinkedIn generates contextual conversation suggestions using its Economic Graph, which represents members and companies as nodes and their relationships as weighted edges. The system recommends connections based on the strength of these relationships, allowing users to engage meaningfully with their network.
What challenges did LinkedIn face while scaling its recommendation system?
LinkedIn faced challenges related to liquidity, cost to serve (C2S), and latency. Initially, they achieved only 20% liquidity with high storage costs and unacceptable latency. The article discusses how they iteratively improved these metrics through various engineering solutions.
What was the impact of using indirect connections on recommendation liquidity?
By including indirect connections in their recommendations, LinkedIn increased liquidity from 30% to 70%. This approach allowed for a broader set of recommendations, enhancing user engagement and connection opportunities.
How did LinkedIn optimize the performance of its recommendation system?
LinkedIn optimized performance by pre-computing affinity scores for member-company pairs and using a hybrid approach that combined offline and online computations. This reduced latency and improved the overall user experience.
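The idea of recommending over a weighted graph, and of widening the search to indirect connections, can be sketched on a toy example. This is a minimal illustration, not LinkedIn's implementation: the `CONNECTIONS` and `EMPLOYER` data, the `suggest` function, and the choice to score a two-hop path as the product of its edge weights are all assumptions made for the example.

```python
# Hypothetical toy graph: member -> {connection: relationship strength}.
CONNECTIONS = {
    "alice": {"bob": 0.9, "carol": 0.4},
    "bob": {"alice": 0.9, "dave": 0.7},
    "carol": {"alice": 0.4, "erin": 0.8},
    "dave": {"bob": 0.7},
    "erin": {"carol": 0.8},
}
# Hypothetical member -> company edges.
EMPLOYER = {"bob": "Acme", "carol": "Globex", "dave": "Initech", "erin": "Initech"}


def suggest(member, company, include_indirect=False):
    """Rank people at `company` reachable from `member`, strongest path first."""
    direct = CONNECTIONS.get(member, {})
    scores = {}
    for first, w1 in direct.items():
        # Direct connection who works at the target company.
        if EMPLOYER.get(first) == company:
            scores[first] = max(scores.get(first, 0.0), w1)
        if not include_indirect:
            continue
        for second, w2 in CONNECTIONS.get(first, {}).items():
            if second == member or second in direct:
                continue  # skip the member and their direct connections
            if EMPLOYER.get(second) == company:
                # Assumed path score: product of the two edge weights.
                scores[second] = max(scores.get(second, 0.0), w1 * w2)
    # Sort by descending score, breaking ties by name for determinism.
    return sorted(scores, key=lambda m: (-scores[m], m))


print(suggest("alice", "Initech"))                         # []
print(suggest("alice", "Initech", include_indirect=True))  # ['dave', 'erin']
```

With direct connections only, the query for "Initech" returns nothing. Allowing one extra hop produces two candidates, which is the kind of effect the liquidity increase above describes.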

Key Statistics & Figures

Initial liquidity achieved
20%
This was the liquidity rate before optimizing the recommendation system.
Final latency achieved
460 ms
This was the 99th percentile latency after implementing offline computations for affinity scores.
Total members served
500 million
This is the scale at which LinkedIn operates its messaging platform.
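For readers who want to track similar figures in their own system, the two metrics can be computed as in the sketch below. It rests on assumptions: liquidity is taken here to mean the share of requests that return at least one suggestion, the percentile uses the nearest-rank method, and the sample values are made up for the example rather than taken from the article.

```python
import math


def liquidity(results):
    """Share of requests with at least one suggestion (assumed definition)."""
    if not results:
        return 0.0
    return sum(1 for r in results if r) / len(results)


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples <= it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Made-up data: five requests, two of which returned suggestions.
print(liquidity([["bob"], [], [], ["dave", "erin"], []]))  # 0.4

# Made-up latencies in milliseconds: 98 fast requests and two slow ones.
latencies_ms = [100] * 98 + [300, 900]
print(percentile(latencies_ms, 99))  # 300
```

The 99th percentile ignores only the slowest 1% of requests, so a single slow outlier (900 ms here) does not move it, while a slightly larger tail would.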

Technologies & Tools

Data Processing
Hadoop
Used for offline computation of recommendations and affinity scores.
Backend Service
Graph Service
Facilitates graph search queries over the Economic Graph to compute recommendations.

Key Actionable Insights

1. Implementing a hybrid solution can significantly enhance the performance of recommendation systems.
   By pre-computing certain data offline, you can reduce the load on real-time systems, improving response times and user satisfaction.
2. Regularly assess the liquidity of your recommendations to ensure user engagement.
   If liquidity is low, consider expanding the criteria for recommendations to include indirect connections, which can increase the number of relevant suggestions.
3. Focus on optimizing latency to improve user experience in real-time applications.
   Users expect quick responses; thus, ensuring that your system can deliver recommendations within acceptable timeframes is crucial for maintaining engagement.
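The hybrid pattern from the first insight can be sketched as two functions: an offline step that aggregates raw signals into an affinity table, and an online step that only performs lookups and a small sort. Everything here is hypothetical, including the function names, the event format, and the choice to sum weights into a score; the article does not describe how the affinity scores are actually computed or stored.

```python
def build_affinity_table(interactions):
    """Offline step: aggregate (member, company, weight) events into scores."""
    table = {}
    for member, company, weight in interactions:
        key = (member, company)
        table[key] = table.get(key, 0.0) + weight
    return table


def top_companies(table, member, candidate_companies, k=3):
    """Online step: look up precomputed scores; no aggregation at request time."""
    scored = [(table.get((member, c), 0.0), c) for c in candidate_companies]
    scored = [(s, c) for s, c in scored if s > 0]
    # Highest score first, company name as a deterministic tie-breaker.
    scored.sort(key=lambda sc: (-sc[0], sc[1]))
    return [c for _, c in scored[:k]]


# Made-up interaction events, as a batch job might read them.
events = [
    ("alice", "Acme", 1.0),
    ("alice", "Globex", 0.5),
    ("alice", "Acme", 2.0),
    ("bob", "Acme", 1.0),
]
table = build_affinity_table(events)
print(top_companies(table, "alice", ["Globex", "Acme", "Initech"]))  # ['Acme', 'Globex']
```

The request path touches only as many table entries as there are candidates, which is why moving the aggregation offline tends to lower latency. The trade-off is that the scores are only as fresh as the last offline run.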

Common Pitfalls

1. Relying too heavily on massive joins in data processing can lead to performance bottlenecks.
   Large joins are prone to data skew: a few heavily populated keys concentrate most of the rows in a handful of tasks, which then run far longer than the rest. To avoid this, consider breaking the join into smaller, more manageable operations.
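One common way to break up a skewed join is key salting, shown below in plain Python. The source does not say which technique LinkedIn used, so treat this as an illustrative assumption: `hash_join`, `salted_join`, and the sample rows are all hypothetical. Each left-side row gets a salt appended to its key, and each right-side row is replicated once per salt, so a hot key is split into several smaller groups while the join result stays the same.

```python
from collections import defaultdict


def hash_join(left, right):
    """Join (key, value) rows; also return the largest per-key group on the left."""
    groups = defaultdict(list)
    for key, value in left:
        groups[key].append(value)
    joined = [(key, lv, rv) for key, rv in right for lv in groups.get(key, [])]
    largest = max((len(v) for v in groups.values()), default=0)
    return joined, largest


def salted_join(left, right, num_salts):
    """Same join, with left keys salted and right rows replicated per salt."""
    salted_left = [((key, i % num_salts), v) for i, (key, v) in enumerate(left)]
    salted_right = [((key, s), v) for key, v in right for s in range(num_salts)]
    joined, largest = hash_join(salted_left, salted_right)
    # Drop the salt so the output matches the unsalted join.
    return [(key, lv, rv) for (key, _salt), lv, rv in joined], largest


# Made-up skewed input: one hot key with 1000 rows, one small key with 10.
left = [("big_co", i) for i in range(1000)] + [("small_co", i) for i in range(10)]
right = [("big_co", "B"), ("small_co", "S")]

plain, plain_max = hash_join(left, right)
salted, salted_max = salted_join(left, right, num_salts=8)
print(plain_max, salted_max)             # 1000 125
print(sorted(plain) == sorted(salted))   # True
```

In a distributed job, each key group is typically handled by a single task, so shrinking the largest group from 1000 rows to 125 is what evens out task runtimes. The cost is that the smaller side of the join is replicated `num_salts` times.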

Related Concepts

Graph Theory
Recommendation Systems
Data Processing Frameworks