The Top 2019 LinkedIn Engineering Blogs

Jaren Anderson

Overview

The article highlights the top ten engineering blogs from LinkedIn in 2019, focusing on popular topics such as open source, artificial intelligence, and technical challenges at scale. It provides insights into various engineering practices and innovations that LinkedIn has implemented over the year.

What You'll Learn

1. How to implement Access Control Lists (ACLs) for data security at scale
2. Why a plug-and-play ecosystem improves machine learning productivity
3. How to optimize feed algorithms using machine learning techniques
4. When to use Apache Kafka for handling large-scale messaging

Key Questions Answered

How does LinkedIn manage data access with ACLs?
LinkedIn employs Access Control Lists (ACLs) to ensure that data is accessed only when there is a valid business case. The blog post discusses techniques like caching and centralized control of ACLs to efficiently manage the large number of ACLs as the platform scales.
What is the purpose of Pro-ML at LinkedIn?
Pro-ML is designed to improve the efficiency of machine learning work at LinkedIn by creating a standardized, plug-and-play ecosystem. This program automates key components of machine learning processes, making them more accessible to a wider range of engineers.
What innovations have been introduced in Pinot since its open sourcing?
Since being open sourced, Pinot has gained several new capabilities, including a filesystem abstraction that lets operators plug in their preferred storage backend, and new indexing techniques. These enhancements aim to improve real-time analytics capabilities.
How does LinkedIn optimize its feed for user engagement?
LinkedIn uses a two-pass architecture to optimize its feed: first-pass rankers generate candidate posts, and second-pass rankers score those candidates. This approach allows for multi-objective optimization while maintaining low latency.
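
The two-pass idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not LinkedIn's implementation: the `Post` fields, the per-source cutoff, and the objective weights are all hypothetical. The first pass uses a cheap score to trim each source's inventory to a few candidates, and the second pass applies a weighted, multi-objective score to that much smaller set.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Post:
    post_id: str
    source: str
    cheap_score: float  # hypothetical inexpensive relevance estimate (first pass)
    p_click: float      # hypothetical predicted click probability (second pass)
    p_share: float      # hypothetical predicted share probability (second pass)


def first_pass(posts_by_source: Dict[str, List[Post]], per_source: int) -> List[Post]:
    """Keep the top `per_source` posts from each source, ranked by the cheap score."""
    candidates: List[Post] = []
    for posts in posts_by_source.values():
        ranked = sorted(posts, key=lambda p: p.cheap_score, reverse=True)
        candidates.extend(ranked[:per_source])
    return candidates


def second_pass(candidates: List[Post], weights: Dict[str, float], top_k: int) -> List[Post]:
    """Score every candidate with a weighted sum of objectives and keep the best `top_k`."""
    def score(post: Post) -> float:
        return weights["click"] * post.p_click + weights["share"] * post.p_share

    return sorted(candidates, key=score, reverse=True)[:top_k]


# Made-up inventory, purely for illustration.
inventory = {
    "network": [
        Post("a", "network", 0.9, 0.10, 0.30),
        Post("b", "network", 0.2, 0.90, 0.90),
        Post("e", "network", 0.5, 0.20, 0.05),
    ],
    "jobs": [
        Post("c", "jobs", 0.8, 0.40, 0.05),
        Post("d", "jobs", 0.7, 0.30, 0.02),
    ],
}

candidates = first_pass(inventory, per_source=2)
feed = second_pass(candidates, weights={"click": 1.0, "share": 2.0}, top_k=3)
print([post.post_id for post in feed])
```

Note the trade-off the sketch exposes: post `b` would score highest in the second pass, but it never gets there because the cheap first-pass score filters it out. Keeping the expensive scorer off most of the inventory is what holds latency down, at the cost of occasionally missing a good candidate.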

Key Statistics & Figures

Messages handled by LinkedIn’s Kafka deployments
7 trillion per day
This statistic highlights the scale at which LinkedIn operates its messaging infrastructure, showcasing the need for customized solutions to handle such a volume.

Technologies & Tools

Backend
Kafka
Used for handling large-scale messaging and data streaming at LinkedIn.
Machine Learning
Pro-ML
A framework developed to improve machine learning productivity and accessibility.
Data Storage
Pinot
A scalable, distributed OLAP data store for real-time analytics.
Data Streaming
Brooklin
A service for near real-time data streaming and change data capture.

Key Actionable Insights

1. Implementing ACLs effectively is crucial for maintaining data security as your application scales. Utilize caching techniques to improve access times and reduce load on your ACL management system.
   As applications grow, the number of services and data access requests increases significantly. Efficiently managing ACLs ensures that data remains secure while still being accessible to authorized users.
2. Adopting a standardized machine learning framework like Pro-ML can drastically improve productivity across teams. This approach allows for easier collaboration and faster deployment of machine learning models.
   When teams use disparate systems, it can lead to inefficiencies and difficulties in scaling. A unified framework streamlines processes and enhances the overall quality of machine learning outputs.
3. Utilizing a two-pass ranking system for content feeds can enhance user engagement by delivering more relevant content. This method allows for continuous optimization based on user interactions.
   In competitive environments, providing users with the most relevant content can significantly impact retention and engagement metrics.
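
The caching advice in the first insight can be illustrated with a small time-to-live (TTL) cache in front of a central ACL store. This is a minimal sketch under assumptions, not LinkedIn's system: the class name `CachedAclChecker`, the `fetch_from_central_store` function, the in-memory `CENTRAL_ACLS` dictionary, and the service and dataset names are all hypothetical stand-ins. Decisions are cached per (principal, resource) pair and re-fetched once the TTL expires, so a revoked permission stops being honored after at most one TTL.

```python
import time
from typing import Callable, Dict, List, Tuple


class CachedAclChecker:
    """Caches allow/deny decisions fetched from a central ACL store."""

    def __init__(
        self,
        fetch_decision: Callable[[str, str], bool],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_decision
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps (principal, resource) -> (decision, expiry time).
        self._cache: Dict[Tuple[str, str], Tuple[bool, float]] = {}

    def is_allowed(self, principal: str, resource: str) -> bool:
        key = (principal, resource)
        now = self._clock()
        cached = self._cache.get(key)
        if cached is not None and cached[1] > now:
            return cached[0]  # fresh entry: no call to the central store
        decision = self._fetch(principal, resource)
        self._cache[key] = (decision, now + self._ttl)
        return decision

    def invalidate(self, principal: str, resource: str) -> None:
        """Drop a cached decision, for example after an ACL change."""
        self._cache.pop((principal, resource), None)


# Hypothetical central store: anything not listed is denied by default.
CENTRAL_ACLS: Dict[Tuple[str, str], bool] = {("feed-service", "member-profiles"): True}
lookups: List[Tuple[str, str]] = []


def fetch_from_central_store(principal: str, resource: str) -> bool:
    lookups.append((principal, resource))  # record each trip to the store
    return CENTRAL_ACLS.get((principal, resource), False)


checker = CachedAclChecker(fetch_from_central_store, ttl_seconds=30.0)
first = checker.is_allowed("feed-service", "member-profiles")
second = checker.is_allowed("feed-service", "member-profiles")  # served from cache
denied = checker.is_allowed("ads-service", "member-profiles")
```

The TTL is the main design choice: a longer value removes more load from the central store but delays the effect of a revocation, which is why the sketch also exposes an explicit `invalidate` method.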

Common Pitfalls

1. Failing to standardize machine learning systems can lead to inefficiencies and difficulties in scaling.
   When teams create bespoke systems, it becomes challenging to maintain and integrate them, which can slow down innovation and productivity.