Performance @Scale 2018 recap

More than 275 attendees gathered to hear from performance experts from Alibaba, Facebook, Headspin, LinkedIn, Microsoft, and Netflix to talk about the biggest performance challenges they face at th…

Marty Greenia

Overview

The Performance @Scale 2018 conference, hosted by Facebook, brought together more than 275 attendees to hear performance engineers discuss the challenges of managing performance for large-scale services. Talks covered anomaly detection, scaling web services, and mobile app performance, with speakers from Alibaba, Facebook, Headspin, LinkedIn, Microsoft, and Netflix.

What You'll Learn

1. How to utilize Execution Graphs for performance regression detection in Azure
2. Why understanding app performance in real-world conditions is crucial for developers
3. How to implement robust anomaly detection using ThirdEye for monitoring data
4. When to consider iOS VM and loader interactions to optimize app startup time
5. How to conduct mobile performance testing in real user conditions effectively

Key Questions Answered

What are Execution Graphs and how do they help in performance regression detection?
Execution Graphs are a data correlation and visualization model used in Azure to understand the performance and reliability of VM deployment operations. They help engineers debug failures and measure timing of sub-operations at scale by building complete traces for each VM operation, including timing information.
What is Profilo and how does it assist in understanding app performance?
Profilo is a high-throughput, mobile-first performance tracing library developed at Facebook. It helps developers understand how an app performs in real-world conditions, showing what works and what doesn't so that optimization effort goes where it matters.
How does LinkedIn's ThirdEye platform enhance anomaly detection?
ThirdEye is a generic anomaly detection platform developed by LinkedIn for time series metrics. It provides an end-to-end monitoring experience, including data-driven anomaly detection, alert tuning, and root cause investigation, which is crucial for maintaining performance across LinkedIn pages and apps.
What challenges did Alibaba face during the Global Shopping Festival?
Alibaba faced significant performance challenges during the 11/11 Global Shopping Festival, including bottlenecks in the network layer, scheduling, and distributed file systems. They processed trillions of events in real time, highlighting the need for robust infrastructure to support such scale.
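
The Execution Graphs answer earlier in this list describes building a complete trace for each VM operation, with timing for every sub-operation. The following is a minimal sketch of that correlation step, not Azure's implementation. The `Span` schema, the field names, and the sub-operation names `allocate` and `boot` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """One timed sub-operation event (hypothetical schema, not Azure's)."""
    operation_id: str  # ID of the VM operation this event belongs to
    name: str          # sub-operation name, e.g. "allocate" (illustrative)
    start: float       # start time in seconds
    end: float         # end time in seconds


def build_traces(spans):
    """Correlate spans by operation ID and order each trace by start time."""
    traces = defaultdict(list)
    for span in spans:
        traces[span.operation_id].append(span)
    return {op: sorted(items, key=lambda s: s.start)
            for op, items in traces.items()}


def sub_operation_durations(trace):
    """Return {sub-operation name: total seconds} for one trace."""
    durations = defaultdict(float)
    for span in trace:
        durations[span.name] += span.end - span.start
    return dict(durations)


# Events arrive interleaved and out of order, as they might from many hosts.
spans = [
    Span("vm-1", "allocate", 0.0, 2.0),
    Span("vm-1", "boot", 2.0, 7.0),
    Span("vm-2", "boot", 1.5, 9.5),
    Span("vm-2", "allocate", 0.0, 1.5),
]
traces = build_traces(spans)
per_op = {op: sub_operation_durations(t) for op, t in traces.items()}
```

With per-operation timings in this shape, comparing a sub-operation's duration across deployments or builds is one plausible way to spot a regression in a specific step rather than in the operation as a whole.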

Key Statistics & Figures

Number of attendees
More than 275
The number of attendees who gathered at the conference.
Events processed during Global Shopping Festival
trillions
The scale of events Alibaba processed in real time during its annual 11/11 shopping event.

Technologies & Tools


Cloud Platform
Azure
Used for implementing Execution Graphs for performance regression detection.
Performance Tracing Library
Profilo
A high-throughput, mobile-first library developed at Facebook to understand app performance.
Anomaly Detection Platform
ThirdEye
LinkedIn's platform for detecting anomalies in time series metrics, tuning alerts, and investigating root causes.

Key Actionable Insights

1. Implementing Execution Graphs can significantly enhance your ability to detect performance regressions in distributed systems.
By visualizing the performance and reliability of operations, engineers can quickly identify and debug issues, leading to more efficient troubleshooting and improved system reliability.
2. Utilizing Profilo for mobile app performance tracing can provide critical insights into real-world app behavior.
This tool allows developers to gather performance data in various conditions, enabling them to optimize their applications based on actual user experiences rather than simulated environments.
3. Adopting ThirdEye for anomaly detection can streamline the monitoring process for engineering teams.
With its comprehensive features for alert tuning and root cause analysis, ThirdEye can help teams quickly respond to performance issues, ensuring a smoother user experience.
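
To illustrate what data-driven anomaly detection on a time series metric can look like, here is a minimal sketch using a trailing-window z-score. This is a generic textbook technique, not ThirdEye's algorithm or API. The function name, the `window` and `threshold` parameters, and the sample latency values are all assumptions made for the example.

```python
from statistics import mean, stdev


def detect_anomalies(values, window=5, threshold=3.0):
    """Return indices whose value deviates from the trailing-window mean
    by more than `threshold` sample standard deviations."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: treat any change at all as a deviation.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical page-load latencies in milliseconds with one spike.
latency_ms = [100, 102, 99, 101, 100, 103, 98, 250, 101, 100]
anomalous_indices = detect_anomalies(latency_ms)
```

In this sample only the spike at index 7 is flagged. Note that the spike then inflates the window's standard deviation for the next few points, which is one reason production systems need the kind of alert tuning described above.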

Common Pitfalls

1. Neglecting real-world performance conditions can lead to inaccurate assessments of app behavior.
Developers often rely on simulated environments, which may not accurately reflect user experiences. Tools like Profilo can help bridge this gap.