Performance @Scale 2016 recap


Marty Greenia

Overview

The article recaps the Performance @Scale 2016 event held at Facebook, focusing on the challenges and solutions for building fast applications that can scale to millions or billions of users. It highlights various presentations from industry leaders on topics such as system profiling, web performance, regression detection, and mobile performance.

What You'll Learn

1. How to use BPF for performance analysis on Linux systems
2. Why Time-to-Interact and Display Done are critical metrics for web performance
3. How to automate regression detection using AutoTriage at scale
4. How to leverage mobile telemetry for performance diagnostics
5. When to apply machine learning for fraud detection in advertising

Key Questions Answered

What is the role of BPF in Linux performance analysis?
BPF (Berkeley Packet Filter) is an in-kernel virtual machine that lets performance engineers run efficient profiling code directly in the Linux kernel. It enables detailed analysis of thread blocking and wakeup events, which helps trace performance bottlenecks down to the hardware level.
How does Facebook measure web performance?
Facebook measures web performance with two metrics: Time-to-Interact, the point at which the minimum usable content is available to users, and Display Done, the point at which all page content has loaded. These metrics guide optimization work aimed at improving the user experience.
What automated tools does Facebook use for regression detection?
Facebook uses AutoTriage for regression detection. It logs performance metrics and uses tools such as Stack Trace Finder and Pushed Commit Search to identify and prioritize regressions, so they can be fixed quickly without slowing developers down.
How does LinkedIn visualize mobile performance data?
LinkedIn uses Real User Monitoring (RUM) to track mobile performance, enhancing their data collection with detailed instrumentation points. This allows engineers to visualize interactions and quickly identify systemic issues, leading to improved app performance.
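
The Time-to-Interact and Display Done definitions above can be illustrated with a minimal sketch. It assumes each page component reports when it finished rendering and whether it belongs to the minimum usable content. The `ComponentTiming` record, its field names, and the sample component names are hypothetical, chosen for illustration; this is not Facebook's instrumentation.

```python
from dataclasses import dataclass


@dataclass
class ComponentTiming:
    """Render timing for one page component (hypothetical record shape)."""
    name: str
    ready_ms: float                  # ms after navigation start when it rendered
    required_for_interaction: bool   # part of the minimum usable content?


def time_to_interact(timings):
    """Time at which the last interaction-critical component became ready."""
    critical = [t.ready_ms for t in timings if t.required_for_interaction]
    if not critical:
        raise ValueError("no interaction-critical components recorded")
    return max(critical)


def display_done(timings):
    """Time at which the last component of any kind became ready."""
    if not timings:
        raise ValueError("no components recorded")
    return max(t.ready_ms for t in timings)


# Sample data, invented for illustration only.
page = [
    ComponentTiming("composer", 850.0, True),
    ComponentTiming("first_story", 1200.0, True),
    ComponentTiming("sidebar", 2400.0, False),
    ComponentTiming("chat_list", 3100.0, False),
]

print(time_to_interact(page))  # 1200.0
print(display_done(page))      # 3100.0
```

Under these assumptions, Time-to-Interact can never exceed Display Done, and the gap between the two shows how much of the page load is spent on content the user does not need in order to start interacting.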

Technologies & Tools

BPF (tool): Used for performance analysis on Linux systems.
Loom (tool): Telemetry framework for diagnosing mobile performance issues.
PerfView (tool): Used for client-side performance analysis in .NET applications.
QUIC (protocol): Network protocol influenced by performance findings in Chromium.

Key Actionable Insights

1. Add BPF-based tools to your Linux performance analysis toolkit to gain deeper insight into system behavior.
BPF allows efficient in-kernel profiling and can expose bottlenecks that traditional tools miss, which makes it well suited to high-performance applications.
2. Adopt Time-to-Interact and Display Done as key performance metrics for your web applications.
These metrics give a clearer picture of the user experience and can guide optimizations that improve perceived performance.
3. Automate regression detection in your development process, following the AutoTriage model.
Automatically identifying performance regressions lets teams keep deploying quickly while maintaining code quality.
4. Use a mobile telemetry framework such as Loom to diagnose performance issues in mobile applications.
Detailed telemetry data can uncover hidden performance problems and point to targeted optimizations that improve the user experience.
5. Incorporate machine learning to enhance fraud detection in advertising systems.
Machine learning can flag suspicious accounts more efficiently, letting human analysts focus on the most critical cases.
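
As a rough illustration of the automated regression detection mentioned in insight 3, the sketch below flags a metric as regressed when recent samples are slower than a baseline by more than both a noise margin and a relative margin. It covers only the detection step and is not how AutoTriage works internally; the function name, thresholds, and sample values are assumptions made for this example.

```python
from statistics import mean, stdev


def detect_regression(baseline, recent, min_sigma=3.0, min_relative=0.05):
    """Return True if `recent` looks meaningfully slower than `baseline`.

    baseline, recent: latency samples (e.g. ms) where higher is worse.
    min_sigma: required shift, in baseline standard deviations (assumed value).
    min_relative: required shift, as a fraction of the baseline mean (assumed value).
    """
    if len(baseline) < 2 or not recent:
        raise ValueError("need at least 2 baseline samples and 1 recent sample")
    base_mean = mean(baseline)
    delta = mean(recent) - base_mean
    exceeds_noise = delta > min_sigma * stdev(baseline)
    exceeds_relative = delta > min_relative * base_mean
    # Require both checks so that tiny shifts on very stable metrics,
    # and ordinary noise on volatile ones, are not reported.
    return exceeds_noise and exceeds_relative


# Sample data, invented for illustration only.
baseline_ms = [100, 102, 98, 101, 99]
print(detect_regression(baseline_ms, [120, 118, 122]))  # True
print(detect_regression(baseline_ms, [101, 100, 102]))  # False
```

A production system would also need to attribute a flagged regression to a specific change, which is the role the source assigns to tools such as Stack Trace Finder and Pushed Commit Search.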

Common Pitfalls

1. Failing to define clear performance metrics can lead to misaligned optimization efforts.
Without metrics like Time-to-Interact or Display Done, teams may focus on the wrong areas, resulting in suboptimal user experiences.

Related Concepts

Performance Engineering
System Profiling
Web Performance Optimization
Mobile Performance Diagnostics