Node.js in Flames

Netflix Technology Blog

Overview

The article discusses Netflix's experience with performance tuning their Node.js web application, highlighting issues with request latency and CPU usage. It details the investigation into these problems, the use of CPU flame graphs for analysis, and the eventual resolution by addressing the misuse of the Express.js API.

What You'll Learn

1. How to use CPU flame graphs for performance analysis in Node.js applications
2. Why understanding your dependencies is crucial before production deployment
3. How to identify and resolve performance bottlenecks in Express.js applications
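
To make the first item more concrete, here is a minimal sketch of the aggregation step behind a flame graph: collapsing sampled call stacks into per-stack counts, the "folded" text form that flame graph renderers commonly take as input. The sample stacks and frame names are invented for illustration and are not taken from Netflix's profiles.

```javascript
// Collapse sampled call stacks into "folded" form: "root;child;leaf" -> count.
// Each sample is an array of frame names ordered from root to leaf.
function foldStacks(samples) {
  const counts = new Map();
  for (const frames of samples) {
    const key = frames.join(';');
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

// Fraction of samples in which a frame appears anywhere on the stack.
// In a flame graph this corresponds to the relative width of that frame.
function frameShare(samples, frameName) {
  if (samples.length === 0) return 0;
  const hits = samples.filter((frames) => frames.includes(frameName)).length;
  return hits / samples.length;
}

// Hypothetical samples: three land in a route lookup, one in rendering.
const samples = [
  ['main', 'handleRequest', 'routerLookup'],
  ['main', 'handleRequest', 'routerLookup'],
  ['main', 'handleRequest', 'routerLookup'],
  ['main', 'handleRequest', 'renderPage'],
];

const folded = foldStacks(samples);
for (const [stack, count] of folded) {
  console.log(`${stack} ${count}`);
}
console.log(frameShare(samples, 'routerLookup')); // 0.75
```

A frame that shows up in most samples, like `routerLookup` here, is the wide plateau you would look for first when reading the rendered graph.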

Prerequisites & Requirements

  • Familiarity with Node.js and Express.js
  • Experience with performance analysis tools like CPU flame graphs (optional)

Key Questions Answered

What were the causes of increasing request latencies in the Node.js application?
Express.js stores route handlers in a single global array and scans it for each request, so lookups are O(n) in the number of handlers. On top of that, duplicate route handlers were inadvertently being added to the array, so each request had to traverse more handlers as time went on.
How did Netflix resolve the performance issues in their Node.js application?
Netflix resolved the performance issues by identifying the root cause through CPU flame graphs, which revealed that duplicate route handlers were being added to the Express.js handler array. After fixing the code to prevent duplicate handlers, the latency and CPU usage stabilized.
What is the significance of using flame graphs for performance tuning?
Flame graphs provide a visual representation of CPU usage, allowing developers to identify which functions consume the most time. This insight is crucial for diagnosing performance bottlenecks and optimizing application performance effectively.
What assumptions did Netflix make about Express.js that led to performance issues?
Netflix assumed that Express.js would handle route lookups efficiently, without examining its underlying implementation. In practice, Express.js keeps route handlers in a global array and accepts duplicates, so application code that kept re-adding the same handler degraded performance over time.
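
The mechanism described in these answers can be sketched with a simplified model. This is not Express.js source code: `handlers`, `addRoute`, `refreshRoutes`, and `lookup` are hypothetical names standing in for a router that keeps its handlers in one array and scans it linearly, and for application code that re-adds the same route on every refresh.

```javascript
// Simplified model of a router backed by a single array.
// NOT Express.js code; every name here is hypothetical.
const handlers = [];

function addRoute(path, fn) {
  // Nothing prevents the same path from being registered again.
  handlers.push({ path, fn });
}

// Hypothetical periodic refresh: it replaces the '/api' handler correctly,
// but also appends another '/static' handler each time it runs (the bug).
function refreshRoutes() {
  const index = handlers.findIndex((entry) => entry.path === '/api');
  if (index !== -1) handlers.splice(index, 1);
  addRoute('/static', () => 'static');
  addRoute('/api', () => 'api');
}

// Linear scan: returns the first match and how many entries were inspected.
function lookup(path) {
  let inspected = 0;
  for (const entry of handlers) {
    inspected += 1;
    if (entry.path === path) return { entry, inspected };
  }
  return { entry: null, inspected };
}

refreshRoutes();
console.log(lookup('/api').inspected); // 2

for (let i = 0; i < 9; i += 1) refreshRoutes();
console.log(lookup('/api').inspected); // 11
console.log(handlers.length); // 11
```

In this model the cost of reaching `/api` grows by one entry per refresh, which mirrors the slow, steady latency increase described above.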

Key Statistics & Figures

Initial request latency
1 ms
Request latencies started at 1 ms but increased by 10 ms every hour.
Increased latency peak
60 ms
Request latencies peaked at around 60 ms before instances were rebooted.
Heap size during testing
1.2 GB
The process's heap size remained fairly constant at around 1.2 GB during performance testing.
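
Taken together, the first two figures give a rough idea of how long an instance would run before reaching the peak. The sketch below assumes the growth was linear, which the figures suggest but do not establish.

```javascript
// Figures quoted above.
const initialLatencyMs = 1;
const growthPerHourMs = 10;
const peakLatencyMs = 60;

// Assumption: linear growth, latency(t) = initial + growth * t.
function latencyAfterHours(hours) {
  return initialLatencyMs + growthPerHourMs * hours;
}

// Hours needed to climb from the initial latency to the observed peak.
const hoursToPeak = (peakLatencyMs - initialLatencyMs) / growthPerHourMs;

console.log(hoursToPeak); // 5.9
console.log(latencyAfterHours(6)); // 61
```

Under that assumption, an instance would reach the observed peak after roughly six hours of uptime.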

Key Actionable Insights

1. Use CPU flame graphs to visualize performance issues in your Node.js applications.
   Flame graphs help you pinpoint where your application spends most of its CPU time, making it easier to identify bottlenecks and optimize performance.
2. Regularly review and understand the dependencies in your application stack.
   Understanding the libraries and frameworks you use helps you avoid pitfalls that arise from incorrect assumptions about their behavior, leading to better performance and stability.
3. Avoid registering the same route handler more than once in Express.js applications.
   Because handlers are kept in an array that is scanned linearly, every duplicate adds work to each request. A constant-time structure, such as a map keyed by route, would make lookup cost independent of the number of handlers.
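
One way to read the third insight: if registrations are keyed by method and path, re-registering a route replaces the earlier entry instead of appending another, and lookup cost no longer depends on how many routes exist. The sketch below is a standalone illustration with hypothetical names (`RouteRegistry`, `register`, `find`). It is not an Express.js API, and it handles only exact path matches, not patterns or middleware chains.

```javascript
// Hypothetical registry keyed by "METHOD path". Not part of Express.js.
class RouteRegistry {
  constructor() {
    this.routes = new Map();
  }

  static key(method, path) {
    return `${method.toUpperCase()} ${path}`;
  }

  // Registering an existing method and path overwrites the previous handler,
  // so repeated registration cannot grow the structure.
  register(method, path, handler) {
    this.routes.set(RouteRegistry.key(method, path), handler);
  }

  // Average constant-time lookup for an exact method and path.
  find(method, path) {
    return this.routes.get(RouteRegistry.key(method, path)) || null;
  }

  get size() {
    return this.routes.size;
  }
}

const registry = new RouteRegistry();

// Simulate a buggy refresh loop that registers the same route 100 times.
for (let i = 0; i < 100; i += 1) {
  registry.register('get', '/static', () => `static v${i}`);
}
registry.register('GET', '/api', () => 'api');

console.log(registry.size); // 2
console.log(registry.find('GET', '/static')()); // "static v99"
```

The trade-off is that a plain map cannot express ordered middleware or pattern routes, which is part of why an ordered list is a natural fit for a general-purpose router.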

Common Pitfalls

1. Assuming that the Express.js API will handle route lookups efficiently without understanding its implementation.
   This can lead to performance issues, as in Netflix's case: Express.js keeps route handlers in a global array, so lookups are O(n), and latencies rose as the array grew.
2. Not monitoring the growth of route handler arrays in Express.js applications.
   If you do not track the number of handlers, duplicates can accumulate unnoticed and slow down request processing over time.
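
A minimal sketch of the kind of check the second pitfall calls for: counting how many registered entries share a path, so that unexpected growth can be logged or alerted on. The `{ path }` record shape is an assumption made for illustration. How to enumerate the handlers registered in a real Express.js application depends on the Express version and is not shown here.

```javascript
// Count registered entries per path.
// Assumes each entry is a plain object with a `path` string (hypothetical shape).
function countByPath(entries) {
  const counts = new Map();
  for (const { path } of entries) {
    counts.set(path, (counts.get(path) || 0) + 1);
  }
  return counts;
}

// Return only the paths that are registered more than once.
function findDuplicates(entries) {
  return [...countByPath(entries)]
    .filter(([, count]) => count > 1)
    .map(([path, count]) => ({ path, count }));
}

// Hypothetical snapshot of registered handlers.
const registered = [
  { path: '/static' },
  { path: '/api' },
  { path: '/static' },
  { path: '/static' },
];

const duplicates = findDuplicates(registered);
console.log(duplicates); // [ { path: '/static', count: 3 } ]
```

Running a check like this on a timer and reporting the total handler count as a metric would surface steady growth long before it shows up as latency.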

Related Concepts

Performance Tuning In Node.js
Express.js Routing Mechanisms
Using Profiling Tools For Application Optimization