Analyzing Time Series for Pinterest Observability

Pinterest Engineering
10 min read · intermediate

Overview

The article discusses the importance of time series data in observability at Pinterest, detailing the development of TScript, a domain-specific language designed to manipulate time series data efficiently. It covers the background of time series solutions at Pinterest, the design goals of TScript, its features, and practical usage examples.

What You'll Learn

1. How to use TScript for time series data manipulation
2. Why TScript is beneficial for alerting in observability
3. How to implement anomaly detection using TScript and Prophet

Prerequisites & Requirements

  • Understanding of time series data concepts
  • Familiarity with the pandas and pyparsing libraries (optional)

Key Questions Answered

What is TScript and how is it used at Pinterest?
TScript is a domain-specific language developed at Pinterest for manipulating time series data. It allows engineers to perform operations on time series data in a readable and efficient manner, supporting over 30,000 expressions across various dashboards and alerts.
How does TScript improve the readability of time series operations?
TScript enhances readability by allowing multi-line input and object-oriented syntax, making it easier for users to understand complex operations compared to traditional nested function approaches.
What are the main features of TScript?
Key features of TScript include variables as input, multi-line support, object-oriented operations, assignments, filtering capabilities, and built-in alerting mechanisms, all designed to facilitate efficient time series data manipulation.
What challenges did Pinterest face when implementing TScript?
Pinterest faced challenges in converting time series data from the database into a format suitable for TScript operations. Initially, creating individual DataFrames for each series was inefficient, leading to performance issues, which were resolved by preallocating memory for the entire metric DataFrame.
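
The preallocation idea in the last answer can be sketched in pandas as follows. This is a minimal illustration, not Pinterest's actual code: the function name `build_metric_frame`, the input shape (a dict of series name to `(timestamp, value)` pairs), and the use of NumPy for the preallocated block are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def build_metric_frame(series_by_name):
    """Build one DataFrame for a metric from unaligned series.

    series_by_name is a hypothetical input shape: a dict mapping a series
    name to a list of (timestamp, value) pairs.
    """
    # The union of all timestamps becomes the shared, sorted index.
    all_timestamps = {ts for points in series_by_name.values() for ts, _ in points}
    index = pd.Index(sorted(all_timestamps))
    columns = list(series_by_name)

    # Preallocate a single NaN-filled block for the whole metric instead of
    # building one DataFrame per series and joining them afterwards.
    data = np.full((len(index), len(columns)), np.nan)

    for col, name in enumerate(columns):
        points = series_by_name[name]
        if not points:
            continue
        timestamps, values = zip(*points)
        # Map each timestamp to its row position in the shared index.
        rows = index.get_indexer(list(timestamps))
        data[rows, col] = values

    return pd.DataFrame(data, index=index, columns=columns)


frame = build_metric_frame({
    "host_a": [(1, 1.0), (3, 3.0)],
    "host_b": [(2, 20.0)],
})
```

Timestamps that a series does not report stay as NaN, so unaligned series share one index without any per-series DataFrame being created.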

Key Statistics & Figures

  • Number of alerts powered by time series data: 60,000. This highlights the scale at which Pinterest relies on time series for observability.
  • Number of dashboards utilizing time series data: 5,000. Indicates the extensive use of dashboards for monitoring various metrics at Pinterest.
  • Expressions used in TScript: 30,000. Demonstrates the widespread adoption of TScript across the organization.

Technologies & Tools

  • Pandas (library): Used to structure and handle time series data within TScript.
  • Pyparsing (library): Used to parse the TScript language.
  • Prophet (library): Integrated with TScript for anomaly detection in time series data.

Key Actionable Insights

1. Utilize TScript to streamline your time series data operations, making them more readable and maintainable.
   By adopting TScript, engineers can reduce complexity in their queries, making it easier to perform operations like filtering and aggregating time series data without getting bogged down by syntax.
2. Leverage the built-in alerting features of TScript to integrate alerts directly into your dashboards.
   This integration allows for a more cohesive monitoring experience, where alerts are visually represented alongside the relevant metrics, reducing the need for separate alert management systems.
3. Implement anomaly detection in your time series analysis using TScript in conjunction with Prophet.
   This combination allows for sophisticated forecasting and anomaly detection, enabling teams to proactively address potential issues before they impact system performance.
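
The summary does not say how TScript calls Prophet, so the following is only a sketch of the general technique: fit a forecast, then flag observations that fall outside the forecast's uncertainty interval. The helper names `forecast_with_prophet` and `flag_anomalies`, the 95% interval width, and the per-minute frequency are assumptions for illustration. The column names `ds`, `y`, `yhat`, `yhat_lower`, and `yhat_upper` are the ones Prophet itself uses.

```python
import pandas as pd


def forecast_with_prophet(history: pd.DataFrame, periods: int = 0) -> pd.DataFrame:
    """Fit Prophet on a frame with columns ds (timestamp) and y (value)."""
    # Imported lazily so the rest of this sketch works without Prophet installed.
    from prophet import Prophet

    model = Prophet(interval_width=0.95)  # assumed interval width
    model.fit(history)
    # With the default include_history=True, the returned frame also covers
    # the historical timestamps, which is what anomaly flagging needs.
    future = model.make_future_dataframe(periods=periods, freq="min")
    return model.predict(future)


def flag_anomalies(observed: pd.DataFrame, forecast: pd.DataFrame) -> pd.DataFrame:
    """Mark observations that fall outside the forecast interval."""
    bands = forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]]
    merged = observed.merge(bands, on="ds", how="inner")
    merged["anomaly"] = (merged["y"] < merged["yhat_lower"]) | (
        merged["y"] > merged["yhat_upper"]
    )
    return merged
```

A typical call would be `flag_anomalies(history, forecast_with_prophet(history))`, with an alert firing when the `anomaly` column contains any `True` values in the window of interest.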

Common Pitfalls

1. Failing to preallocate memory for DataFrames can lead to inefficient processing.
   When dealing with unaligned data, creating individual DataFrames for each series can result in significant performance overhead. Preallocating memory with NaN values can streamline this process.
2. Overcomplicating time series queries with nested functions.
   Using a function-based approach can make it difficult to read and maintain queries, leading to errors and inefficiencies. TScript's object-oriented design helps mitigate this issue.
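
The readability difference between nested functions and a multi-line, object-oriented style can be illustrated with plain pandas. This is an analogy only, not TScript syntax, which the summary does not show; the helper functions `rolling_mean` and `above` and the sample values are hypothetical.

```python
import pandas as pd

series = pd.Series([1.0, 5.0, 3.0, 8.0, 2.0])


# Function-based style: the operations read inside-out.
def rolling_mean(s, window):
    return s.rolling(window).mean()


def above(s, threshold):
    return s[s > threshold]


nested = above(rolling_mean(series, 2), 3.5)

# Multi-line, method-based style: each step is named and reads top to bottom.
smoothed = series.rolling(2).mean()
chained = smoothed[smoothed > 3.5]
```

Both forms produce the same result, but the second exposes an intermediate name (`smoothed`) that can be inspected or reused, which is the kind of benefit the summary attributes to TScript's assignments and multi-line input.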

Related Concepts

Time Series Analysis
Dataframe Manipulation
Anomaly Detection Techniques
Alerting Strategies In Observability