Achieving Insights and Savings with Cost Data

The path to cloud efficiency begins with a cost data foundation

Anna Matlin
11 min read, intermediate level

Overview

The article discusses how Airbnb developed a robust cost data foundation to enhance infrastructure efficiency and achieve significant cost savings. It details the architectural design of their cost data pipeline, the metrics used for monitoring costs, and the cultural shift towards cost efficiency within the organization.

What You'll Learn

1. How to build a cost data pipeline using AWS Cost & Usage Report
2. Why defining meaningful metrics is crucial for cost efficiency
3. When to apply custom logic for discounting and amortization in cost data

Prerequisites & Requirements

  • Understanding of cloud cost management and AWS services
  • Familiarity with Apache Airflow and data warehousing concepts (optional)

Key Questions Answered

How did Airbnb build its cost data foundation?
Airbnb built its cost data foundation by developing a pipeline on top of the AWS Cost & Usage Report (CUR), which transformed raw data into actionable insights. This pipeline included steps for ingesting data, applying discounts, amortizing costs, and enriching data for analytics, allowing teams to make informed decisions about cost efficiency.
What metrics does Airbnb use to monitor cost efficiency?
Airbnb primarily uses metrics such as total cost, cost per booking, and product-specific usage metrics like vCPU-Hours and GB/Month for S3 storage. These metrics help in understanding the financial impact of AWS costs and monitoring resource usage trends.
What are some tips for designing a successful cost data pipeline?
Key tips include designing with downstream use cases in mind, building for retroactive adjustments, and studying options for obtaining raw data. This ensures that the pipeline is robust, accurate, and meets the needs of various stakeholders in the organization.
What common pitfalls should be avoided in cost data management?
Common pitfalls include neglecting to regularly review cost data and failing to convert findings into financial terms. Regular reviews help catch anomalies early, while translating data into dollar impacts ensures that stakeholders understand the urgency of cost management.
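
The pipeline stages named in the first answer above (ingest, discount, amortize, enrich) can be sketched in a few lines of Python. This is a minimal illustration, not Airbnb's implementation: the `LineItem` fields, the flat `DISCOUNT_RATE`, the `TEAM_BY_TAG` mapping, and the even daily amortization are all assumptions made for the example, and real CUR data has many more columns.

```python
from dataclasses import dataclass, replace
from datetime import date, timedelta

# Hypothetical values for illustration only.
DISCOUNT_RATE = 0.10
TEAM_BY_TAG = {"search-api": "Search", "payments-db": "Payments"}


@dataclass(frozen=True)
class LineItem:
    usage_date: date
    product: str
    owner_tag: str  # hypothetical ownership tag
    cost: float


def apply_discount(item, rate=DISCOUNT_RATE):
    """Scale a list-price cost down by a flat discount rate."""
    return replace(item, cost=item.cost * (1 - rate))


def amortize(fee, start, days, product, owner_tag):
    """Spread an upfront fee evenly across the days it covers."""
    daily = fee / days
    return [
        LineItem(start + timedelta(days=i), product, owner_tag, daily)
        for i in range(days)
    ]


def enrich(item):
    """Attach an owning team so the row is usable for analytics."""
    return {
        "date": item.usage_date.isoformat(),
        "product": item.product,
        "team": TEAM_BY_TAG.get(item.owner_tag, "unattributed"),
        "cost": round(item.cost, 2),
    }


def run_pipeline(usage_items, upfront_fees=()):
    """Discount usage rows, append amortized fee rows, then enrich all rows."""
    discounted = [apply_discount(item) for item in usage_items]
    amortized = [row for fee in upfront_fees for row in amortize(*fee)]
    return [enrich(item) for item in discounted + amortized]


rows = run_pipeline(
    [LineItem(date(2021, 1, 1), "AmazonEC2", "search-api", 100.0)],
    upfront_fees=[(30.0, date(2021, 1, 1), 3, "AmazonEC2", "payments-db")],
)
for row in rows:
    print(row)
```

Keeping each stage a pure function of its inputs is one way to support the retroactive adjustments mentioned in the tips: if a rate or mapping changes, the same code can be rerun over past dates.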

Key Statistics & Figures

Most viewed dashboard at Airbnb
AWS costs dashboard
As of early 2021, this dashboard was the most accessed by teams at Airbnb.
Reduction in dimensions for usability
Reduced from ~200 to ~30
This simplification made downstream tables more user-friendly and actionable.
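
Reducing dimensions usually means both dropping columns and re-aggregating, since rows that differed only in a dropped column collapse into one. The sketch below shows that idea under stated assumptions: the `KEEP` tuple and the column names are hypothetical stand-ins, not the ~30 dimensions Airbnb actually kept.

```python
from collections import defaultdict

# Hypothetical curated dimensions; the real list depends on downstream needs.
KEEP = ("usage_date", "product", "team")


def rollup(rows, keep=KEEP, measure="cost"):
    """Project rows onto the kept dimensions and sum the measure per group."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row.get(dim, "unknown") for dim in keep)
        totals[key] += row[measure]
    return [dict(zip(keep, key), **{measure: total}) for key, total in totals.items()]


wide_rows = [
    {"usage_date": "2021-01-01", "product": "EC2", "team": "Search",
     "instance_id": "i-1", "cost": 1.5},
    {"usage_date": "2021-01-01", "product": "EC2", "team": "Search",
     "instance_id": "i-2", "cost": 2.5},
    {"usage_date": "2021-01-01", "product": "S3", "team": "Search",
     "bucket": "b-1", "cost": 1.0},
]
narrow_rows = rollup(wide_rows)
```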

Technologies & Tools


AWS Cost & Usage Report (Backend)
Used as the foundational data source for building the cost data pipeline.
Apache Airflow (Tools)
Utilized for orchestrating the daily pipeline that processes cost data.
Apache Hive (Tools)
Part of the data warehousing tools used for analytics.
Apache Spark (Tools)
Employed for processing large datasets within the data pipeline.
Apache Druid (Tools)
Used for real-time analytics and data visualization.
Apache Superset (Tools)
Used for creating visualizations of cost data.

Key Actionable Insights

1. Establish a clear process for ingesting and transforming cost data to ensure accuracy and usability.
   A well-defined pipeline allows teams to quickly access and analyze cost data, leading to better decision-making and cost management.
2. Regularly review cost metrics and trends to identify anomalies and opportunities for savings.
   Frequent monitoring helps teams react swiftly to unexpected cost spikes and adjust strategies accordingly.
3. Involve stakeholders early in the dashboard design process to ensure the final product meets their needs.
   Collaborating with stakeholders leads to more relevant insights and fosters a culture of cost awareness across the organization.
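
To make the second insight concrete, here is a small sketch of a normalized metric, cost per booking, and its period-over-period change. The function names and sample figures are hypothetical and chosen only for illustration.

```python
def cost_per_booking(total_cost, bookings):
    """Normalize total cost by business volume."""
    if bookings <= 0:
        raise ValueError("bookings must be positive")
    return total_cost / bookings


def pct_change(previous, current):
    """Percentage change from the previous period to the current one."""
    if previous == 0:
        raise ValueError("previous value must be non-zero")
    return (current - previous) / previous * 100


# Made-up figures: cost grew while bookings stayed flat.
last_week = cost_per_booking(1000.0, 250)
this_week = cost_per_booking(1250.0, 250)
change = pct_change(last_week, this_week)
```

Normalizing by bookings helps separate cost growth that tracks business growth from cost growth that signals lost efficiency.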

Common Pitfalls

1. Failing to regularly review cost data can lead to missed opportunities for savings.
   Without consistent monitoring, teams may overlook spikes in costs or trends that indicate inefficiencies, making it harder to implement timely corrective actions.
2. Neglecting to translate cost findings into financial terms can reduce stakeholder engagement.
   When cost data is presented without financial implications, it may not resonate with decision-makers, leading to inaction on critical cost management issues.
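
Both pitfalls can be addressed with a very simple check, sketched below: flag days whose cost exceeds a multiple of the trailing average, then express the excess as an annualized dollar figure. The 7-day window, the 1.5x threshold, and the straight-line annualization are assumptions for the example, not values taken from the article.

```python
from statistics import mean


def find_spikes(daily_costs, window=7, threshold=1.5):
    """Return (index, cost, baseline) for days above threshold x trailing mean."""
    spikes = []
    for i in range(window, len(daily_costs)):
        baseline = mean(daily_costs[i - window:i])
        if baseline > 0 and daily_costs[i] > threshold * baseline:
            spikes.append((i, daily_costs[i], baseline))
    return spikes


def annualized_impact(cost, baseline, days=365):
    """Dollar impact if the excess over baseline persisted for a year."""
    return (cost - baseline) * days


# Made-up daily costs in dollars: steady at 100, then a jump to 180.
costs = [100.0] * 8 + [180.0]
spikes = find_spikes(costs)
impacts = [annualized_impact(cost, baseline) for _, cost, baseline in spikes]
```

Reporting the annualized figure alongside the raw spike gives decision-makers a number they can weigh against other priorities.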

Related Concepts

Cloud Cost Management
AWS Services
Data Pipeline Architecture
Cost Efficiency Strategies