Automating Data Protection at Scale, Part 1

Part one of a series on how we provide powerful, automated, and scalable data privacy and security engineering capabilities at Airbnb.

Elizabeth Nammour
13 min read, intermediate

Overview

This article discusses the development of Airbnb's Data Protection Platform (DPP), which automates data privacy and security measures at scale. It outlines the challenges faced in data protection, the architecture of the DPP, and the role of the Madoka metadata system in managing data security and privacy.

What You'll Learn

1. How to build a centralized inventory system for data assets
2. Why automated data classification is crucial for compliance
3. How to implement a data encryption service for sensitive information

Prerequisites & Requirements

  • Understanding of data privacy laws like GDPR and CCPA
  • Familiarity with AWS services and Terraform (optional)

Key Questions Answered

What is the purpose of the Data Protection Platform at Airbnb?
The Data Protection Platform (DPP) at Airbnb was created to automate data protection processes, ensuring compliance with global regulations and security requirements. It addresses challenges in tracking user and sensitive data flows across various data stores, enabling effective monitoring and protection.

How does Madoka contribute to data protection at Airbnb?
Madoka is a metadata system that collects and manages security and privacy-related metadata for all data assets at Airbnb. It provides a centralized repository for tracking data ownership, classification, and compliance, which is essential for automating data protection actions.

What are the key components of the Data Protection Platform?
The DPP includes several components: Inspekt for data classification, Angmar for secret detection, Cipher for data encryption, Obliviate for privacy compliance requests, and Madoka for metadata management. Together, these services enable comprehensive data protection.

What challenges does Airbnb face in data protection?
Airbnb faces challenges in monitoring and protecting data because of the vast amount of data collected across multiple stores and infrastructures. Manual tracking is insufficient, and existing tools do not meet all requirements for data discovery and automated protection.
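
To make the idea of a centralized metadata repository more concrete, here is a minimal in-memory sketch in Python. It is not Madoka's actual schema or API: the `DataAsset` and `Inventory` names, their fields, and the example asset names are all hypothetical, chosen only to show how ownership and classification metadata might be recorded and queried in one place.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class DataAsset:
    """One entry in a hypothetical centralized inventory (not Madoka's schema)."""
    name: str                     # fully qualified asset name (assumed format)
    store: str                    # kind of data store, e.g. "mysql" or "s3"
    owner: Optional[str] = None   # owning team, or None if unknown
    classifications: Set[str] = field(default_factory=set)


class Inventory:
    """Keeps one record per asset, keyed by name."""

    def __init__(self) -> None:
        self._assets: Dict[str, DataAsset] = {}

    def upsert(self, asset: DataAsset) -> None:
        # Insert a new asset or overwrite the existing record with the same name.
        self._assets[asset.name] = asset

    def unowned(self) -> List[str]:
        # Assets with no recorded owner are the ones nobody is accountable for.
        return sorted(a.name for a in self._assets.values() if a.owner is None)

    def with_classification(self, label: str) -> List[str]:
        # All assets tagged with a given sensitivity label.
        return sorted(
            a.name for a in self._assets.values() if label in a.classifications
        )
```

With records like these in one place, questions such as "which assets hold email addresses?" or "which assets have no owner?" become simple lookups rather than manual audits.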

Key Actionable Insights

1. Implement a centralized inventory system to track data assets effectively.
   This system is crucial for understanding the security and privacy risks associated with data assets, enabling better compliance with regulations like GDPR and CCPA.
2. Automate data classification to ensure continuous monitoring of sensitive data.
   Using tools like Inspekt can help maintain accurate classifications, reducing the risk of non-compliance and data breaches.
3. Utilize encryption services to protect sensitive information from unauthorized access.
   Implementing a robust encryption strategy is essential for safeguarding data, especially in the event of a security breach.
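
As an illustration of the automated classification insight above, the following Python sketch labels a column by sampling its values and matching them against regular expressions. This is an assumed, simplified approach and not a description of how Inspekt works: the `PATTERNS` table, the `classify_column` function, the label names, and the 0.8 threshold are all hypothetical, and the patterns are deliberately loose.

```python
import re

# Hypothetical detectors; real classifiers need far more robust rules.
PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "us_phone": re.compile(r"^\+?1?[-. ]?\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}$"),
}


def classify_column(samples, threshold=0.8):
    """Return the labels whose pattern matches at least `threshold`
    of the non-empty sampled values."""
    values = [s.strip() for s in samples if s and s.strip()]
    if not values:
        return set()
    labels = set()
    for label, pattern in PATTERNS.items():
        hits = sum(1 for v in values if pattern.match(v))
        if hits / len(values) >= threshold:
            labels.add(label)
    return labels
```

Running a check like this on a schedule, rather than once, is what turns classification into continuous monitoring: a column that starts receiving email addresses next month gets picked up by the next scan.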

Common Pitfalls

1. Relying solely on manual data classifications can lead to inaccuracies.
   Data owners may forget to update classifications when data changes, resulting in compliance risks. Automating this process can mitigate such issues.
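
One way to catch this pitfall is to compare what owners have declared against what an automated scan detects. The Python sketch below is a hypothetical illustration, not a description of Airbnb's tooling: the `classification_drift` function and the shape of its inputs (dictionaries mapping asset names to sets of labels) are assumptions made for the example.

```python
def classification_drift(declared, detected):
    """Compare manually declared labels with automatically detected ones.

    Both arguments map an asset name to a set of labels. Returns only the
    assets where the two disagree.
    """
    drift = {}
    for asset in sorted(set(declared) | set(detected)):
        manual = declared.get(asset, set())
        auto = detected.get(asset, set())
        missing = auto - manual   # detected by the scan but never declared
        stale = manual - auto     # declared but no longer observed
        if missing or stale:
            drift[asset] = {"missing": missing, "stale": stale}
    return drift
```

Entries under `missing` are the riskier case, since they point to sensitive data that no one has labeled; entries under `stale` may simply mean the scan sampled too few rows, so they are better treated as prompts for review than as automatic corrections.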

Related Concepts

Data Privacy Laws
Data Classification Techniques
Data Encryption Methods