How Meta Uses SQL

52 engineering articles about SQL from Meta's engineering team

Articles

Meta
Intermediate
The article discusses how Meta is scaling its Privacy Aware Infrastructure (PAI) to address privacy challenges in the era of Generative AI (GenAI) product innovation.
Rituraj Kirti
11 min read
Has Summary
--
Meta
Advanced
The article discusses Meta's implementation of Policy Zones within its Privacy Aware Infrastructure (PAI) to enforce purpose limitations on data in large-scale batch processing systems.
Lucas Waye
24 min read
Has Summary
--
Meta
Intermediate
The article discusses how Meta implements data lineage as part of its Privacy Aware Infrastructure (PAI) initiative to enhance user privacy through scalable data flow discovery.
Rishab Mangla
16 min read
Has Summary
--
Meta
Advanced
The article discusses Strobelight, Meta's profiling orchestrator that integrates multiple open-source technologies to enhance efficiency and resource utilization across its server fleet.
Jordan Rome
15 min read
Includes Code
Has Summary
--
Meta
Advanced
The article discusses Glean, Meta's open-source code indexing system designed to efficiently collect and manage information about source code.
--
Meta
Intermediate
The article discusses Meta's Privacy Aware Infrastructure (PAI) initiative, which integrates advanced privacy constructs into its software systems to enforce purpose limitation effectively.
Wenlong Dong
17 min read
Has Summary
--
Meta
Intermediate
The article discusses the implementation of serverless Jupyter notebooks at Meta, leveraging the Bento platform and Pyodide to enable in-browser code execution for lightweight workloads.
Steve Dini
6 min read
Has Summary
--
Meta
Intermediate
The article discusses Meta's transition to a composable data management architecture, emphasizing interoperability, reusability, and engineering efficiency.
Pedro Pedreira
11 min read
Has Summary
--
Meta
Intermediate
The article discusses the implementation of scheduled Jupyter notebooks at Meta, focusing on the integration of Bento with the Dataswarm batch ETL pipeline framework.
Steve Dini
6 min read
Has Summary
--
Meta
Advanced
The article discusses the implementation and deployment of MySQL Raft at Meta, focusing on how it aims to replace semisynchronous replication with a more reliable and simpler distributed system.
Anirban Rahut
21 min read
Has Summary
--
Meta
Intermediate
Meta has introduced Velox, an open-source unified execution engine designed to enhance data management systems and streamline their development.
Pedro Pedreira
10 min read
Has Summary
--
Meta
Advanced
The article discusses the development and implementation of UPM (Unified Programming Model) at Meta, which enables static analysis of SQL queries.
Daniel Ohayon
7 min read
Includes Code
Has Summary
--
Meta
Intermediate
The article discusses SQL Notebooks, a tool developed at Meta that combines the functionalities of SQL IDEs and Jupyter Notebooks to enhance data analytics.
Guilherme Kunigami
8 min read
Includes Code
Has Summary
--
Meta
Intermediate
The article discusses Facebook's approach to autonomous testing of back-end services at scale, highlighting the challenges of maintaining a stable infrastructure for over 3 billion users.
Paul Marinescu
15 min read
Has Summary
--
Meta
Intermediate
The article discusses Mariana Trench (MT), a tool developed by Facebook for identifying security and privacy vulnerabilities in Android and Java applications.
Dominik Gabi
8 min read
Includes Code
Has Summary
--
Meta
Intermediate
CG/SQL is a code generation system designed for SQLite that enables developers to write stored procedures in a variant of Transact-SQL (T-SQL) and compile them into C code.
Rico Mariani
4 min read
Includes Code
Has Summary
--
Meta
Intermediate
Pysa is an open-source static analysis tool developed by Facebook to detect and prevent security issues in Python code.
Graham Bleaney
12 min read
Includes Code
Has Summary
--
Meta
Advanced
The 2019 @Scale Conference brought together over 1,300 engineers to discuss challenges and innovations in building scalable applications and services.
--
Meta
Advanced
The article provides a recap of the Systems @Scale 2019 conference, highlighting discussions on engineering challenges faced by operating systems serving millions of users.
9 min read
Has Summary
--
Meta
Advanced
The article discusses Aria, a set of initiatives aimed at enhancing PrestoDB efficiency, particularly focusing on optimizing table scans for Hive queries on data stored in ORC format.
Maria Basmanova
6 min read
Includes Code
Has Summary
--
Meta
Advanced
Delos is a new architecture for building replicated storage systems at Facebook, designed to provide flexibility and simplicity without compromising performance or reliability.
Mahesh Balakrishnan
8 min read
Has Summary
--
Meta
Intermediate
The article discusses the implementation of the HyperLogLog (HLL) algorithm in Presto, a distributed SQL query engine, to improve the efficiency of cardinality estimation in large datasets.
Mehrdad Honarkhah
16 min read
Includes Code
Has Summary
--
Meta
Advanced
The 2018 @Scale Conference brought together over 2,500 engineers to explore the challenges and innovations in building scalable applications and services.
--
Meta
Intermediate
The article discusses Spiral, a self-tuning system developed by Facebook that utilizes real-time machine learning to optimize high-performance infrastructure services.
Vladimir Bychkovsky
10 min read
Has Summary
--
Meta
Advanced
The Data @Scale 2017 conference brought together 350 engineers to discuss the challenges and innovations in large-scale storage systems and analytics.
Parixit Pol
4 min read
Has Summary
--
Meta
Intermediate
The article discusses the rebuilding of OnlineSchemaChange, a tool initially developed in PHP for performing MySQL schema changes with minimal downtime, into a more flexible and user-friendly versi...
4 min read
Has Summary
--
Meta
Advanced
Faiss is a library developed by Facebook for efficient similarity search in large-scale multimedia datasets.
Hervé Jegou
14 min read
Includes Code
Has Summary
--
Meta
Intermediate
The article discusses how Facebook utilized Apache Spark for large-scale language model training, highlighting the transition from a Hive-based solution to a Spark-based pipeline.
Tejas Patil
13 min read
Includes Code
Has Summary
--
Meta
Advanced
This article provides a comprehensive comparison of two state-of-the-art graph processing systems, Apache Giraph and GraphX, focusing on their performance, scalability, and usability for large-scal...
Maja Kabiljo
20 min read
Includes Code
Has Summary
--
Meta
Advanced
MyRocks is an open-source MySQL storage engine developed by Facebook, integrating RocksDB to optimize space and write efficiency.
Yoshinori Matsunobu
11 min read
Has Summary
--
Meta
Advanced
The article discusses Facebook's experience in scaling Apache Spark to handle a 60 TB+ production use case, focusing on the migration from a Hive-based pipeline to a more efficient Spark implementa...
Avery Ching
15 min read
Includes Code
Has Summary
--
Meta
Advanced
The article recaps the Data @Scale conference held in June 2016, focusing on large-scale storage systems and analytics.
Surendra Verma
7 min read
Has Summary
--
Meta
Intermediate
The article discusses Facebook's Safety Check feature, which allows users to notify friends and family of their safety during crises.
Peter Cottle
10 min read
Has Summary
--
Meta
Advanced
The article discusses the evolution and community building around Presto, Facebook's distributed SQL engine designed for fast interactive analytics.
Jay Tang
4 min read
Has Summary
--
Meta
Intermediate
GraphQL is a powerful data query language developed by Facebook to simplify data-fetching for mobile applications.
Lee Byron
6 min read
Has Summary
--
Meta
Advanced
The article discusses the Data @Scale 2015 event, where engineers from various leading technology companies gathered to address the challenges of scaling data storage and processing.
Ganapathy Krishnamoorthy
5 min read
Has Summary
--
Meta
Advanced
This article discusses the integration of RocksDB as an embedded database within osquery, an open-source operating system instrumentation framework.
Ted Reed
21 min read
Includes Code
Has Summary
--
Meta
Intermediate
The article discusses enhancements made to the Presto ORC reader, focusing on performance improvements through features like columnar reads, predicate pushdown, and lazy reads.
Dain Sundstrom
10 min read
Includes Code
Has Summary
--
Meta
Advanced
The article discusses a complex metastable failure state encountered in Facebook's systems due to link imbalance in network traffic.
Nathan Bronson
10 min read
Has Summary
--
Meta
Intermediate
The article provides a recap of the Security @Scale 2014 conference, highlighting discussions on scalable security solutions from various companies including Facebook, Twitter, and GitHub.
Fernanda Weiden
7 min read
Has Summary
--
Meta
Intermediate
The article introduces osquery, a framework developed by Facebook that transforms operating systems into high-performance relational databases, allowing users to execute SQL queries for real-time s...
6 min read
Includes Code
Has Summary
--
Meta
Intermediate
The article discusses the deployment of Global Transaction ID (GTID) in MySQL 5.6 at Facebook, highlighting its benefits for failover, backup recovery, and replication.
Evan Elias
9 min read
Has Summary
--
Meta
Advanced
The article discusses Facebook's commitment to open source in 2013, highlighting significant projects and contributions across various domains such as mobile, web, data, and infrastructure.
--
Meta
Intermediate
The article discusses Presto, a distributed SQL query engine developed by Facebook to enable interactive analysis of large datasets stored in their data warehouse.
Martin Traverso
6 min read
Has Summary
--
Meta
Advanced
The article discusses TAO, Facebook's distributed data store designed to efficiently manage the social graph's complex data relationships.
Mark Marchukov
12 min read
Has Summary
--
Meta
Intermediate
The article discusses the integration of research into Facebook's engineering culture, emphasizing the importance of risk-taking, rapid prototyping, and iterative development.
Ralf Herbrich
5 min read
Has Summary
--
Meta
Intermediate
The article features Haiping Zhao, a senior software engineer at Facebook, discussing his work on scalability and distributed SQL databases.
Haiping Zhao
5 min read
Has Summary
--
Meta
Beginner
The article discusses enhancements to Facebook's database backup system, transitioning from logical backups using mysqldump to a custom physical backup model with XtraBackup.
Nagavamsi Ponnekanti
5 min read
Has Summary
--
Meta
Intermediate
The article discusses join optimization techniques in Apache Hive, focusing on improving performance for join operations, which are critical for processing large datasets.
Liyin Tang
8 min read
Includes Code
Has Summary
--
Meta
Advanced
The article discusses the concept of full-stack programming, emphasizing the importance of understanding the various layers of a system for performance and optimization.
14 min read
Includes Code
Has Summary
--