How Pinterest Uses SQL
33 engineering articles about SQL from Pinterest's engineering team
Articles
The article reflects on a decade of AI platform development at Pinterest, detailing the evolution from fragmented machine learning stacks to a unified AI platform that supports various models.
AutoML, Docker, Embedding, Generative AI, Java, Kubernetes, LightGBM, PySpark, Python, PyTorch, Seed, SQL, TensorFlow, Thrift, Transformer
Pinterest Engineering
22 min read
Has Summary
--
This article discusses Pinterest's transition from a Hadoop-based platform to a Kubernetes-based data processing solution named Moka.
The article discusses the challenges of understanding metric movements and outlines three approaches used at Pinterest for root-cause analysis (RCA).
Pinterest Engineering
9 min read
Has Summary
--
The article discusses the Structured DataStore (SDS), a unified multi-model data management platform developed by Pinterest.
The article discusses Pinterest's migration from Druid to StarRocks for delivering faster analytics.
The article discusses Pinterest's adoption of TiDB as a replacement for HBase, detailing the motivations, selection methodology, and the journey of integrating TiDB into their infrastructure.
The article discusses Pinterest's development of a Text-to-SQL feature that utilizes Large Language Models (LLMs) to assist data users in generating SQL queries from natural language questions.
Pinterest Engineering
9 min read
Has Summary
--
This article discusses the improvements made to the Goku time series database at Pinterest, focusing on enhancing query efficiency through features like rollup, pre-aggregation, and pagination.
This article discusses the development of an end-to-end JSON logging system for client applications at Pinterest, highlighting the challenges faced with existing logging methods and the design deci...
Pinterest Engineering
5 min read
Has Summary
--
The article discusses how Pinterest developed its Trust & Safety team to combat spam and unsafe content.
This article is the second part of a series discussing Pinterest Analytics as a platform on Druid, focusing on lessons learned regarding optimization for batch use cases.
This article discusses Pinterest's Batch Processing Platform, Monarch, focusing on efficient resource management to ensure quality of service (QoS) while maintaining cost efficiency.
The article discusses how Pinterest combats spam through clustering and automated rule creation, emphasizing the importance of quickly identifying and mitigating spam attacks to protect user safety.
Pinterest Engineering
8 min read
Has Summary
--
This article is the second part of a series discussing Pinterest's use of Druid for analytics.
The article discusses a novel approach to improving S3 read throughput, resulting in significant efficiency gains for production jobs.
Pinterest Engineering
6 min read
Has Summary
--
The article discusses Pinterest's approach to combating spam through clustering and automated rule creation.
Pinterest Engineering
8 min read
Has Summary
--
The article discusses how Pinterest utilizes Apache Spark SQL for interactive querying, detailing the architecture, challenges faced, and solutions implemented to enhance user experience.
Pinterest Engineering
18 min read
Has Summary
--
The article discusses how Pinterest improved data processing efficiency by implementing partial deserialization of Thrift encoded data.
The article discusses the integration of Snowflake with Tableau, focusing on setting up secondary roles to enhance role-based access control (RBAC) for data visualization.
The article discusses how Pinterest employs machine learning to combat spam and harmful content on its platform.
Pinterest Engineering
5 min read
Has Summary
--
The article discusses how Pinterest employs machine learning to combat misinformation, hate speech, and self-harm content on its platform.
Pinterest Engineering
7 min read
Has Summary
--
The article discusses Guardian, a real-time analytics and rules engine developed by Pinterest's Trust & Safety team to combat spam.
The article discusses improvements to PinLater, Pinterest's asynchronous job execution system, focusing in particular on the transition from Redis to MySQL/InnoDB as the backend data store.
The article discusses how Pinterest empowered its data scientists and machine learning engineers by building a PySpark infrastructure that addresses challenges faced with existing tools like Hive a...
Pinterest Engineering
7 min read
Has Summary
--
The article discusses Pinterest's transition from using Apache HBase to Apache Druid for ads analytics, highlighting the challenges faced and the benefits of Druid's capabilities in handling comple...
The article discusses the concept of 'Pinterest Paths', which describes the exploration behavior of users on Pinterest as they navigate through related ideas.
The article discusses the enhancements made to a Spark pipeline for conversion attribution at Pinterest, focusing on scalability as the number of users and advertisers grows.
The article discusses Pinterest's implementation of Presto, an open-source distributed SQL query engine, detailing the challenges faced and solutions developed to manage large-scale data analysis.
The article discusses the development of Aperture, a real-time user action counting system for ads at Pinterest.
Pinterest Engineering
9 min read
Includes Code
Has Summary
--
This article discusses the implementation of a feature that allows users to reorder Pins on Pinterest boards, addressing the technical challenges involved in scaling the backend service.
Skyline is an ETL-as-a-Service platform developed by Pinterest to streamline data processing and reporting for its users.
The article discusses Pinterest's implementation of a real-time data pipeline for analytics, leveraging technologies like Spark Streaming and MemSQL.
Pinterest Engineering
4 min read
Has Summary
--
The article discusses how Pinterest manages its big data infrastructure, detailing the evolution from a single-cluster Hadoop setup to a self-service platform that supports extensive data processin...