# Cassandra Programming Tutorials & Engineering Articles

193 Cassandra tutorials, guides, and engineering insights from Netflix, Uber, Spotify, and more

Cassandra Articles & Tutorials

Netflix
Advanced
This article discusses how Netflix built a resilient data platform using a Write-Ahead Log (WAL) to address data consistency, reliability, and operational efficiency challenges at scale.
Netflix Technology Blog
15 min read
Includes Code
Has Summary
--
Palantir
Advanced
The article discusses how Palantir optimizes Elasticsearch to enhance its defensive capabilities against poor access patterns, particularly focusing on indexing refresh semantics.
Palantir
18 min read
Includes Code
Has Summary
--
Uber
Advanced
This article discusses the development and implementation of forecasting models aimed at improving driver availability at airports, which are critical to Uber's ridesharing ecosystem.
Bob Zheng, Dhruv Ghulati, Manoj Panikkar, Michael (Yichuan) Cai
15 min read
Has Summary
--
Uber
Advanced
The article discusses Uber's implementation of encryption at rest and disk isolation at scale using their Stateful Platform, Odin.
Ivan Shibitov, Johan Abildskov
14 min read
Has Summary
--
Netflix
Advanced
The article discusses the evolution of Netflix's Tudum architecture, transitioning from a CQRS model utilizing Kafka to a more efficient system based on RAW Hollow.
Netflix Technology Blog
8 min read
Has Summary
--
Uber
Advanced
The article discusses Uber's robust database backup recovery system, highlighting its importance for business continuity and disaster recovery.
Arjav Jain, Shivam Vijay, Debadarsini Nayak, Mohammed Khatib, Ramnik Jain
11 min read
Has Summary
--
Uber
Intermediate
The article discusses Uber's migration of large-scale interactive compute workloads from Peloton to Kubernetes, focusing on minimizing disruption while enhancing resource management and cloud readi...
Sayan Pal, Rishabh Mishra
12 min read
Has Summary
--
ClickHouse
Intermediate
The article discusses how Dash0 transitioned to using ClickHouse as a core database technology for their observability platform, leveraging its efficiency and scalability to handle OpenTelemetry da...
Miel Donkers
20 min read
Includes Code
Has Summary
--
OpenAI
Advanced
Unlocking useful and valuable image generation with a natively multimodal model capable of precise, accurate, photorealistic outputs.
OpenAI
12 min read
--
Uber
Advanced
The article discusses Uber's transition to a multi-architecture environment by adopting Arm-based hosts at scale.
Andreas Lykke, Jesper Borlum
10 min read
Has Summary
--
Uber
Advanced
The article discusses The Accounter, a global coordination system developed by Uber to enhance operational throughput and safety on its stateful platform, Odin.
Jesper Borlum, Gianluca Mezzetti, Alexander Blazhenskikh
14 min read
Has Summary
--
Netflix
Advanced
This article discusses Netflix's Distributed Counter Abstraction, a service designed to enable distributed counting at scale while maintaining low latency performance.
Netflix Technology Blog
22 min read
Includes Code
Has Summary
--
Uber
Advanced
The article discusses Uber's advanced settlement accounting system, which is crucial for managing financial transactions involving payment service providers (PSPs).
Onkar Singh, Sai Sameera Grandhi, Nagesh Kumar Mankala, Abhinav Agarwal
12 min read
Has Summary
--
Netflix
Intermediate
Netflix's TimeSeries Data Abstraction Layer is designed to efficiently store and query vast amounts of temporal event data with low latency.
Netflix Technology Blog
22 min read
Includes Code
Has Summary
--
Netflix
Advanced
Netflix's Key-Value Data Abstraction Layer (KV DAL) enhances data access across its distributed databases, addressing challenges in consistency, durability, and performance.
Netflix Technology Blog
16 min read
Includes Code
Has Summary
--
Uber
Intermediate
The article discusses the Sparkle framework developed by Uber to standardize modular ETL processes, enhancing developer productivity and data quality.
Dinesh Jagannathan, Sharath Bhat, Suman Voleti, Praveen Raj
8 min read
Has Summary
--
Uber
Advanced
Odin is Uber's stateful platform designed to manage various technologies for data storage efficiently.
Jesper Borlum, Gianluca Mezzetti
14 min read
Has Summary
--
Uber
Advanced
This article discusses how Uber has implemented single-zone failure tolerance (SZFT) for its Apache Cassandra® database, ensuring high availability even in the event of a zone failure.
Long Pan, Gopal Mor, Jaydeepkumar Chovatia, Shriniket Kale, Gabriele Di Bernardo
12 min read
Has Summary
--
Uber
Advanced
The article discusses Uber's journey in enhancing its Palette Meta Store, focusing on the challenges faced, the solutions implemented, and the resulting improvements in machine learning feature man...
Paarth Chothani, Nicholas Marcott, Dehua Lai, Xiyuan Feng, Chunhao Zhang, Victoria Wu
10 min read
Has Summary
--
Uber
Intermediate
The article discusses the evolution of Data Lifecycle Management (DLM) at Uber, detailing the journey from initial implementations to the development of a unified system.
Sumanth Srinivasa Krishnaswamy, Matt Mathew, Sonali Goyal
13 min read
Has Summary
--
Uber
Intermediate
The article discusses how Uber optimized its operations of the open-source Apache Cassandra database at scale, addressing various challenges and improvements made over time.
Jaydeepkumar Chovatia, Gopal Mor, Runtian Liu
11 min read
Has Summary
--
Uber
Advanced
Cadence 1.0 is a powerful open-source workflow orchestration platform designed for building and managing stateful services at scale.
Ender Demirkaya
10 min read
Has Summary
--
Spotify
Advanced
This article discusses Spotify's transition to a declarative infrastructure model using Kubernetes, enabling efficient management of cloud resources across numerous services.
--
Netflix
Intermediate
The article discusses Netflix's development of a Media Understanding Platform that integrates machine learning capabilities into studio applications.
Netflix Technology Blog
14 min read
Has Summary
--
Netflix
Advanced
The article discusses Netflix's data ingestion pipeline, specifically focusing on the Annotation Operations concept that allows teams to create data pipelines for media annotations without concerni...
Netflix Technology Blog
8 min read
Includes Code
Has Summary
--
Netflix
Intermediate
The article discusses Netflix's efforts to scale its media machine learning infrastructure, focusing on the challenges faced by media ML practitioners and the solutions developed to optimize and st...
Netflix Technology Blog
12 min read
Includes Code
Has Summary
--
Netflix
Intermediate
The article discusses Marken, a scalable annotation service developed by Netflix to allow various microservices to annotate their entities with metadata.
Netflix Technology Blog
13 min read
Includes Code
Has Summary
--
Netflix
Intermediate
The article discusses the development of a data reprocessing pipeline within Netflix's Asset Management Platform (AMP), designed to efficiently manage and update digital media assets' metadata.
Netflix Technology Blog
9 min read
Has Summary
--
Uber
Advanced
The article discusses Uber's transition from a Server-Sent Events (SSE) architecture to a gRPC-based push platform, detailing the motivations, implementation challenges, and outcomes of this migrat...
Anirudh Raja, Shahbaz Kaladiya, Shivani Bhatia, Xinlin Peng
19 min read
Has Summary
--
Netflix
Intermediate
The article discusses Data Mesh, a data movement and processing platform developed by Netflix, aimed at enhancing real-time data processing capabilities.
Netflix Technology Blog
8 min read
Has Summary
--
Uber
Intermediate
The article discusses Uber's implementation of an automated vertical CPU scaling system that optimizes resource allocation for storage workloads, leading to significant cost savings and improved re...
Lasse Vilhelmsen
10 min read
Has Summary
--
Uber
Advanced
The article discusses strategies to avoid CPU throttling in a containerized environment, particularly at Uber, where stateful workloads run on a large fleet of hosts.
Joakim Recht, Yury Vostrikov
7 min read
Has Summary
--
Netflix
Intermediate
The article discusses the Rapid Event Notification System (RENO) developed by Netflix to ensure real-time communication between backend systems and devices, enhancing the user experience for over 2...
Netflix Technology Blog
10 min read
Has Summary
--
NVIDIA
Intermediate
This article discusses the implementation and usage of sstable-to-arrow, a tool designed to convert SSTable data from Cassandra into Arrow format for GPU-based analytics.
Alex Cai
5 min read
Has Summary
--
Uber
Advanced
The article discusses the challenges and solutions in building scalable streaming pipelines for generating near real-time features at Uber.
Feng Xu, Gang Zhao
19 min read
Has Summary
--
NVIDIA
Intermediate
This article discusses a novel approach to analyzing data stored in Apache Cassandra using GPU acceleration through the RAPIDS ecosystem.
--
NVIDIA
Intermediate
This article discusses the importance of efficient memory layouts and memory pools in machine learning frameworks to enhance interoperability and performance.
--
Uber
Intermediate
The article discusses Uber's comprehensive re-architecture of its Fulfillment Platform, aimed at enhancing its Go/Get strategy.
Ashwin Neerabail, Ankit Srivastava, Kamran Massoudi, Madan Thangavelu, Uday Kiran Medisetty
19 min read
Has Summary
--
Uber
Intermediate
The article discusses Uber's 'Orders Near You' feature, which utilizes real-time geospatial data analytics to enhance user experience in the Uber Eats app.
Yupeng Fu, Cassandra Tomazic, Dharak Kharod
10 min read
Has Summary
--
Netflix
Intermediate
The article discusses the Elasticsearch indexing strategy implemented in Netflix's Asset Management Platform (AMP), focusing on how to efficiently manage and query large volumes of digital media as...
Netflix Technology Blog
12 min read
Includes Code
Has Summary
--
Netflix
Intermediate
The article discusses the Netflix Data Explorer, an open-source tool designed to provide engineers with fast and safe access to data stored in Cassandra and Dynomite/Redis.
Netflix Technology Blog
8 min read
Has Summary
--
Uber
Advanced
The article discusses the challenges posed by flaky unit tests in Java, particularly in the context of Continuous Integration (CI) systems.
Ravi Agarwal, Lazaro Clapp, Gautam Korlam, Murali Krishna Ramanathan, Vijay Subramanian
19 min read
Has Summary
--
Palantir
Intermediate
The article discusses a specific bug related to undeclared dependencies in a software product, illustrating how this oversight was caught by a continuous delivery system named Apollo.
Robert Fink
9 min read
Includes Code
Has Summary
--
Uber
Advanced
The article discusses the evolution of Uber's Schemaless datastore into a distributed SQL database called Docstore, highlighting its features, architecture, and motivation behind the transition.
Ovais Tariq, Deba Chatterjee, Himank Chaudhary
9 min read
Has Summary
--
Uber
Advanced
The article discusses Uber's journey towards metric standardization through the development of uMetric, a unified internal metric platform.
Xiaodong Wang, Wenrui Meng, Will Yu, Yun Wu
13 min read
Has Summary
--
Uber
Advanced
Uber's Real-Time Push Platform focuses on enhancing user experiences by transitioning from polling to a gRPC-based bi-directional streaming protocol.
Uday Kiran Medisetty, Nilesh Mahajan, Anirudh Raja, Madan Thangavelu
19 min read
Has Summary
--
Netflix
Advanced
This article discusses Netflix's implementation of GraphQL Federation, detailing the core infrastructure, developer experience, schema governance, observability, security, and resilience strategies...
Netflix Technology Blog
13 min read
Has Summary
--
Uber
Intermediate
The article discusses Uber's development of uWorc, a no-code workflow orchestrator designed to simplify the creation of batch and streaming data pipelines.
Sandeep Karmakar, Sriharsha Chintalapani
11 min read
Has Summary
--
Slack
Advanced
This article discusses Slack's transition from MySQL to Vitess for scaling their datastore architecture.
--
Uber
Advanced
The article discusses Uber's Databook, an in-house platform designed to manage and surface metadata related to various data entities.
Sunheng Taing, Atul Gupte
25 min read
Has Summary
--