# Cassandra Programming Tutorials & Engineering Articles
193 Cassandra tutorials, guides, and engineering insights from Netflix, Uber, Spotify, and more
Cassandra Articles & Tutorials
This article discusses how Netflix built a resilient data platform using a Write-Ahead Log (WAL) to address data consistency, reliability, and operational efficiency challenges at scale.
The article discusses how Palantir optimizes Elasticsearch to enhance its defensive capabilities against poor access patterns, particularly focusing on indexing refresh semantics.
Palantir
18 min read
Includes Code
Has Summary
--
This article discusses the development and implementation of forecasting models aimed at improving driver availability at airports, which are critical to Uber's ridesharing ecosystem.
Bob Zheng, Dhruv Ghulati, Manoj Panikkar, Michael (Yichuan) Cai
15 min read
Has Summary
--
The article discusses Uber's implementation of encryption at rest and disk isolation at scale using their Stateful Platform, Odin.
Ivan Shibitov, Johan Abildskov
14 min read
Has Summary
--
The article discusses the evolution of Netflix's Tudum architecture, transitioning from a CQRS model utilizing Kafka to a more efficient system based on RAW Hollow.
The article discusses Uber's robust database backup recovery system, highlighting its importance for business continuity and disaster recovery.
The article discusses Uber's migration of large-scale interactive compute workloads from Peloton to Kubernetes, focusing on minimizing disruption while enhancing resource management and cloud readi...
Sayan Pal, Rishabh Mishra
12 min read
Has Summary
--
The article discusses how Dash0 transitioned to using ClickHouse as a core database technology for their observability platform, leveraging its efficiency and scalability to handle OpenTelemetry da...
Miel Donkers
20 min read
Includes Code
Has Summary
--
Unlocking useful and valuable image generation with a natively multimodal model capable of precise, accurate, photorealistic outputs.
The article discusses Uber's transition to a multi-architecture environment by adopting Arm-based hosts at scale.
The article discusses The Accounter, a global coordination system developed by Uber to enhance operational throughput and safety on its stateful platform, Odin.
Jesper Borlum, Gianluca Mezzetti, Alexander Blazhenskikh
14 min read
Has Summary
--
This article discusses Netflix's Distributed Counter Abstraction, a service designed to enable distributed counting at scale while maintaining low latency performance.
The article discusses Uber's advanced settlement accounting system, which is crucial for managing financial transactions involving payment service providers (PSPs).
Onkar Singh, Sai Sameera Grandhi, Nagesh Kumar Mankala, Abhinav Agarwal
12 min read
Has Summary
--
Netflix's TimeSeries Data Abstraction Layer is designed to efficiently store and query vast amounts of temporal event data with low latency.
Netflix Technology Blog
22 min read
Includes Code
Has Summary
--
Netflix's Key-Value Data Abstraction Layer (KV DAL) enhances data access across its distributed databases, addressing challenges in consistency, durability, and performance.
The article discusses the Sparkle framework developed by Uber to standardize modular ETL processes, enhancing developer productivity and data quality.
Dinesh Jagannathan, Sharath Bhat, Suman Voleti, Praveen Raj
8 min read
Has Summary
--
Odin is Uber's stateful platform designed to manage various technologies for data storage efficiently.
Jesper Borlum, Gianluca Mezzetti
14 min read
Has Summary
--
This article discusses how Uber has implemented single-zone failure tolerance (SZFT) for its Apache Cassandra® database, ensuring high availability even in the event of a zone failure.
The article discusses Uber's journey in enhancing its Palette Meta Store, focusing on the challenges faced, the solutions implemented, and the resulting improvements in machine learning feature man...
Paarth Chothani, Nicholas Marcott, Dehua Lai, Xiyuan Feng, Chunhao Zhang, Victoria Wu
10 min read
Has Summary
--
The article discusses the evolution of Data Lifecycle Management (DLM) at Uber, detailing the journey from initial implementations to the development of a unified system.
Sumanth Srinivasa Krishnaswamy, Matt Mathew, Sonali Goyal
13 min read
Has Summary
--
The article discusses how Uber optimized its operations of the open-source Apache Cassandra database at scale, addressing various challenges and improvements made over time.
Cadence 1.0 is a powerful open-source workflow orchestration platform designed for building and managing stateful services at scale.
This article discusses Spotify's transition to a declarative infrastructure model using Kubernetes, enabling efficient management of cloud resources across numerous services.
Ansible, Apache, Apache Kafka, Cassandra, Docker, Elasticsearch, Google Cloud, JSON, Kubernetes, Memcached, PostgreSQL, Puppet, Terraform, TypeScript, YAML
David Flemström
11 min read
Includes Code
Has Summary
--
The article discusses Netflix's development of a Media Understanding Platform that integrates machine learning capabilities into studio applications.
Netflix Technology Blog
14 min read
Has Summary
--
The article discusses Netflix's data ingestion pipeline, specifically focusing on the Annotation Operations concept that allows teams to create data pipelines for media annotations without concerni...
Netflix Technology Blog
8 min read
Includes Code
Has Summary
--
The article discusses Netflix's efforts to scale its media machine learning infrastructure, focusing on the challenges faced by media ML practitioners and the solutions developed to optimize and st...
Netflix Technology Blog
12 min read
Includes Code
Has Summary
--
The article discusses Marken, a scalable annotation service developed by Netflix to allow various microservices to annotate their entities with metadata.
Netflix Technology Blog
13 min read
Includes Code
Has Summary
--
The article discusses the development of a data reprocessing pipeline within Netflix's Asset Management Platform (AMP), designed to efficiently manage and update digital media assets' metadata.
Netflix Technology Blog
9 min read
Has Summary
--
The article discusses Uber's transition from a Server-Sent Events (SSE) architecture to a gRPC-based push platform, detailing the motivations, implementation challenges, and outcomes of this migrat...
The article discusses Data Mesh, a data movement and processing platform developed by Netflix, aimed at enhancing real-time data processing capabilities.
Netflix Technology Blog
8 min read
Has Summary
--
The article discusses Uber's implementation of an automated vertical CPU scaling system that optimizes resource allocation for storage workloads, leading to significant cost savings and improved re...
Lasse Vilhelmsen
10 min read
Has Summary
--
The article discusses strategies to avoid CPU throttling in a containerized environment, particularly at Uber, where stateful workloads run on a large fleet of hosts.
Joakim Recht, Yury Vostrikov
7 min read
Has Summary
--
The article discusses the Rapid Event Notification System (RENO) developed by Netflix to ensure real-time communication between backend systems and devices, enhancing the user experience for over 2...
Netflix Technology Blog
10 min read
Has Summary
--
This article discusses the implementation and usage of sstable-to-arrow, a tool designed to convert SSTable data from Cassandra into Arrow format for GPU-based analytics.
The article discusses the challenges and solutions in building scalable streaming pipelines for generating near real-time features at Uber.
Feng Xu, Gang Zhao
19 min read
Has Summary
--
This article discusses a novel approach to analyzing data stored in Apache Cassandra using GPU acceleration through the RAPIDS ecosystem.
Alex Cai
9 min read
Includes Code
Has Summary
--
This article discusses the importance of efficient memory layouts and memory pools in machine learning frameworks to enhance interoperability and performance.
Christian Hundt
9 min read
Includes Code
Has Summary
--
The article discusses Uber's comprehensive re-architecture of its Fulfillment Platform, aimed at enhancing its Go/Get strategy.
Ashwin Neerabail, Ankit Srivastava, Kamran Massoudi, Madan Thangavelu, Uday Kiran Medisetty
19 min read
Has Summary
--
The article discusses Uber's 'Orders Near You' feature, which utilizes real-time geospatial data analytics to enhance user experience in the Uber Eats app.
Yupeng Fu, Cassandra Tomazic, Dharak Kharod
10 min read
Has Summary
--
The article discusses the Elasticsearch indexing strategy implemented in Netflix's Asset Management Platform (AMP), focusing on how to efficiently manage and query large volumes of digital media as...
The article discusses the Netflix Data Explorer, an open-source tool designed to provide engineers with fast and safe access to data stored in Cassandra and Dynomite/Redis.
The article discusses the challenges posed by flaky unit tests in Java, particularly in the context of Continuous Integration (CI) systems.
Ravi Agarwal, Lazaro Clapp, Gautam Korlam, Murali Krishna Ramanathan, Vijay Subramanian
19 min read
Has Summary
--
The article discusses a specific bug related to undeclared dependencies in a software product, illustrating how this oversight was caught by a continuous delivery system named Apollo.
The article discusses the evolution of Uber's Schemaless datastore into a distributed SQL database called Docstore, highlighting its features, architecture, and motivation behind the transition.
The article discusses Uber's journey towards metric standardization through the development of uMetric, a unified internal metric platform.
Uber's Real-Time Push Platform focuses on enhancing user experiences by transitioning from polling to a gRPC-based bi-directional streaming protocol.
This article discusses Netflix's implementation of GraphQL Federation, detailing the core infrastructure, developer experience, schema governance, observability, security, and resilience strategies...
The article discusses Uber's development of uWorc, a no-code workflow orchestrator designed to simplify the creation of batch and streaming data pipelines.
This article discusses Slack's transition from MySQL to Vitess for scaling their datastore architecture.