Caching Programming Tutorials & Engineering Articles

Making large Postgres migrations practical: 1TB in 2 hours with PeerDB

Intermediate

The article discusses how PeerDB facilitates large-scale PostgreSQL migrations, specifically achieving a 1TB migration in just 2 hours.

AWS, AWS RDS, Caching, JSON, PostgreSQL

15 min read

Includes Code

Has Summary

Scaling NVFP4 Inference for FLUX.2 on NVIDIA Blackwell Data Center GPUs

Advanced

The article discusses the collaboration between NVIDIA and Black Forest Labs to optimize the FLUX.2 text-to-image model for NVIDIA Blackwell Data Center GPUs.

Caching, Embedding, Mistral

Sandro Cavallari

8 min read

Includes Code

Has Summary

OpenAI

Advanced

Scaling PostgreSQL to power 800 million ChatGPT users

OpenAI details how they scaled PostgreSQL to support 800 million ChatGPT users, achieving millions of queries per second through a single-primary architecture with nearly 50 read replicas across mu...

Azure, Azure Cosmos DB, Caching, Kubernetes, PostgreSQL, SQL

Bohan Zhang

13 min read

Has Summary

Slack

Advanced

Build better software to build software better

Slack's build pipeline team reduced build times for Quip and Slack Canvas from 60 minutes to as little as 10 minutes by applying classic software engineering principles—separation of concerns, cach...

Caching, Chef, CSS, Cython, JavaScript, Jenkins, Less, Python, Rust, TypeScript

David Reed

19 min read

Includes Code

Has Summary

100X Faster: How We Supercharged Netflix Maestro’s Workflow Engine

Advanced

Netflix redesigned Maestro's internal workflow engine, replacing the legacy Conductor 2.x-based stateless worker model with a custom stateful actor model built on Java 21 virtual threads.

Caching, Java, YAML

Netflix Technology Blog

25 min read

Has Summary

Building a Resilient Data Platform with Write-Ahead Log at Netflix

Advanced

This article discusses how Netflix built a resilient data platform using a Write-Ahead Log (WAL) to address data consistency, reliability, and operational efficiency challenges at scale.

Caching, Cassandra, Elasticsearch, Envoy, Memcached, SQL

Netflix Technology Blog

15 min read

Includes Code

Has Summary

Intermediate

Developer Experience at Pinterest: The Journey to PinConsole

The article discusses Pinterest's journey in enhancing developer experience through the creation of PinConsole, an Internal Developer Platform built on Backstage.

AWS, AWS RDS, Caching, CDN, GraphQL, Kubernetes, OAuth, PagerDuty, PostgreSQL, React

Pinterest Engineering

15 min read

Has Summary

Netflix Tudum Architecture: from CQRS with Kafka to CQRS with RAW Hollow

Advanced

The article discusses the evolution of Netflix's Tudum architecture, transitioning from a CQRS model utilizing Kafka to a more efficient system based on RAW Hollow.

Caching, Cassandra, CDN, CQRS, SQL

Netflix Technology Blog

8 min read

Has Summary

No more disks: the architecture behind stateless compute in ClickHouse Cloud

Intermediate

The article discusses the transition of ClickHouse Cloud to a fully stateless compute architecture, enabled by the introduction of a Shared Catalog.

AWS, Caching, SQL

Tom Schreiber

21 min read

Includes Code

Has Summary

Building a Distributed Cache for S3

Intermediate

This article discusses the development of a distributed cache for ClickHouse Cloud, aimed at providing low-latency access to hot data across compute nodes.

AWS, Azure, Azure Blob Storage, Caching

Tom Schreiber

23 min read

Includes Code

Has Summary

Google

Beginner

Gemini 2.5 Models now support implicit caching

The article discusses the introduction of implicit caching support in Gemini 2.5 models, enabling developers to benefit from significant cost savings without needing to create an explicit cache.

Caching, Gemini

Logan Kilpatrick

2 min read

Includes Code

Has Summary

Structuring Applications to Secure the KV Cache

Advanced

The article discusses the importance of structuring application prompts to enhance the security of key-value (KV) caching in large language model (LLM) applications.

Caching, Deep Learning, Machine Learning

Joseph Lucas

11 min read

Includes Code

Has Summary

Mobile Bridge: Making WebViews Feel Native

Beginner

The article discusses Mobile Bridge, a framework developed by Shopify to enhance WebViews in their mobile app, making them feel more native.

Caching, JavaScript, Modal, React

Mauricio de Meirelles

8 min read

Has Summary

Palantir

Intermediate

Requirements for AI in Production in Insurance Underwriting

The article discusses the requirements and best practices for deploying AI in production within the insurance underwriting sector.

Artificial Intelligence, Caching, CSRF, Retrieval Augmented Generation, XSS

Palantir

21 min read

Has Summary

Introducing New KV Cache Reuse Optimizations in NVIDIA TensorRT-LLM

Intermediate

The article introduces new KV cache reuse optimizations in NVIDIA TensorRT-LLM, focusing on improving memory management and throughput for large language models (LLMs).

Caching

John Thomson

7 min read

Includes Code

Has Summary

Data-Efficient Knowledge Distillation for Supervised Fine-Tuning with NVIDIA NeMo-Aligner

Intermediate

The article discusses the implementation of data-efficient knowledge distillation using NVIDIA NeMo-Aligner during supervised fine-tuning (SFT).

Caching, Large Language Models, Neural Networks

Anna Shors

5 min read

Has Summary

Boost Llama 3.3 70B Inference Throughput 3x with NVIDIA TensorRT-LLM Speculative Decoding

Advanced

The article discusses how NVIDIA TensorRT-LLM enhances the inference throughput of Meta's Llama 3.3 70B model by up to 3x through optimizations like speculative decoding and KV caching.

Caching, Hugging Face, Python

Anjali Shah

8 min read

Includes Code

Has Summary

AWS, AWS EC2, Azure, Caching, JSON, Julia, MongoDB, Shell, SQL

Beginner

ClickHouse Release 24.10

ClickHouse version 24.

The ClickHouse Team

15 min read

Includes Code

Has Summary

Introducing Netflix’s TimeSeries Data Abstraction Layer

Intermediate

Netflix's TimeSeries Data Abstraction Layer is designed to efficiently store and query vast amounts of temporal event data with low latency.

Apache, Caching, Cassandra, Elasticsearch

Netflix Technology Blog

22 min read

Includes Code

Has Summary

Advanced

Structured DataStore (SDS): Multi-model Data Management With a Unified Serving Stack

The article discusses the Structured DataStore (SDS), a unified multi-model data management platform developed by Pinterest.

Apache, Caching, Rate Limiting, SQL, Thrift, YAML

Pinterest Engineering

18 min read

Includes Code

Has Summary

Preon: Presto Query Analysis for Intelligent and Efficient Analytics

Intermediate

The article discusses Preon, a microservice developed by Uber for intelligent and efficient query analysis using the Presto SQL engine.

Apache, Caching, Microservices, Redis, SQL

Gurmeet Singh

13 min read

Has Summary

Intermediate

Feature Caching for Recommender Systems w/ Cachelib

The article discusses Pinterest's implementation of feature caching in their recommender systems using Cachelib, an in-process caching engine developed by Meta Open Source.

Apache, AWS, Caching, Thrift

Pinterest Engineering

11 min read

Has Summary

Anthropic

Advanced

Introducing Contextual Retrieval

The article introduces Contextual Retrieval, a method that enhances Retrieval-Augmented Generation (RAG) by improving the retrieval step through Contextual Embeddings and Contextual BM25.

Caching, Claude, Embedding, Gemini

11 min read

Includes Code

Has Summary

Meta

Intermediate

How Meta animates AI-generated images at scale

The article discusses how Meta has optimized the deployment of its AI-generated image animation feature to serve billions of users efficiently.

Caching, Diffusion Models, PyTorch, U-Net

Gaurav Sharma

11 min read

Has Summary

Enabling Security for Hadoop Data Lake on Google Cloud Storage

Advanced

This article discusses Uber's migration of its Apache Hadoop-based data lake to Google Cloud Storage (GCS) and the security measures implemented during this transition.

Apache, Apache Spark, Caching, CQRS, Google Cloud, Google Cloud Storage, gRPC, MVP, OAuth, Redis

Matt Mathew, Alexander Gulko, Lei Sun, KK Sriramadhesikan, Alan Cao, Omkar Kakade

20 min read

Includes Code

Has Summary

Advanced

TiDB Adoption at Pinterest

The article discusses Pinterest's adoption of TiDB as a replacement for HBase, detailing the motivations, selection methodology, and the journey of integrating TiDB into their infrastructure.

Apache, AWS, Caching, Envoy, Kubernetes, MySQL, SQL, Thrift

Pinterest Engineering

19 min read

Has Summary

AI Gateway is generally available: a unified interface for managing and scaling your generative AI workloads

Beginner

The article announces the General Availability of AI Gateway, a unified interface for managing and scaling generative AI workloads.

Caching

Kathy Liao

6 min read

Includes Code

Has Summary

Intermediate

HBase Deprecation at Pinterest

The article discusses Pinterest's transition from HBase, its first NoSQL datastore, to a new serving architecture with a unified storage service.

Apache, AWS, AWS EC2, Caching, MySQL

Pinterest Engineering

7 min read

Has Summary

Google

Intermediate

Google I/O 2024 recap: Making AI accessible and helpful for every developer

The article recaps the Google I/O 2024 event, highlighting advancements in AI technologies aimed at making AI accessible for developers.

Caching, Dart, Firebase, Gemini, Generative AI, Google Cloud, JAX, Keras, Kotlin, Ollama, PostgreSQL, PyTorch, TensorFlow, WebAssembly

Jeanine Banks

8 min read

Has Summary

Notion

Intermediate

Notion on Android is now more than twice as fast to launch

Notion has significantly improved the launch speed of its Android app, making it more than twice as fast compared to the beginning of 2023.

Caching, Firebase, JSON, Render, SQLite

Karn Saheb

11 min read

Includes Code

Has Summary

Ensuring Precision and Integrity: A Deep Dive into Uber’s Accounting Data Testing Strategies

Intermediate

The article delves into Uber's comprehensive accounting data testing strategies, emphasizing the importance of precision and integrity in financial processes.

Apache, Apache Kafka, Caching

Onkar Singh, Harsha Aditya Ravuri, Viswanath Ramakkagari, Aditya Gopisetti, Hari Srinivasan

16 min read

Has Summary

How Uber Serves Over 40 Million Reads Per Second from Online Storage Using an Integrated Cache

Advanced

The article discusses how Uber serves over 40 million reads per second from its online storage using an integrated caching solution called CacheFront.

Caching, Java, Lua, MySQL, Oracle, Redis

Preetham Narayanareddy, Eli Pozniansky, Zurab Kutsia, Afshin Salek, Piyush Patel

19 min read

Has Summary

Birthday Week recap: everything we announced — plus an AI-powered opportunity for startups

Advanced

Cloudflare celebrated its 13th birthday with a series of announcements aimed at enhancing its services for customers and the broader internet community.

AWS, Caching, Cloudflare Workers, Hugging Face, Node.js, Vercel

Dina Kozlov

9 min read

Has Summary

How to Build a Distributed Inference Cache with NVIDIA Triton and Redis

Advanced

This article discusses how to build a distributed inference cache using NVIDIA Triton and Redis, highlighting the benefits and drawbacks of local versus distributed caching.

Caching, Docker, Google Cloud, Redis

Steve Lorello

12 min read

Includes Code

Has Summary

Cursor

Intermediate

Prompt Design

The article discusses the concept of 'prompt design' and draws parallels between prompting in AI and web design.

Caching, GPT, GPT-4, HTML, JSX, React

Arvid

6 min read

Includes Code

Has Summary

Intermediate

Pacer: Pinterest’s New Generation of Asynchronous Computing Platform

The article discusses Pacer, Pinterest's new asynchronous computing platform designed to address the limitations of its predecessor, Pinlater.

Caching, MySQL, Thrift

Pinterest Engineering

9 min read

Has Summary

Optimizing HDFS with DataNode Local Cache for High-Density HDD Adoption

Intermediate

This article discusses Uber's implementation of a local caching solution for HDFS DataNodes to optimize performance while adopting high-density HDDs.

Apache, Caching, Grafana, Java

Chen Liang, Jing Zhao, Yangjun Zhang, Junyan Guo, Fengnan Li

19 min read

Has Summary

Announcing Cohort #2 of the Workers Launchpad

Intermediate

The article announces Cohort #2 of the Workers Launchpad, highlighting the success of the first cohort and introducing 25 new startups selected for the program.

Astro, Caching, CDN, Chakra UI, Cloudflare Workers, Grafana, GraphQL, React, Rollup, Rust, Serverless, WebAssembly

Mia Wang

8 min read

Has Summary

Building Cloudflare on Cloudflare

Advanced

The article discusses how Cloudflare is transitioning its architecture to utilize Cloudflare Workers, aiming to enhance the performance, robustness, and developer experience of its products.

Caching, Cloudflare Workers, Golang, HTML, JavaScript, Lua, NGINX, PHP, Prometheus, Rocket, Rust

Richard Boulton

23 min read

Includes Code

Has Summary

Modernizing the toolbox for Cloudflare Pages builds

Advanced

The article discusses the modernization of the build system for Cloudflare Pages, introducing a new beta version that supports updated tools and languages, including Node.js, Python, and Ruby.

Caching, Kubernetes, Node.js, Ruby

Greg Brimble

8 min read

Includes Code

Has Summary

Speeding up LZ4 in ClickHouse

Intermediate

This article discusses the optimization of LZ4 decompression in ClickHouse, highlighting the challenges and solutions to improve performance.

Caching

Alexey Milovidov

37 min read

Includes Code

Has Summary

Advanced

Reducing Apache Spark Application Dependencies Upload by 99%

The article discusses how LinkedIn reduced the upload of Apache Spark application dependencies by 99% through the implementation of a user-level caching mechanism.

Apache, Apache Spark, Caching

LinkedIn Engineering Team

10 min read

Has Summary

Introducing the ClickHouse Query Cache

Intermediate

The article introduces the ClickHouse Query Cache, a new feature designed to enhance performance by caching the results of expensive SELECT queries.

Apache, Caching, Grafana, MySQL, Oracle

Robert Schulze

10 min read

Includes Code

Has Summary

Beginner

ClickHouse Release 23.1

ClickHouse Release 23.1 introduces significant enhancements including 17 new features, 17 performance optimizations, and 78 bug fixes.

Apache, Caching, Grafana, SQL

The ClickHouse Team

9 min read

Includes Code

Has Summary

Caching Without Marshal Part 2: The Path to MessagePack

Advanced

This article discusses the transition from Ruby's Marshal serialization to MessagePack for caching in Rails applications.

ActiveRecord, Caching, Java, MessagePack, Rails, Ruby

Chris Salzberg

19 min read

Includes Code

Has Summary

Caching Without Marshal Part 1: Marshal from the Inside Out

Advanced

The article discusses the critical role of caching in Rails applications and the inherent risks associated with using Ruby's Marshal for serialization.

ActiveRecord, Caching, Memcached, MessagePack, Monolith, Rails, Redis, Ruby

Chris Salzberg

12 min read

Includes Code

Has Summary

Speed Up Presto at Uber with Alluxio Local Cache

Intermediate

The article discusses Uber's implementation of Alluxio local caching to enhance the performance of Presto, a data analytics engine.

Caching, Grafana, JSON

Chen Liang, Beinan Wang

12 min read

Has Summary

Cloudflare Workers and micro-frontends: made for one another

Advanced

This article discusses the integration of Cloudflare Workers with micro-frontends, presenting a fragments architecture that enhances web application performance and scalability.

Angular, Caching, Cloudflare Workers, HTML, JavaScript, Microservices, Node.js, Qwik, Rails, React, Ruby

Peter Bacon Darwin

14 min read

Includes Code

Has Summary

uBuild: Fast and Safe Building of Thousands of Container Images

Advanced

The article discusses uBuild, Uber's platform for building container images efficiently and securely.

Caching, Docker, Git, Java, JavaScript, YAML

Rasmus Vestergaard, Andreas Lykke

12 min read

Has Summary