ClickHouse logo

How ClickHouse Uses Python

66 engineering articles about Python from ClickHouse's engineering team

Articles

ClickHouse
Advanced
This article details how ClickPy, a free Python download statistics platform powered by ClickHouse, scaled to over 2 trillion rows by replacing its legacy cron-based ingestion pipeline with ClickPi...
8 min read
Includes Code
Has Summary
--
ClickHouse
Intermediate
The article discusses the development of chDB, a Python library that integrates ClickHouse with Pandas DataFrames for high-performance SQL querying.
Xiaozhe Yu & Auxten Wang
10 min read
Includes Code
Has Summary
--
ClickHouse
Advanced
At ClickHouse, we don't like the word "impossible." We believe that with the right tools, everything is a data problem. To prove it, we decided to complete the 2025 Advent of Code unconventionally: using pure ClickHouse SQL.
48 min read
Includes Code
--
ClickHouse
Advanced
The November 2025 edition of What's New in ClickStack highlights several new features and improvements in the open-source observability stack built for ClickHouse.
8 min read
Includes Code
Has Summary
--
ClickHouse
Intermediate
This article benchmarks five major cloud data warehouses (Snowflake, Databricks, ClickHouse Cloud, BigQuery, and Redshift) across various scales of data to compare their cost-performance.
Tom Schreiber & Lionel Palacin
16 min read
Includes Code
Has Summary
--
ClickHouse
Intermediate
The article details the journey of upgrading the chDB kernel from ClickHouse v25.5 to v25.8.2.
Victor Gao
18 min read
Includes Code
Has Summary
--
ClickHouse
Advanced
The October 2025 edition of What's New in ClickStack highlights significant updates to the open-source observability stack for ClickHouse, including the introduction of alerting features, customiza...
9 min read
Includes Code
Has Summary
--
ClickHouse
Intermediate
This article discusses how to enhance log compression through log clustering techniques in ClickHouse, focusing on transforming unstructured logs into structured data for efficient storage.
Lionel Palacin
17 min read
Includes Code
Has Summary
--
ClickHouse
Intermediate
This article explores the process of tracing OpenAI agents using ClickStack, demonstrating how to build an OpenAI agent that interacts with ClickHouse and visualizes the decision-making process.
Mark Needham
14 min read
Includes Code
Has Summary
--
ClickHouse
Advanced
This article discusses how to instrument a Next.js application using OpenTelemetry and ClickStack, focusing on the integration of observability and analytics through ClickHouse.
--
ClickHouse
Intermediate
This article discusses the process of creating reproducible ZIP archives for AWS Lambda functions, focusing on challenges such as file order, timestamp management, and OS compatibility.
Misha Shiryaev
4 min read
Includes Code
Has Summary
--
ClickHouse
Beginner
This article discusses the implementation of Change Data Capture (CDC) from Delta Lake to ClickHouse, detailing the architecture, components, and a reference implementation in Python.
Pete Hampton & Kelsey Schlarman
19 min read
Includes Code
Has Summary
--
ClickHouse
Beginner
The article explores the potential need for Object-Relational Mappers (ORMs) in Online Analytical Processing (OLAP) environments, particularly focusing on ClickHouse.
Fiveonefour & ClickHouse Team
17 min read
Includes Code
Has Summary
--
ClickHouse
Intermediate
The article discusses the importance of LLM observability using ClickStack, OpenTelemetry, and MCP, highlighting how to instrument LibreChat for enhanced insights into AI-driven applications.
Dale McDiarmid & Lionel Palacin
15 min read
Includes Code
Has Summary
--
ClickHouse
Beginner
The article explores the analysis of Wimbledon tennis data using ClickHouse, detailing the unique scoring system of tennis and how to implement a function to compute points needed to win a game.
Mark Needham
10 min read
Includes Code
Has Summary
--
ClickHouse
Advanced
The article discusses the integration of ClickHouse with the Model Context Protocol (MCP), highlighting its benefits for connecting third-party services to large language models (LLMs).
Al Brown & Mark Needham
8 min read
Includes Code
Has Summary
--
ClickHouse
Intermediate
The article discusses how ClickHouse efficiently queries Parquet files, a key storage format for Lakehouse architectures, without requiring data ingestion.
Tom Schreiber
26 min read
Includes Code
Has Summary
--
ClickHouse
Beginner
The article introduces AgentHouse, an interactive demo environment that integrates ClickHouse's real-time analytics with Anthropic's large language model, Claude Sonnet.
Dmitry Pavlov
5 min read
Has Summary
--
ClickHouse
Intermediate
The article announces the launch of Ruby gem analytics powered by ClickHouse, enabling Ruby developers to analyze gem download data since 2017.
The ClickHouse & Ruby Central teams
21 min read
Includes Code
Has Summary
--
ClickHouse
Intermediate
This article provides a comprehensive guide on integrating PostgreSQL with ClickHouse using Change Data Capture (CDC) for real-time analytics.
Lionel Palacin & Sai Srirampur
23 min read
Includes Code
Has Summary
--
ClickHouse
Beginner
This article explores the various input formats supported by ClickHouse for data ingestion, focusing on performance and efficiency.
Tom Schreiber
26 min read
Includes Code
Has Summary
--
ClickHouse
Beginner
This article discusses the implementation of the Medallion architecture using ClickHouse, a powerful database management system.
PME Team
13 min read
Includes Code
Has Summary
--
ClickHouse
Advanced
This article discusses building single page applications (SPAs) using ClickHouse with a focus on a 'client only' architecture.
Dale McDiarmid
22 min read
Includes Code
Has Summary
--
ClickHouse
Intermediate
This article explores the integration of the Perspective library with ClickHouse to create real-time visualizations of streaming Forex data.
Dale McDiarmid
14 min read
Includes Code
Has Summary
--
ClickHouse
Advanced
This article discusses how to quickly integrate analytics into applications using ClickHouse Cloud Query Endpoints, highlighting the benefits of API endpoints for simplifying SQL interactions.
Dale McDiarmid
12 min read
Includes Code
Has Summary
--
ClickHouse
Intermediate
This article discusses how to model machine learning data in OLAP databases, specifically using ClickHouse as an example.
Dale McDiarmid
29 min read
Includes Code
Has Summary
--
ClickHouse
Beginner
CryptoHouse is a free blockchain analytics platform powered by ClickHouse, enabling real-time SQL queries on blockchain data.
The ClickHouse & Goldsky teams
13 min read
Includes Code
Has Summary
--
ClickHouse
Beginner
ClickHouse Release 24.6 introduces 23 new features, 24 performance optimizations, and 59 bug fixes, enhancing its capabilities for data management and analysis.
The ClickHouse Team
17 min read
Includes Code
Has Summary
--
ClickHouse
Intermediate
This article compares ClickHouse and Elasticsearch in terms of performance for large-scale data analytics, particularly focusing on `count(*)` aggregations over billions of rows.
Tom Schreiber
32 min read
Includes Code
Has Summary
--
ClickHouse
Beginner
This article compares ClickHouse and Elasticsearch, focusing on their mechanics for count aggregations.
Tom Schreiber
17 min read
Includes Code
Has Summary
--
ClickHouse
Intermediate
This article discusses how to implement K-Means clustering using ClickHouse SQL, demonstrating its efficiency in handling large datasets, such as 170 million NYC taxi rides, in under 3 minutes.
Dale McDiarmid
23 min read
Includes Code
Has Summary
--
ClickHouse
Intermediate
This article explores how ClickHouse can be utilized as a feature store to train machine learning models, specifically focusing on the integration with Featureform.
Dale McDiarmid
28 min read
Includes Code
Has Summary
--
ClickHouse
Advanced
This article discusses the management of ClickHouse schemas as code using the Atlas tool, highlighting the transition from schema-less technologies to structured data management.
Rotem Tamir
6 min read
Includes Code
Has Summary
--
ClickHouse
Beginner
ClickHouse Release 24.2 introduces significant enhancements including 18 new features, 18 performance optimizations, and 49 bug fixes.
The ClickHouse Team
16 min read
Includes Code
Has Summary
--
ClickHouse
Intermediate
This article discusses ClickHouse's performance in handling large datasets, specifically addressing the 1 trillion row challenge.
Dale McDiarmid
19 min read
Includes Code
Has Summary
--
ClickHouse
Advanced
This article details the process of building a chatbot named 'HackBot' that utilizes data from Hacker News and Stack Overflow, leveraging ClickHouse and LlamaIndex.
--
ClickHouse
Beginner
This article explores the use of Apache Iceberg and ClickHouse to analyze global internet speeds using the Ookla dataset.
Dale McDiarmid
28 min read
Includes Code
Has Summary
--
ClickHouse
Intermediate
The article discusses a hybrid query execution experiment using ClickHouse, highlighting the use of ClickHouse Local and ClickPy to analyze GitHub metrics alongside PyPI package downloads.
Mark Needham
12 min read
Includes Code
Has Summary
--
ClickHouse
Intermediate
This article explores the role of Feature Stores in MLOps and how ClickHouse can enhance their performance and flexibility.
Dale McDiarmid
20 min read
Has Summary
--
ClickHouse
Beginner
This article explores the implementation of linear regression using ClickHouse's machine learning functions, focusing on predicting delivery times based on distance and pickup hour.
Ensemble
8 min read
Includes Code
Has Summary
--
ClickHouse
Intermediate
This article explores the implementation of Approximate Nearest Neighbor (ANN) vector search using SQL-powered Locality Sensitive Hashing (LSH) and random projections in ClickHouse.
Dale McDiarmid & Alexey Milovidov
32 min read
Includes Code
Has Summary
--
ClickHouse
Beginner
This article discusses the development of a Retrieval-Augmented Generation (RAG) pipeline for Google Analytics using ClickHouse and Amazon Bedrock.
Dale McDiarmid
30 min read
Includes Code
Has Summary
--
ClickHouse
Beginner
The article discusses how to query Pandas DataFrames using ClickHouse through the chDB library, enabling users to leverage ClickHouse's SQL capabilities for data analysis.
Mark Needham
5 min read
Includes Code
Has Summary
--
ClickHouse
Beginner
ClickHouse Release 23.10 introduces a range of new features, performance optimizations, and bug fixes, enhancing its capabilities for data processing and analytics.
The ClickHouse Team
13 min read
Includes Code
Has Summary
--
ClickHouse
Intermediate
This article discusses how to leverage ClickHouse's machine learning functions for forecasting, specifically using Stochastic Linear Regression and evalMLMethod.
Ensemble
11 min read
Includes Code
Has Summary
--
ClickHouse
Beginner
ClickHouse Release 23.9 introduces a variety of new features, performance optimizations, and bug fixes aimed at enhancing user experience and functionality.
The ClickHouse Team
14 min read
Includes Code
Has Summary
--
ClickHouse
Intermediate
ClickHouse Keeper is an open-source alternative to ZooKeeper, designed for better resource efficiency and performance in distributed systems.
Tom Schreiber and Derek Chia
19 min read
Includes Code
Has Summary
--
ClickHouse
Beginner
The article discusses chDB, a Python module that embeds the ClickHouse OLAP engine, enabling efficient SQL execution on large datasets.
@Auxten
10 min read
Includes Code
Has Summary
--
ClickHouse
Beginner
This article discusses optimizing large data loads in ClickHouse by focusing on performance and resource usage factors.
Tom Schreiber
12 min read
Includes Code
Has Summary
--
ClickHouse
Beginner
ClickHouse Release 23.8 introduces a variety of new features, performance optimizations, and bug fixes, enhancing the capabilities of this columnar database.
The ClickHouse Team
12 min read
Includes Code
Has Summary
--