How We Built Uber Engineering’s Highest Query per Second Service Using Go

Kai Wei

Uber

Overview

This article describes how Uber Engineering built its highest queries-per-second (QPS) service, a geofence lookup microservice, in Go. It covers the architectural decisions, the measured performance, and the reasons for choosing Go over Node.js.

What You'll Learn

1. How to implement a high-throughput geofence lookup service using Go
2. Why Go is preferred over Node.js for CPU-intensive workloads
3. How to maintain high availability and low latency in microservices

Prerequisites & Requirements

Basic understanding of microservices architecture
Familiarity with geofencing concepts (optional)

Key Questions Answered

How does Uber's geofence lookup service achieve high query performance?

Uber's geofence lookup service is written in Go to meet a requirement of hundreds of thousands of queries per second with 99th-percentile response times under 100 milliseconds. The architecture is stateless, so any instance can serve any request and capacity scales by adding instances.

What architectural decisions were made for the geofence service?

The service was designed to be stateless, so any instance can handle any request. Each instance polls for geofence data on a deterministic schedule, which keeps the data synchronized across instances and the query results consistent.

Why was Go chosen over Node.js for this service?

Go was chosen because geofence lookups are CPU-intensive and must return with low latency. Node.js runs JavaScript on a single-threaded event loop, so CPU-bound work in one request delays the others; Go's goroutines can run on multiple CPU cores, which suits high query volumes.
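To make "CPU-intensive geofence lookup" concrete, here is a minimal sketch of a point-in-polygon test using ray casting, followed by a linear scan over fences. The `Point` and `Geofence` types and the `Lookup` function are hypothetical names chosen for illustration; this summary does not describe the service's actual data structures or indexing.

```go
package main

import "fmt"

// Point is a longitude/latitude pair (hypothetical type for this sketch).
type Point struct{ Lng, Lat float64 }

// Geofence is a named simple polygon; the last vertex connects back to the first.
type Geofence struct {
	Name     string
	Vertices []Point
}

// Contains reports whether p lies inside the polygon using ray casting:
// count the edges crossed by a horizontal ray from p; an odd count means inside.
func (g Geofence) Contains(p Point) bool {
	inside := false
	n := len(g.Vertices)
	for i, j := 0, n-1; i < n; j, i = i, i+1 {
		a, b := g.Vertices[i], g.Vertices[j]
		// Only edges that straddle p's latitude can be crossed by the ray.
		if (a.Lat > p.Lat) != (b.Lat > p.Lat) {
			// Longitude at which edge a-b crosses p's latitude.
			x := a.Lng + (p.Lat-a.Lat)*(b.Lng-a.Lng)/(b.Lat-a.Lat)
			if p.Lng < x {
				inside = !inside
			}
		}
	}
	return inside
}

// Lookup scans the fences in order and returns the first one containing p.
func Lookup(fences []Geofence, p Point) (string, bool) {
	for _, g := range fences {
		if g.Contains(p) {
			return g.Name, true
		}
	}
	return "", false
}

func main() {
	fences := []Geofence{{
		Name:     "downtown",
		Vertices: []Point{{0, 0}, {4, 0}, {4, 4}, {0, 4}},
	}}
	name, ok := Lookup(fences, Point{2, 2})
	fmt.Println(name, ok)
	_, ok = Lookup(fences, Point{5, 2})
	fmt.Println(ok)
}
```

Each test costs time proportional to the number of polygon vertices, and a naive scan repeats it for every fence, which is why this kind of workload is bound by CPU rather than I/O.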

What were the performance metrics achieved by the geofence service?

The service handled a peak load of 170,000 queries per second with a response time of less than 5 milliseconds at the 95th percentile and under 50 milliseconds at the 99th percentile. This performance was achieved with 40 machines running at 35% CPU usage.

Key Statistics & Figures

Peak load handled

170,000 QPS

Achieved during peak times with 40 machines at 35% CPU usage.

Response time at 95th percentile

less than 5 ms

Indicates the service's efficiency in handling queries.

Response time at 99th percentile

under 50 ms

Demonstrates the service's reliability under heavy load.

Service uptime

99.99%

Reflects the service's reliability since inception.

Technologies & Tools

Backend

Go

Used to build the geofence lookup service due to its performance in handling CPU-intensive tasks.

Key Actionable Insights

1. Consider using Go for services that require high throughput and low latency.
   Go's performance characteristics make it suitable for CPU-intensive applications, such as geofence lookups, where quick response times are critical.

2. Implement a stateless architecture to enhance scalability and maintainability.
   Stateless services can handle requests from any instance, simplifying deployment and scaling processes, especially in cloud environments.

3. Utilize deterministic polling for data synchronization across service instances.
   This approach ensures that all instances have up-to-date data without complex synchronization mechanisms, leading to consistent query results.

Common Pitfalls

1. Overcomplicating the architecture with unnecessary dependencies.
   Keeping the service stateless and simple allows for easier scaling and maintenance, avoiding the pitfalls of complex architectures.

2. Neglecting the performance implications of using the wrong programming language.
   Choosing a language like Node.js for CPU-intensive tasks can lead to performance bottlenecks, as seen in the comparison with Go.
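The second pitfall comes down to whether CPU-bound work can use more than one core. As a hedged illustration, the sketch below splits a batch of CPU-bound checks across goroutines, which the Go runtime can schedule on multiple cores. `ParallelCount` and the predicate are hypothetical stand-ins for real lookup work, not code from the service.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// ParallelCount counts the items that satisfy pred, splitting the slice
// into contiguous chunks that are processed by separate goroutines.
func ParallelCount(items []int, pred func(int) bool, workers int) int {
	if workers < 1 {
		workers = 1
	}
	partial := make([]int, workers) // one slot per worker, so no locking is needed
	chunk := (len(items) + workers - 1) / workers

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		lo := w * chunk
		if lo >= len(items) {
			break
		}
		hi := lo + chunk
		if hi > len(items) {
			hi = len(items)
		}
		wg.Add(1)
		go func(w int, part []int) {
			defer wg.Done()
			for _, v := range part {
				if pred(v) {
					partial[w]++
				}
			}
		}(w, items[lo:hi])
	}
	wg.Wait()

	total := 0
	for _, c := range partial {
		total += c
	}
	return total
}

func main() {
	items := make([]int, 1000)
	for i := range items {
		items[i] = i
	}
	isEven := func(v int) bool { return v%2 == 0 } // stand-in for a costly check
	fmt.Println(ParallelCount(items, isEven, runtime.NumCPU()))
}
```

In a Go HTTP server the same effect usually comes for free, since each request is served in its own goroutine; the explicit chunking here only makes the use of multiple cores visible.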

Related Concepts

Microservices Architecture

Geofencing

Concurrency in Go

Performance Optimization Techniques