Cadence Multi-Tenant Task Processing

Yichao Yang
14 min read · Advanced

Overview

This article covers the challenges of multi-tenant task processing in Cadence, the workflow orchestration framework used at Uber, and the solutions adopted to address them. Two changes are highlighted: host-level priority task processing and multi-queue/cursor processing, which together improve resource isolation and absorb bursty workloads.

What You'll Learn

  1. How to implement host-level priority task processing in Cadence
  2. Why multi-queue processing is essential for resource isolation
  3. When to apply split policies for task management

Prerequisites & Requirements

  • Understanding of multi-tenancy and task processing concepts
  • Experience with orchestration frameworks like Cadence (optional)

Key Questions Answered

What are the main challenges of multi-tenant task processing in Cadence?
The main challenges include resource contention due to bursty workloads from one customer affecting others, and the inefficiency of processing tasks in the order they are generated, which can lead to delays for other customers. These issues necessitated a redesign of the task processing logic.
How does host-level priority task processing improve resource management?
Host-level priority task processing allows tasks from different shards to be processed by a shared worker pool, reducing the total number of workers needed. This approach helps prevent database overload during peak loads and ensures fair resource distribution among customers.
What is the purpose of the split policy in multi-queue processing?
The split policy governs when to create separate queues for different domains based on the number of buffered tasks. It helps manage memory pressure and ensures that tasks are processed efficiently without overloading the system.
What results were achieved with the new task processing logic?
The new task processing logic reduced the number of worker goroutines from 16,000 to 100, a reduction of over 95%, while handling the same load. This prevents database overload during peak times and improves overall system stability.
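
To make the shared worker pool from the answers above concrete, here is a minimal Go sketch. It is not Cadence's actual implementation: the `Task`, `HostScheduler`, `Submit`, `Next`, and `Drain` names are hypothetical, and strict priority ordering is an assumed simplification (a production scheduler would likely need some form of weighting so low-priority work is not starved). What it shows is the core idea: tasks from every shard on a host flow into one scheduler, and a small fixed set of workers serves them in priority order.

```go
package main

import (
	"fmt"
	"sync"
)

// Task is a hypothetical unit of work; the fields are illustrative only.
type Task struct {
	ShardID  int
	Domain   string
	Priority int // 0 is the highest priority
}

// HostScheduler is shared by every shard on a host (assumed design).
type HostScheduler struct {
	mu     sync.Mutex
	queues [][]Task // one FIFO queue per priority level
}

// NewHostScheduler creates a scheduler with at least one priority level.
func NewHostScheduler(levels int) *HostScheduler {
	if levels < 1 {
		levels = 1
	}
	return &HostScheduler{queues: make([][]Task, levels)}
}

// Submit enqueues a task, clamping its priority to the nearest valid level.
func (s *HostScheduler) Submit(t Task) {
	s.mu.Lock()
	defer s.mu.Unlock()
	p := t.Priority
	if p < 0 {
		p = 0
	}
	if p >= len(s.queues) {
		p = len(s.queues) - 1
	}
	s.queues[p] = append(s.queues[p], t)
}

// Next pops the oldest task from the highest-priority non-empty queue.
func (s *HostScheduler) Next() (Task, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	for p := range s.queues {
		if len(s.queues[p]) > 0 {
			t := s.queues[p][0]
			s.queues[p] = s.queues[p][1:]
			return t, true
		}
	}
	return Task{}, false
}

// Drain runs a fixed number of workers until no tasks remain and
// returns how many tasks were processed.
func (s *HostScheduler) Drain(workers int, process func(Task)) int {
	var wg sync.WaitGroup
	var mu sync.Mutex
	done := 0
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for {
				t, ok := s.Next()
				if !ok {
					return
				}
				process(t)
				mu.Lock()
				done++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return done
}

func main() {
	s := NewHostScheduler(2)
	// A bursty domain floods shard 1 with low-priority tasks.
	for i := 0; i < 5; i++ {
		s.Submit(Task{ShardID: 1, Domain: "bursty", Priority: 1})
	}
	// A quiet domain on shard 2 submits a single high-priority task.
	s.Submit(Task{ShardID: 2, Domain: "quiet", Priority: 0})

	n := s.Drain(2, func(t Task) {
		fmt.Printf("shard=%d domain=%s priority=%d\n", t.ShardID, t.Domain, t.Priority)
	})
	fmt.Println("processed:", n)
}
```

Because the worker count is fixed per host rather than per shard, adding shards does not add goroutines, which is the property behind the reduction reported above. In this sketch the workers exit once the queues are empty; a long-running service would block and wait for new tasks instead.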

Key Statistics & Figures

Reduction in worker goroutines
16,000 to 100
This reduction was achieved while handling the same load, ensuring database stability during peak times.
Percentage reduction in worker resources
over 95%
This significant decrease helps prevent database overload during bursts of activity.

Technologies & Tools

Orchestration Framework
Cadence
Used for managing workflows and task processing at Uber.

Key Actionable Insights

  1. Implement host-level priority task processing to optimize resource allocation across multiple tenants.
     This approach allows for efficient task management and prevents resource contention, particularly during peak loads, ensuring that all customers receive fair treatment.
  2. Utilize split policies to manage task queues effectively and prevent memory issues.
     By monitoring the number of buffered tasks and creating new queues as needed, you can maintain system performance and avoid bottlenecks.
  3. Regularly review and adjust task processing thresholds based on observed production traffic.
     Tuning these parameters can lead to significant performance improvements and better resource utilization, especially in a multi-tenant environment.
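
The split-policy insight above can be illustrated with a short Go sketch. It rests on stated assumptions rather than Cadence's real code: `Queue`, `Task`, and `SplitByPendingTasks` are hypothetical names, the threshold is an arbitrary tunable, and moving a split-out domain to `Level+1` is one possible way to represent "a separate queue". The function counts buffered tasks per domain and gives any domain above the threshold its own queue, leaving the remaining domains together.

```go
package main

import (
	"fmt"
	"sort"
)

// Task is a hypothetical buffered task tagged with its owning domain.
type Task struct {
	Domain string
	ID     int
}

// Queue is a hypothetical processing queue; Level is an assumed field
// that distinguishes the original queue from queues split out of it.
type Queue struct {
	Level int
	Tasks []Task
}

// SplitByPendingTasks moves every domain with more than threshold
// buffered tasks into its own queue at Level+1. The first queue in the
// result holds the tasks of all remaining domains. If no domain
// exceeds the threshold, the input queue is returned unchanged.
func SplitByPendingTasks(q Queue, threshold int) []Queue {
	counts := make(map[string]int)
	for _, t := range q.Tasks {
		counts[t.Domain]++
	}

	var noisy []string
	for domain, n := range counts {
		if n > threshold {
			noisy = append(noisy, domain)
		}
	}
	if len(noisy) == 0 {
		return []Queue{q}
	}
	sort.Strings(noisy) // deterministic output order

	split := make(map[string]*Queue, len(noisy))
	for _, domain := range noisy {
		split[domain] = &Queue{Level: q.Level + 1}
	}

	rest := Queue{Level: q.Level}
	for _, t := range q.Tasks {
		if sq, ok := split[t.Domain]; ok {
			sq.Tasks = append(sq.Tasks, t)
		} else {
			rest.Tasks = append(rest.Tasks, t)
		}
	}

	out := []Queue{rest}
	for _, domain := range noisy {
		out = append(out, *split[domain])
	}
	return out
}

func main() {
	q := Queue{Level: 0}
	for i := 0; i < 5; i++ {
		q.Tasks = append(q.Tasks, Task{Domain: "bursty", ID: i})
	}
	q.Tasks = append(q.Tasks, Task{Domain: "quiet", ID: 100})

	for _, part := range SplitByPendingTasks(q, 3) {
		fmt.Printf("level=%d tasks=%d\n", part.Level, len(part.Tasks))
	}
}
```

The threshold is the parameter that the third insight suggests tuning against production traffic: set too low, it creates many small queues; set too high, a bursty domain keeps blocking its neighbors.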

Common Pitfalls

  1. Failing to adequately tune the task processing parameters can lead to inefficiencies.
     Without proper tuning, systems may experience high latency or overload, especially under burst loads. Regular monitoring and adjustments based on traffic patterns are essential.
  2. Ignoring the impact of bursty workloads on resource allocation can disrupt service.
     When one customer generates a large number of tasks, it can block others. Implementing resource isolation strategies is crucial to maintain service levels.

Related Concepts

Multi-Tenancy in Software Architecture
Task Processing Frameworks
Resource Management Strategies