PinCompute: A Kubernetes Backed General Purpose Compute Platform for Pinterest

Pinterest Engineering

Overview

PinCompute is Pinterest's Kubernetes-backed general-purpose compute platform. It abstracts infrastructure complexity so that developers can focus on application-specific needs, and it delivers scalability and cost savings through its scheduling and resource management.

What You'll Learn

  1. How to leverage PinCompute APIs for workload management
  2. Why multi-tenancy is crucial for resource efficiency in cloud platforms
  3. How to implement application auto-scaling using PinScaler

Prerequisites & Requirements

  • Understanding of Kubernetes and cloud-native principles
  • Familiarity with API usage and management (optional)

Key Questions Answered

What is PinCompute and how does it improve application management?
PinCompute is a Kubernetes-based compute platform that hides infrastructure details behind a common abstraction, so developers can concentrate on application-specific needs. Its scheduling and resource management provide the scalability and cost savings.
How does PinCompute handle workload orchestration?
PinCompute uses a federation control plane to manage workloads across multiple member Kubernetes clusters. It performs tasks such as quota enforcement, workload sharding, and member cluster selection to ensure efficient workload execution.
What are the key primitives introduced by PinCompute?
PinCompute introduces three key primitives: PinPod for general-purpose compute, PinApp for managing long-running applications, and PinScaler for application auto-scaling. These primitives enhance the platform's ability to handle diverse workloads efficiently.
What resource tiers does PinCompute support?
PinCompute supports three resource tiers: Reserved, OnDemand, and Preemptible. Users can define resource quotas for each tier, allowing for flexible resource management based on workload requirements.
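
The workload sharding and member cluster selection mentioned in the answers above can be pictured with a small sketch. The text does not describe PinCompute's actual placement logic, so everything here is an assumption: the `MemberCluster` type, the `place_shards` function, the CPU-only capacity model, and the greedy "most free capacity first" rule are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MemberCluster:
    name: str
    free_cpu: float  # unallocated CPU cores (hypothetical capacity model)


def place_shards(replicas: int, cpu_per_replica: float,
                 clusters: list[MemberCluster], max_shard: int) -> dict[str, int]:
    """Split a workload into shards and pick a member cluster for each one.

    Illustrative greedy rule, not PinCompute's algorithm: every shard goes
    to the cluster that currently has the most free CPU and can still fit it.
    """
    free = {c.name: c.free_cpu for c in clusters}
    placement: dict[str, int] = {}
    remaining = replicas
    while remaining > 0:
        shard = min(max_shard, remaining)
        need = shard * cpu_per_replica
        candidates = [name for name, cpu in free.items() if cpu >= need]
        if not candidates:
            raise RuntimeError("no member cluster can fit the next shard")
        best = max(candidates, key=lambda name: free[name])
        free[best] -= need
        placement[best] = placement.get(best, 0) + shard
        remaining -= shard
    return placement


clusters = [MemberCluster("cluster-a", 100.0), MemberCluster("cluster-b", 70.0)]
print(place_shards(replicas=50, cpu_per_replica=2.0,
                   clusters=clusters, max_shard=20))
# {'cluster-a': 30, 'cluster-b': 20}
```

A real federation layer would also weigh the resource tier, zone, and hardware type, and would enforce quota before placement; the sketch only shows the shape of the decision.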

Key Statistics & Figures

Kubernetes cluster capacity
3000 nodes, 120k pods
Each PinCompute Kubernetes cluster is optimized to run at this scale to meet Pinterest's compute requirements.
API availability
99.9%
PinCompute ensures this availability for its critical workload-orchestration APIs.
P99 workload end-to-end launch latency
25 seconds
This is the target P99 latency for launching a workload end to end on the platform.
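
To put the figures above in perspective, the following sketch derives two numbers from them and shows one way a P99 could be checked against the 25-second target. The 30-day window, the nearest-rank percentile method, and the helper names are assumptions made for illustration; the text does not say how Pinterest measures either quantity.

```python
NODES_PER_CLUSTER = 3000
PODS_PER_CLUSTER = 120_000
API_AVAILABILITY = 0.999
P99_LAUNCH_TARGET_SECONDS = 25

# Average pod density implied by the cluster capacity figures.
pods_per_node = PODS_PER_CLUSTER / NODES_PER_CLUSTER  # 40.0


def downtime_budget_minutes(availability: float, days: int = 30) -> float:
    """Minutes of allowed unavailability over a window of `days` days."""
    return (1 - availability) * days * 24 * 60


def p99(samples: list[float]) -> float:
    """Nearest-rank 99th percentile (one common definition among several)."""
    ordered = sorted(samples)
    rank = (99 * len(ordered) + 99) // 100  # integer ceil(0.99 * n)
    return ordered[rank - 1]


print(pods_per_node)                                        # 40.0
print(round(downtime_budget_minutes(API_AVAILABILITY), 1))  # 43.2

# Hypothetical launch latencies in seconds, for illustration only.
latencies = [float(s) for s in range(1, 101)]
print(p99(latencies) <= P99_LAUNCH_TARGET_SECONDS)          # False
```

In other words, 99.9% availability leaves roughly 43 minutes of API unavailability per 30 days, and the cluster capacity works out to an average of 40 pods per node.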

Technologies & Tools

Orchestration
Kubernetes
Used as the foundational technology for managing containerized applications.
Scaling
PinScaler
Facilitates application auto-scaling based on defined metrics.

Key Actionable Insights

1. Utilize PinCompute's workload APIs to streamline application deployment processes.
   By leveraging these APIs, developers can automate workload management tasks, reducing manual overhead and improving deployment efficiency.
2. Implement PinScaler for dynamic application scaling based on real-time metrics.
   This ensures that applications can adapt to varying loads, optimizing resource usage and maintaining performance during peak times.
3. Adopt multi-tenancy practices to enhance resource utilization across teams.
   This approach minimizes resource wastage and promotes cost efficiency, allowing multiple teams to share the same infrastructure securely.
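
The second insight above, metric-driven scaling, can be illustrated with a short calculation. PinScaler's own algorithm is not described in the text, so this sketch borrows the proportional rule documented for the Kubernetes Horizontal Pod Autoscaler, `ceil(currentReplicas * currentMetric / targetMetric)`, and clamps the result to configured bounds. The function name and parameters are hypothetical.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, min_replicas: int,
                     max_replicas: int) -> int:
    """Proportional scaling rule in the style of the Kubernetes HPA.

    Assumption: PinScaler may use a different rule; this only shows how a
    metric reading can be turned into a replica count.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# 10 replicas at 90% CPU with a 60% target: scale out to 15.
print(desired_replicas(10, 90, 60, min_replicas=2, max_replicas=50))  # 15
# Load drops to 20%: the raw answer is 4, but the floor of 5 applies.
print(desired_replicas(10, 20, 60, min_replicas=5, max_replicas=50))  # 5
```

The lower and upper bounds matter as much as the formula: they keep a noisy metric from scaling an application to zero or past its quota.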

Common Pitfalls

1. Failing to properly configure resource quotas can lead to inefficient resource utilization.
   Without adequate quotas, workloads may compete for limited resources, resulting in performance degradation and increased costs.
2. Neglecting tenant isolation in a multi-tenant platform can lead to security vulnerabilities.
   If not properly managed, shared resources can expose sensitive data between different teams or applications.
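
The first pitfall above is easier to reason about with a concrete admission check. The sketch below models a per-tier quota for the three tiers named earlier (Reserved, OnDemand, Preemptible). The `TierQuota` and `ProjectQuota` classes, the CPU-only accounting, and the rule that a tier with no configured quota rejects everything are assumptions for illustration, not PinCompute's API.

```python
from dataclasses import dataclass, field

TIERS = ("Reserved", "OnDemand", "Preemptible")


@dataclass
class TierQuota:
    limit_cpu: float       # CPU cores granted in this tier (hypothetical unit)
    used_cpu: float = 0.0  # CPU cores already admitted


@dataclass
class ProjectQuota:
    tiers: dict[str, TierQuota] = field(default_factory=dict)

    def admit(self, tier: str, cpu: float) -> bool:
        """Admit a request only if it fits within the tier's remaining quota."""
        if tier not in TIERS:
            raise ValueError(f"unknown tier: {tier}")
        quota = self.tiers.get(tier)
        if quota is None or quota.used_cpu + cpu > quota.limit_cpu:
            return False
        quota.used_cpu += cpu
        return True


project = ProjectQuota({
    "Reserved": TierQuota(limit_cpu=100),
    "Preemptible": TierQuota(limit_cpu=50),
})
print(project.admit("Reserved", 80))     # True
print(project.admit("Reserved", 30))     # False: would exceed the 100-core limit
print(project.admit("Preemptible", 30))  # True
print(project.admit("OnDemand", 1))      # False: no quota configured for this tier
```

Rejecting at admission time, rather than letting workloads contend at runtime, is what keeps one team's misconfiguration from degrading another team's workloads.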

Related Concepts

Cloud-native Architecture
Microservices
Platform as a Service (PaaS)
Workload Orchestration