This blog discusses the technical details of how we built Shepherd and how we are expanding the capabilities of Chronon to meet Stripe’s scale.
Overview
The article discusses how Stripe adapted Airbnb's Chronon platform to create Shepherd, a next-generation ML feature engineering platform that enhances the development and deployment of ML models at scale. It highlights the challenges of feature engineering in a high-volume environment and details the technical adaptations made to Chronon to meet Stripe's specific requirements.
What You'll Learn
How to adapt an existing ML feature engineering platform for large-scale applications
Why maintaining low latency and feature freshness is crucial in ML model deployment
How to implement a dual KV store for cost-efficient data management
When to use streaming platforms like Flink for low latency feature updates
Prerequisites & Requirements
- Understanding of ML feature engineering concepts
- Familiarity with Python and SQL
- Experience with data processing frameworks like Spark(optional)
Key Questions Answered
How did Stripe adapt Chronon for its ML feature engineering needs?
What are the latency and feature freshness requirements for ML models at Stripe?
What was the outcome of using Shepherd for fraud detection at Stripe?
What challenges did Stripe face in feature engineering at scale?
Key Statistics & Figures
Technologies & Tools
Some links below are affiliate links. We may earn a commission if you make a purchase.
Key Actionable Insights
1Implement a dual KV store to optimize data management for ML features.By splitting the KV store into a lower-cost bulk upload store and a higher-cost distributed store, you can balance cost and performance, ensuring that your ML models can access data quickly without incurring excessive storage costs.
2Utilize Flink for low latency streaming updates in your ML pipelines.Flink's stateful processing capabilities allow for efficient handling of feature updates, which is essential for applications that require real-time data processing, such as fraud detection.
3Regularly benchmark your feature engineering platform to ensure scalability.As your datasets grow, it's crucial to verify that your algorithms can handle increased loads without performance degradation. This proactive approach helps maintain the efficiency of your ML operations.