Re-architecting Slack’s Workspace Preferences: How to Move to an EAV Model to Support Scalability

Scaling is hard. Design decisions that initially seemed reasonable break down with little warning, and suddenly even the simplest parts of your data model need to go through a complex re-architecture. We’re tackling this problem at Slack. A lot of our early design decisions made sense for small workspaces, but can be inefficient for large…

Alisha Ukani
11 min read · advanced

Overview

This article discusses the re-architecture of Slack's workspace preferences by transitioning to an Entity/Attribute/Value (EAV) model to enhance scalability. It highlights the challenges faced with the existing JSON blob storage method and details the steps taken to implement the new data model.

What You'll Learn

1. How to re-architect a data model using the EAV pattern
2. Why caching strategies need to adapt with changing data models
3. When to use double writes for data migration

Prerequisites & Requirements

  • Understanding of database design principles
  • Familiarity with Vitess for database management (optional)
  • Experience with data migration techniques

Key Questions Answered

What issues arise from using a large JSON blob for workspace preferences?
Storing all workspace preferences in one large JSON blob is inefficient: reading a single preference means querying the workspaces table and loading the entire blob. At scale this extra load can overwhelm the database and reduce reliability, especially when one request issues separate queries for several preferences.
How does the EAV model improve data management for workspace preferences?
The EAV model allows workspace preferences to be stored as individual rows, which prevents unnecessary data retrieval and simplifies the addition of new preferences without altering the table structure. This enhances scalability and performance as Slack grows.
What steps were taken to migrate existing workspace preferences to the new model?
The migration involved creating a new EAV table, implementing double writes during updates to ensure data consistency, and running backfill scripts to populate the new table with existing preferences. This ensured a smooth transition without data loss.
What challenges were encountered during the re-architecture process?
Challenges included ensuring data consistency during migration, managing cache invalidation effectively, and addressing bugs that arose from the separation of workspace preferences from the main workspace object. These issues required careful planning and testing.
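
To make the contrast between the two storage models concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is an illustration under stated assumptions, not Slack's schema: the `workspace_prefs` table, its column names, and the preference names are hypothetical, and values are stored JSON-encoded purely for simplicity.

```python
import json
import sqlite3

# Hypothetical schema: the old blob column next to a new EAV table.
SCHEMA = """
CREATE TABLE workspaces (
    id    INTEGER PRIMARY KEY,
    prefs TEXT NOT NULL              -- old model: one JSON blob per workspace
);
CREATE TABLE workspace_prefs (
    workspace_id INTEGER NOT NULL,   -- entity
    name         TEXT NOT NULL,      -- attribute
    value        TEXT NOT NULL,      -- value (JSON-encoded in this sketch)
    PRIMARY KEY (workspace_id, name)
);
"""

def get_pref_from_blob(conn, workspace_id, name):
    """Old path: fetch and parse the whole blob to read one preference."""
    row = conn.execute(
        "SELECT prefs FROM workspaces WHERE id = ?", (workspace_id,)
    ).fetchone()
    if row is None:
        return None
    return json.loads(row[0]).get(name)

def get_pref_from_eav(conn, workspace_id, name):
    """New path: fetch only the single row for this preference."""
    row = conn.execute(
        "SELECT value FROM workspace_prefs WHERE workspace_id = ? AND name = ?",
        (workspace_id, name),
    ).fetchone()
    return json.loads(row[0]) if row else None

def set_pref_eav(conn, workspace_id, name, value):
    """A new preference is just a new row; no schema change is needed."""
    conn.execute(
        "INSERT OR REPLACE INTO workspace_prefs (workspace_id, name, value) "
        "VALUES (?, ?, ?)",
        (workspace_id, name, json.dumps(value)),
    )
```

With the EAV table, the cost of a read depends on the one preference requested rather than on the size of the whole blob, which is the property the answers above describe.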

Key Statistics & Figures

Number of workspace preferences offered: 165
  This number reflects the growing customization options available to users in Slack.
Number of paid workspaces utilizing the new model: 70,000+
  This indicates the scale at which the new EAV model is being applied within Slack's infrastructure.
Size of workspace preference JSON blob: larger than 55 kB
  This size can negatively impact performance when frequently accessed.

Key Actionable Insights

1. Implement an EAV model when dealing with a growing number of attributes in a database to maintain performance and scalability.
   This approach allows for flexibility in adding new attributes without requiring schema changes, which can be costly and time-consuming.
2. Utilize double writes during data migration to ensure data integrity across old and new systems.
   This method helps prevent data loss and inconsistencies, making the transition smoother for users.
3. Adopt a gradual rollout strategy, such as 'dark mode' reads, to test new implementations without disrupting existing functionality.
   This allows for identifying and fixing issues in real-time while ensuring that users experience minimal disruption.
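
One way to picture a 'dark mode' read is sketched below. It assumes the term means that the old store keeps serving every request while the new store is read in the shadow and compared. The function names and callback signatures are hypothetical, and plain callables stand in for the real data access layer.

```python
from typing import Any, Callable

# (workspace_id, preference name) -> value
Reader = Callable[[int, str], Any]
# (workspace_id, preference name, old value, new value or exception) -> None
MismatchHandler = Callable[[int, str, Any, Any], None]

def make_dark_reader(
    old_read: Reader,
    new_read: Reader,
    on_mismatch: MismatchHandler,
    enabled: bool = True,
) -> Reader:
    """Wrap two read paths so callers always receive the old path's answer."""

    def read(workspace_id: int, name: str) -> Any:
        old_value = old_read(workspace_id, name)
        if enabled:
            try:
                new_value = new_read(workspace_id, name)
                if new_value != old_value:
                    on_mismatch(workspace_id, name, old_value, new_value)
            except Exception as exc:
                # A failure on the new path is reported, never surfaced to callers.
                on_mismatch(workspace_id, name, old_value, exc)
        return old_value

    return read
```

Because the wrapper always returns the old value, a bug in the new path shows up only in the mismatch handler (for example a log or a metric), not in what users see. Flipping `enabled` off removes the extra read entirely.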

Common Pitfalls

1. Failing to account for data consistency during migration can lead to discrepancies between old and new data models.
   This often occurs when updates are made to the old model without corresponding updates to the new model, which can confuse users and lead to data integrity issues.
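
The pitfall above is what double writes and a backfill are meant to prevent. The following is a minimal single-database sketch using sqlite3. The table and column names are hypothetical, and wrapping both writes in one transaction is an assumption that may not hold in a sharded deployment.

```python
import json
import sqlite3

# Hypothetical schema: old blob column plus the new per-preference table.
SCHEMA = """
CREATE TABLE workspaces (id INTEGER PRIMARY KEY, prefs TEXT NOT NULL);
CREATE TABLE workspace_prefs (
    workspace_id INTEGER NOT NULL,
    name         TEXT NOT NULL,
    value        TEXT NOT NULL,
    PRIMARY KEY (workspace_id, name)
);
"""

def set_pref_double_write(conn, workspace_id, name, value):
    """Apply every update to both the old blob and the new table."""
    with conn:  # commits on success, rolls back if either write fails
        row = conn.execute(
            "SELECT prefs FROM workspaces WHERE id = ?", (workspace_id,)
        ).fetchone()
        prefs = json.loads(row[0]) if row else {}
        prefs[name] = value
        conn.execute(
            "INSERT OR REPLACE INTO workspaces (id, prefs) VALUES (?, ?)",
            (workspace_id, json.dumps(prefs)),
        )
        conn.execute(
            "INSERT OR REPLACE INTO workspace_prefs (workspace_id, name, value) "
            "VALUES (?, ?, ?)",
            (workspace_id, name, json.dumps(value)),
        )

def backfill(conn):
    """Copy existing blob preferences into the new table."""
    with conn:
        rows = conn.execute("SELECT id, prefs FROM workspaces").fetchall()
        for workspace_id, blob in rows:
            for name, value in json.loads(blob).items():
                # OR IGNORE: never overwrite a row a double write already created.
                conn.execute(
                    "INSERT OR IGNORE INTO workspace_prefs "
                    "(workspace_id, name, value) VALUES (?, ?, ?)",
                    (workspace_id, name, json.dumps(value)),
                )
```

The backfill uses `INSERT OR IGNORE` so that it can run while double writes are live: a preference already written through the new path is left alone instead of being reset to an older value taken from the blob.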

Related Concepts

Database Design Principles
Data Migration Techniques
Caching Strategies