Augmented Commerce: Machine Learning at Shopify

Commerce is a dynamic ecosystem, and our mission is to empower every merchant to succeed in it. We optimize each step of the merchant journey, from product creation to customer delivery, using advanced tools, infrastructure, and partnerships to solve what is ultimately a complex optimization problem.

Javier Moreno

Overview

This article describes how Shopify uses machine learning to improve the commerce experience for its merchants. It outlines the optimization problems Shopify addresses with ML, the technologies involved, and the team culture that drives innovation.

What You'll Learn

1. How to classify and enrich product metadata using Qwen multimodal models
2. Why fast risk models are essential for assessing transaction fraud
3. How to forecast merchant GMV using tabular transformer models
4. When to utilize sequence-based foundational models for understanding merchant behavior

Key Questions Answered

What machine learning models does Shopify use for product classification?
Shopify uses fine-tuned Qwen multimodal models to classify and enrich the metadata of products uploaded into its system. This process involves hundreds of millions of inferences daily and helps merchants present their products better.
How does Shopify assess fraud in transactions?
Shopify employs fast risk models to evaluate the fraud risk of every transaction. It draws inspiration from research by Feature Space to strengthen its fraud detection, making commerce safer for merchants and customers.
What technologies does Shopify use for machine learning infrastructure?
Shopify partners with GCP as its main infrastructure provider and collaborates with neo-cloud providers like Nebius to access large training clusters. This setup allows for rapid iteration and experimentation in machine learning projects.
What is the purpose of the Sidekick assistant at Shopify?
Sidekick is a multi-purpose merchant assistant built using a combination of fine-tuned Llama models and large general models. It helps merchants use Shopify's features to their fullest potential.

Key Statistics & Figures

Daily inferences for product metadata classification
hundreds of millions
This volume reflects the scale at which Shopify operates and the complexity of the data involved.
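
To put that volume in perspective, it helps to convert a daily count into an average per-second rate. The sketch below assumes load is spread evenly across the day, which real traffic is not, and uses illustrative daily volumes because the source only says "hundreds of millions". The function name `average_rate_per_second` is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_rate_per_second(inferences_per_day: int) -> float:
    """Average sustained inferences per second, assuming uniform load."""
    return inferences_per_day / SECONDS_PER_DAY


# Illustrative volumes only; the article does not give an exact figure.
for daily in (100_000_000, 300_000_000, 500_000_000):
    rate = average_rate_per_second(daily)
    print(f"{daily:,} per day is about {rate:,.0f} per second")
```

Even the low end of that range works out to more than a thousand inferences per second on average, before accounting for traffic peaks.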

Technologies & Tools

Machine Learning
Qwen Multimodal Models: Used for classifying and enriching product metadata.
Llama Models: Utilized in building the Sidekick assistant for merchants.
Tabular Transformer Models: Developed for forecasting merchant GMV.
HSTU Architecture: Experimented with for understanding merchant and customer behavior.
Nomic Embeddings: Used for vector representations of products to enhance search and recommendations.
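
As a rough illustration of how vector representations support search and recommendations, the sketch below ranks catalog items by cosine similarity to a query vector. It is not Shopify's implementation: the three-dimensional vectors are toy stand-ins for real embeddings, no embedding model is called, and the names `cosine_similarity`, `top_k`, and `catalog` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query, catalog, k=2):
    """Return the k catalog entries most similar to the query vector."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in catalog.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy vectors standing in for product embeddings (assumption for illustration).
catalog = {
    "red running shoes": [0.9, 0.1, 0.0],
    "blue trail shoes": [0.8, 0.3, 0.1],
    "ceramic coffee mug": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]  # pretend this is the embedding of a shoe-related query

results = top_k(query, catalog, k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

At production scale, a brute-force scan like this would typically be replaced by an approximate nearest-neighbor index, but the ranking idea is the same.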

Key Actionable Insights

1. Investing in machine learning infrastructure is crucial for optimizing merchant success.
   By leveraging cloud partnerships and advanced models, Shopify can enhance its services and provide better tools for merchants to thrive in a competitive environment.
2. Utilizing multimodal models can significantly improve product metadata classification.
   This approach allows Shopify to process vast amounts of product data efficiently, enhancing the shopping experience for customers and increasing sales opportunities for merchants.
3. Implementing fast risk models can greatly reduce transaction fraud.
   By assessing fraud in real-time, Shopify protects both merchants and customers, fostering trust and security in the commerce ecosystem.
4. Understanding customer behavior through sequence-based models can inform better business strategies.
   By analyzing actions and objectives, Shopify can provide tailored recommendations that help merchants make informed decisions.
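
The real-time fraud assessment mentioned in the third insight depends on scoring being cheap per transaction. The sketch below shows one simple way that can look: a logistic score over a handful of features, followed by a threshold decision. This is an assumption for illustration, not the model Shopify or Feature Space uses; the feature names, weights, bias, and threshold are all hypothetical.

```python
import math

# Hypothetical weights; a real model would learn these from labeled data.
WEIGHTS = {"amount_zscore": 0.8, "new_device": 1.2, "address_mismatch": 1.5}
BIAS = -3.0


def risk_score(features):
    """Logistic risk score in (0, 1): one weighted sum plus a sigmoid."""
    z = BIAS + sum(WEIGHTS[name] * features.get(name, 0.0) for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def decide(features, threshold=0.5):
    """Flag the transaction for review when the score reaches the threshold."""
    return "review" if risk_score(features) >= threshold else "approve"


routine = {"amount_zscore": 0.1, "new_device": 0.0, "address_mismatch": 0.0}
unusual = {"amount_zscore": 2.0, "new_device": 1.0, "address_mismatch": 1.0}

print(decide(routine), round(risk_score(routine), 3))
print(decide(unusual), round(risk_score(unusual), 3))
```

Because the score is a single weighted sum and one exponential, it can run inline with checkout; heavier models usually need more careful latency budgeting.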

Common Pitfalls

1. Failing to leverage the full capabilities of machine learning tools can limit merchant success.
   Merchants who do not utilize available features may miss out on significant opportunities for growth and optimization.