Improving Pinterest Search Relevance Using Large Language Models

Pinterest Engineering


BERT, BLIP, Hugging Face, Large Language Models, Machine Learning, RoBERTa, Supervised Learning, T5

Overview

The article describes the Large Language Model (LLM)-based relevance system built for Pinterest Search, covering its technical design, model architecture, and the results of both offline and online experiments. It reports the improvements in search relevance and fulfillment rates achieved with this approach.

What You'll Learn

1. How to implement a cross-encoder language model for search relevance prediction

2. Why knowledge distillation is essential for scaling LLMs in production

3. How to enrich text features for improved relevance modeling

Prerequisites & Requirements

Understanding of machine learning concepts and model training
Familiarity with LLMs and their applications in search systems (optional)

Key Questions Answered

How does Pinterest improve search relevance using LLMs?

Pinterest enhances search relevance by implementing a cross-encoder language model that predicts the relevance of Pins to user queries. This model is fine-tuned with human-annotated data, allowing for a more accurate alignment of search results with user intent.
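To make the cross-encoder idea concrete, here is a minimal sketch of the two ends of such a model: how a query and a Pin's text could be joined into one input sequence, and how five output logits could be turned into a 5-scale relevance level. The field names (`title`, `description`, `synthetic_caption`), the `" [SEP] "` separator string, and the hand-written logits are assumptions for illustration; in a real system a tokenizer would build the paired input and the fine-tuned language model would produce the logits.

```python
import math


def build_pin_text(title, description, synthetic_caption):
    """Join the non-empty text fields of a Pin (hypothetical field names)."""
    fields = (title, description, synthetic_caption)
    parts = [f.strip() for f in fields if f and f.strip()]
    return " | ".join(parts)


def build_cross_encoder_input(query, pin_text, sep=" [SEP] "):
    """A cross-encoder reads the query and the Pin text as one sequence."""
    return f"{query.strip()}{sep}{pin_text}"


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_relevance_level(logits):
    """Map 5 logits (one per relevance level 1..5) to (level, probabilities)."""
    if len(logits) != 5:
        raise ValueError("expected one logit per relevance level (5 total)")
    probs = softmax(logits)
    best_index = max(range(len(probs)), key=probs.__getitem__)
    return best_index + 1, probs


# Example with made-up text and made-up logits.
pin_text = build_pin_text("Cozy fall outfit ideas", "", "a woman wearing a knit sweater")
model_input = build_cross_encoder_input("fall outfits", pin_text)
level, probs = predict_relevance_level([0.1, 0.2, 0.4, 1.5, 3.0])
```

Because the query and the Pin text pass through the model together, the model can attend across both, which is what separates a cross-encoder from a setup that embeds each side independently.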

What metrics were used to evaluate the effectiveness of the new relevance model?

The new relevance model was evaluated using nDCG@K and accuracy on 5-scale relevance predictions. It delivered a +2.18% improvement in search feed relevance, measured by nDCG@20, and increased fulfillment rates across various countries.
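For readers unfamiliar with nDCG@K, the sketch below computes it for a single ranked list of graded relevance labels. It uses the linear-gain form of DCG; some systems use an exponential gain (2^rel - 1) instead, and the source does not say which variant Pinterest uses, so treat the gain function as an assumption.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k items (linear gain)."""
    return sum(
        rel / math.log2(rank + 2)  # rank is 0-based, so position 1 divides by log2(2)
        for rank, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances, k):
    """DCG of the given ordering divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant items at all, so the score is defined as 0
    return dcg_at_k(relevances, k) / ideal


# Relevance labels listed in the order the search feed returned the items.
perfect = ndcg_at_k([5, 4, 3, 2, 1], k=5)
swapped = ndcg_at_k([1, 4, 3, 2, 5], k=5)
```

An ordering that already places the most relevant items first scores 1.0, and pushing relevant items further down lowers the score, which is why the metric suits ranked search feeds.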

What are the key features used in the student relevance model?

The student relevance model uses query-level features, Pin-level features, and query-Pin interaction features. These include embeddings from SearchSAGE and PinSAGE, as well as historical engagement rates, to enhance relevance predictions.
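One way to picture the three feature groups is as a single flat vector assembled per query-Pin pair. The sketch below is an illustration only: the function and argument names are hypothetical, the tiny vectors stand in for real embeddings, and using cosine similarity as an interaction feature is an assumption, not something the source states.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_student_features(query_emb, pin_emb, pin_content_emb, engagement_rate):
    """Concatenate query-level, Pin-level, and query-Pin interaction features.

    query_emb and pin_emb are assumed to live in the same embedding space
    (SearchSAGE-style); pin_content_emb stands in for a PinSAGE-style embedding.
    """
    query_features = list(query_emb)
    pin_features = list(pin_emb) + list(pin_content_emb)
    interaction_features = [
        cosine_similarity(query_emb, pin_emb),  # assumed interaction feature
        engagement_rate,                        # historical engagement for this pair
    ]
    return query_features + pin_features + interaction_features


features = build_student_features(
    query_emb=[1.0, 0.0],
    pin_emb=[1.0, 0.0],
    pin_content_emb=[0.5],
    engagement_rate=0.25,
)
```

A lightweight model can consume a vector like this at serving time, which is the role the student model plays in this design.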

Key Statistics & Figures

Improvement in search feed relevance

+2.18%

Measured by nDCG@20 after implementing the new relevance model.

Performance increase of Llama-3-8B over multilingual BERT-base

12.5%

This performance was measured in terms of 5-scale accuracy during model comparisons.

Performance increase of Llama-3-8B over the baseline model

19.7%

The baseline model relied solely on SearchSAGE embeddings.

Technologies & Tools

ML Model

BERT

Used as a baseline for the cross-encoder architecture.

ML Model

T5

Evaluated as one of the pre-trained language models for relevance prediction.

ML Model

Llama-3

Outperformed both multilingual BERT-base and the SearchSAGE-only baseline in relevance prediction.

ML Model

BLIP

Used for generating synthetic image captions.

ML Model

PinSAGE

Provides embeddings for Pins to enhance relevance modeling.

ML Model

SearchSAGE

Utilized for query and Pin embeddings.

Key Actionable Insights

1. Implementing a cross-encoder model can significantly enhance the accuracy of search relevance predictions. This approach allows for a more nuanced understanding of user queries and Pin content, which is essential in improving user satisfaction and engagement.

2. Utilizing knowledge distillation can help in scaling LLMs effectively for real-time applications. By distilling a larger model into a smaller, more efficient one, organizations can maintain high performance while reducing latency and operational costs.

3. Enriching text features with metadata and user engagement data can lead to better relevance modeling. Incorporating diverse data sources ensures that the model captures a comprehensive view of user intent, improving the overall search experience.
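The distillation idea in insight 2 can be sketched as a loss that pushes the student's predicted distribution over the five relevance levels toward the teacher's. This is the generic soft-label formulation of knowledge distillation, shown as an assumption; the source does not specify the exact loss or temperature used, and the logits below are made up.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax with a temperature; higher temperatures give softer distributions."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=1.0):
    """Cross-entropy of the student's prediction against the teacher's soft labels."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(
        t * math.log(max(s, 1e-12))  # clamp to avoid log(0)
        for t, s in zip(teacher_probs, student_probs)
    )


teacher = [0.2, 0.1, 0.5, 2.0, 3.5]        # large model's logits for one query-Pin pair
close_student = [0.1, 0.0, 0.6, 1.8, 3.2]  # student that roughly agrees with the teacher
far_student = [3.0, 1.0, 0.5, 0.2, 0.1]    # student that disagrees with the teacher

loss_close = distillation_loss(teacher, close_student)
loss_far = distillation_loss(teacher, far_student)
```

The loss is lowest when the student reproduces the teacher's distribution, so minimizing it over many teacher-labeled query-Pin pairs transfers the larger model's judgments to a model that is cheap enough to serve in real time.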

Common Pitfalls

1. Relying solely on historical engagement data can lead to biased relevance predictions. This occurs because engagement metrics may not accurately reflect the current relevance of content, especially as user interests evolve. It's crucial to incorporate diverse data sources to ensure the model remains aligned with user intent.

Related Concepts

Machine Learning

Large Language Models

Search Relevance Systems

Knowledge Distillation

Text Feature Engineering