Improving Retrieval on Ramp with Transaction Embeddings

How we used triplet loss and embeddings to improve accounting on Ramp.

Calix Huang, Anton Biryukov
8 min read · beginner

Overview

This article describes how Ramp improves transaction retrieval with transaction embeddings, which help automate finance tasks such as accounting coding. It covers how the embedding model is trained with triplet loss, the model's architecture, and how the embeddings are applied to improve the user experience and analytics.

What You'll Learn

1. How to use triplet loss for training embedding models
2. Why transaction embeddings improve accounting coding accuracy
3. How to represent transactions as embeddings for better retrieval

Prerequisites & Requirements

  • Understanding of machine learning concepts, particularly embedding models
  • Familiarity with the sentence_transformers library (optional)

Key Questions Answered

How does Ramp use transaction embeddings to improve accounting coding?
Ramp uses transaction embeddings to group similar transactions and predict general ledger (GL) categories for new transactions. A model trained with triplet loss learns to cluster transactions by GL category, which makes it easier for employees to code their transactions accurately.
What is triplet loss and how is it applied in this context?
Triplet loss is a loss function for supervised similarity learning that evaluates samples in triplets: an anchor, a positive sample (similar to the anchor), and a negative sample (dissimilar to it). Training pulls the anchor toward the positive and pushes it away from the negative in embedding space. In Ramp's case, this teaches the model to separate similar from dissimilar transactions, which improves the quality of the transaction embeddings.
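
To make the anchor/positive/negative idea concrete, here is a minimal sketch of a hinge-style triplet loss in plain Python. The Euclidean distance, the margin of 1.0, and the toy 2-D vectors are assumptions chosen for illustration; the article summary does not state which distance or margin Ramp uses.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss.

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D embeddings, made up for illustration.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]  # same GL category as the anchor
negative = [3.0, 4.0]  # different GL category

# Well-separated triplet: d_pos = 1, d_neg = 5, so the loss is 0.
easy = triplet_loss(anchor, positive, negative)

# Swapping the roles simulates a badly embedded triplet: the loss is 5.
hard = triplet_loss(anchor, negative, positive)
```

A loss of zero means the triplet contributes no gradient, which is why the choice of triplets matters as much as the loss itself.
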
What features are included in the transaction embeddings?
Transaction embeddings are enriched with features such as merchant name, merchant category name, department name, location name, amount, memo, spend program name, and trip name. These features help in accurately categorizing transactions for accounting purposes.
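
One plausible way to feed these features to a text embedding model is to flatten each transaction into a single string. The sketch below is an assumption, not Ramp's documented format: the `key: value` layout, the ` | ` separator, the field names in `FEATURES`, and the example transaction are all hypothetical.

```python
# Feature names mirror the list above; the exact keys are hypothetical.
FEATURES = [
    "merchant_name",
    "merchant_category_name",
    "department_name",
    "location_name",
    "amount",
    "memo",
    "spend_program_name",
    "trip_name",
]


def transaction_to_text(txn: dict) -> str:
    """Join the non-empty features into one string for a text embedding model."""
    parts = []
    for key in FEATURES:
        value = txn.get(key)
        if value is None or value == "":
            continue  # skip features this transaction does not have
        parts.append(f"{key.replace('_', ' ')}: {value}")
    return " | ".join(parts)


# Made-up transaction for illustration.
example = {
    "merchant_name": "Delta Air Lines",
    "merchant_category_name": "Airlines",
    "department_name": "Sales",
    "amount": 412.5,
    "memo": "Flight to customer onsite",
    "trip_name": "",
}
text = transaction_to_text(example)
```

Skipping empty fields keeps the string short and avoids teaching the model that a literal "None" carries meaning.
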
What are the applications of transaction embeddings at Ramp?
Transaction embeddings are used to surface GL coding suggestions in a dropdown for users, enhance analytics for growth and data teams, and synthesize relevant transactions for LLM-enabled features. This improves user interactions and operational efficiency.
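
As a rough illustration of how embeddings could drive a suggestion dropdown, the sketch below ranks GL categories by a vote among the nearest labeled transactions. The nearest-neighbor voting scheme, the function name `suggest_gl_categories`, the cosine similarity, and the toy vectors and category names are assumptions; the summary does not specify how Ramp turns embeddings into suggestions.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def suggest_gl_categories(query, labeled, k=3, top_n=2):
    """Rank GL categories by how often they occur among the k nearest neighbors.

    `labeled` is a list of (embedding, gl_category) pairs for transactions
    that have already been coded.
    """
    nearest = sorted(labeled, key=lambda item: cosine(query, item[0]), reverse=True)[:k]
    votes = Counter(category for _, category in nearest)
    return [category for category, _ in votes.most_common(top_n)]


# Toy 2-D embeddings with hypothetical GL categories.
labeled = [
    ([1.0, 0.0], "Travel"),
    ([0.9, 0.1], "Travel"),
    ([0.0, 1.0], "Software"),
    ([0.1, 0.9], "Software"),
    ([0.7, 0.7], "Meals"),
]
suggestions = suggest_gl_categories([0.95, 0.05], labeled)
```

In practice the linear scan would be replaced by a vector index once the number of coded transactions grows.
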

Technologies & Tools

sentence_transformers (library)
Used for training embedding models and computing dense vector representations of text.

Key Actionable Insights

1. Implement triplet loss in your embedding models to enhance similarity learning.
   Triplet loss trains the model to place similar items closer together than dissimilar ones, which is crucial for applications like transaction categorization.
2. Enrich your transaction data with contextual features to improve embedding relevance.
   Including features such as merchant name and category produces embeddings that improve retrieval accuracy and user experience.
3. Use the sentence_transformers library for efficient embedding training.
   The library provides a framework for training and fine-tuning embedding models, including a built-in triplet loss.
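
The insights above can be combined into a short fine-tuning sketch. It uses the classic `fit` training loop and `losses.TripletLoss` from sentence_transformers; the helper names `make_triplets` and `train`, the base model `all-MiniLM-L6-v2`, the batch size, and the sample texts are assumptions for illustration, not details from the article. The triplet builder is deliberately naive, which is the kind of sampling the Common Pitfalls section warns about.

```python
def make_triplets(texts_by_category):
    """Build (anchor, positive, negative) text triplets.

    Naive and deterministic: the positive is the next text in the same
    category, the negative is the first text of the following category.
    """
    categories = list(texts_by_category)
    triplets = []
    for i, category in enumerate(categories):
        texts = texts_by_category[category]
        other = texts_by_category[categories[(i + 1) % len(categories)]]
        for j in range(len(texts) - 1):
            triplets.append((texts[j], texts[j + 1], other[0]))
    return triplets


def train(triplets, model_name="all-MiniLM-L6-v2", epochs=1):
    """Fine-tune a sentence_transformers model with triplet loss."""
    # Imported lazily so make_triplets works without the library installed.
    from sentence_transformers import InputExample, SentenceTransformer, losses
    from torch.utils.data import DataLoader

    model = SentenceTransformer(model_name)  # assumed base model
    examples = [InputExample(texts=list(t)) for t in triplets]
    loader = DataLoader(examples, shuffle=True, batch_size=16)
    loss = losses.TripletLoss(model=model)
    model.fit(train_objectives=[(loader, loss)], epochs=epochs)
    return model


# Hypothetical serialized transactions grouped by GL category.
data = {
    "Travel": ["merchant name: Delta", "merchant name: United"],
    "Software": ["merchant name: GitHub", "merchant name: Figma"],
}
triplets = make_triplets(data)
# model = train(triplets)  # downloads the base model; run only when needed
```

After training, `model.encode(texts)` returns the dense vectors used for retrieval.
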

Common Pitfalls

1. Randomly mining triplet samples can lead to ineffective learning and a plateau in model performance.
   This occurs because the model may quickly learn to differentiate between broad categories without grasping the nuances of similar subcategories. To avoid this, implement informed sampling strategies.
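
Hard-negative mining is one common informed sampling strategy, sketched below as an assumption about what such a strategy could look like; the summary does not say which strategy Ramp adopted. For each anchor, it picks the closest sample with a different label, so the model is forced to separate look-alike subcategories. The function name, labels, and vectors are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def mine_hard_negative(anchor_idx, embeddings, labels):
    """Return the index of the closest sample whose label differs from the anchor's.

    Returns None when every sample shares the anchor's label.
    """
    candidates = [i for i, label in enumerate(labels) if label != labels[anchor_idx]]
    if not candidates:
        return None
    anchor = embeddings[anchor_idx]
    return min(candidates, key=lambda i: euclidean(anchor, embeddings[i]))


# Toy embeddings: index 2 is a near-miss subcategory, index 3 is an easy negative.
embeddings = [[0.0, 0.0], [0.1, 0.0], [0.5, 0.0], [5.0, 5.0]]
labels = ["Software: SaaS", "Software: SaaS", "Software: Hosting", "Travel"]

hard = mine_hard_negative(0, embeddings, labels)
```

A random sampler would pick the easy "Travel" negative half the time here, which yields a triplet the model already gets right and therefore little learning signal.
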