Risk Entity Watch – Using Anomaly Detection to Fight Fraud

Sergey Zelvenskiy, Becky Hui, Sahana Noru, Christopher Settles

Uber

Apache, Machine Learning

Overview

The article discusses Uber's Risk Entity Watch platform, which employs anomaly detection to combat fraud within its marketplace. It highlights the use of unsupervised machine learning models to identify potentially fraudulent entities and the importance of feature engineering and explainability in the fraud detection process.

What You'll Learn

1. How to implement unsupervised machine learning models for fraud detection
2. Why feature engineering is critical in anomaly detection systems
3. How to use the HAIFA algorithm for anomaly explanation

Prerequisites & Requirements

Understanding of machine learning concepts and anomaly detection
Familiarity with Python and data science libraries (optional)

Key Questions Answered

How does Risk Entity Watch detect fraudulent activities?

Risk Entity Watch uses unsupervised machine learning models to analyze unlabeled data sets and flag entities that may be engaging in fraud. Agents then review the flagged entities and decide on further action.
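To make the idea of flagging entities without labels concrete, here is a minimal sketch. It does not reproduce the models used in Risk Entity Watch, which this summary does not specify; it swaps in a simple modified z-score (median and median absolute deviation) as a stand-in unsupervised detector. The function name `flag_anomalies`, the threshold of 3.5, and the sample data are all assumptions made for illustration.

```python
from statistics import median


def flag_anomalies(values_by_entity, threshold=3.5):
    """Return entity ids whose value is a robust outlier.

    Uses the modified z-score (median / MAD), chosen here only as a
    simple illustration of label-free flagging, not as Uber's method.
    """
    values = list(values_by_entity.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread in the data, so nothing stands out
    flagged = []
    for entity, value in values_by_entity.items():
        modified_z = 0.6745 * (value - med) / mad
        if abs(modified_z) > threshold:
            flagged.append(entity)
    return flagged


# Hypothetical data: trips per day for six accounts.
trips_per_day = {
    "acct_a": 4, "acct_b": 5, "acct_c": 6,
    "acct_d": 5, "acct_e": 4, "acct_f": 60,
}
review_queue = flag_anomalies(trips_per_day)
print(review_queue)
```

In a workflow like the one described, the entities in `review_queue` would be handed to agents for manual review rather than acted on automatically.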

What role does feature engineering play in anomaly detection?

Feature engineering generates metrics for each entity involved in an event. These metrics are the basis for assessing risk and identifying anomalies. The platform provides a baseline set of metrics that can be computed for every entity.
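As a rough sketch of what per-entity metrics might look like, the snippet below aggregates raw events into one row per entity. The event fields (`entity_id`, `amount`, `device_id`) and the three metrics are hypothetical and are not the platform's actual baseline set.

```python
from collections import defaultdict


def entity_metrics(events):
    """Aggregate raw events into one dictionary of metrics per entity."""
    totals = defaultdict(
        lambda: {"event_count": 0, "total_amount": 0.0, "devices": set()}
    )
    for event in events:
        row = totals[event["entity_id"]]
        row["event_count"] += 1
        row["total_amount"] += event["amount"]
        row["devices"].add(event["device_id"])
    return {
        entity: {
            "event_count": row["event_count"],
            "avg_amount": row["total_amount"] / row["event_count"],
            "distinct_devices": len(row["devices"]),
        }
        for entity, row in totals.items()
    }


# Hypothetical events for two entities.
events = [
    {"entity_id": "u1", "amount": 10.0, "device_id": "d1"},
    {"entity_id": "u1", "amount": 30.0, "device_id": "d2"},
    {"entity_id": "u2", "amount": 5.0, "device_id": "d3"},
]
metrics = entity_metrics(events)
print(metrics["u1"])
```

Each resulting row can then serve as the feature vector for an anomaly detector.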

What is the HAIFA algorithm and how is it used?

The HAIFA algorithm, which stands for Histogram Analysis of Important Features for Anomalies, identifies features that contribute to anomalies by analyzing the distribution of feature values. It helps in explaining why certain observations are flagged as anomalous, enhancing the interpretability of the model.
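The summary does not give HAIFA's actual steps, so the following is not an implementation of it. It is a minimal sketch of the general idea described above: build a histogram per feature and treat a feature as a likely contributor when the flagged observation falls into a sparsely populated bin. The helper names, the bin count, and the sample data are assumptions.

```python
def bin_index(value, low, high, n_bins):
    """Map a value to an equal-width histogram bin, clamped to valid range."""
    if high == low:
        return 0
    idx = int((value - low) / (high - low) * n_bins)
    return min(max(idx, 0), n_bins - 1)


def rank_features_by_rarity(population, observation, n_bins=10):
    """Rank features from rarest to most common bin for the observation."""
    density = {}
    for feature, obs_value in observation.items():
        column = [row[feature] for row in population]
        low, high = min(column), max(column)
        counts = [0] * n_bins
        for v in column:
            counts[bin_index(v, low, high, n_bins)] += 1
        obs_bin = bin_index(obs_value, low, high, n_bins)
        # Share of the population that lands in the same bin.
        density[feature] = counts[obs_bin] / len(column)
    return sorted(density, key=density.get)


# Hypothetical population: the last row is unusual on "trips" only.
pairs = [(4, 10), (5, 12), (6, 11), (5, 13), (4, 12),
         (5, 11), (6, 10), (5, 12), (4, 13), (60, 12)]
population = [{"trips": t, "avg_fare": f} for t, f in pairs]
observation = population[-1]
ranked = rank_features_by_rarity(population, observation)
print(ranked)
```

The first feature in `ranked` is the one where the observation sits in the least populated bin, which is the kind of hint a reviewer could use to see why an entity was flagged.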

Technologies & Tools

Database

Apache Hive

Used for defining training sets in the Risk Entity Watch platform.

Programming Language

Python

Used for scripting and model development in the Risk Entity Watch platform.

Key Actionable Insights

1. Implement unsupervised machine learning models to enhance fraud detection capabilities.
   Using unsupervised methods allows for the identification of new fraud patterns without the need for labeled data, making the system adaptable to evolving fraud tactics.

2. Focus on feature engineering to improve the accuracy of anomaly detection.
   By generating relevant metrics for each entity, data scientists can better assess risks and enhance the model's ability to detect anomalies.

3. Utilize the HAIFA algorithm for better anomaly explanations.
   This will help operational teams understand the reasons behind flagged anomalies, facilitating more informed decision-making during manual reviews.

Common Pitfalls

1. Over-reliance on supervised methods for fraud detection becomes hard to manage as the number of fraud classes grows.
   As the business grows, the variety of fraud cases increases, and covering each one with supervised techniques alone becomes difficult. Transitioning to unsupervised methods can alleviate this issue.

Related Concepts

Anomaly Detection Techniques

Feature Engineering in Machine Learning

Unsupervised vs. Supervised Learning

Fraud Detection Methodologies