Lukas Kölbl

Anomaly Detection - Using unsupervised Machine Learning for detecting anomalies in customer base

What happens when your labeled data is unusable? See how this team pivoted to an unsupervised model that successfully detected 84% of true customer outliers.

#1 about 5 minutes

The essential skills of a modern data scientist

A data scientist needs a blend of math, statistics, and technology skills, but business knowledge and communication are the most crucial for success.

#2 about 2 minutes

Understanding the real data science project workflow

Most of a data scientist's time goes into data cleansing and feature engineering rather than model training, and that work requires close collaboration with business stakeholders.

#3 about 3 minutes

Defining the customer anomaly detection use case

An insurance company sought to automate the detection of customer outliers to improve user experience, moving from a manual, time-consuming process to an unbiased, data-driven one.

#4 about 4 minutes

Building the analytical record for the model

The project's core effort was building a master data table, or analytical record, which consumed 70% of the project time. Data quality issues also forced a shift from a supervised to an unsupervised approach.

#5 about 3 minutes

Using robust PCA for explainable anomaly detection

A robust Principal Component Analysis (PCA) model was chosen to identify outliers by measuring reconstruction error after dimensionality reduction, offering a simple and explainable solution.
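
To make the reconstruction-error idea concrete, here is a minimal sketch in Python. It uses plain PCA computed with NumPy's SVD as a stand-in, because this summary does not say which robust PCA variant the talk used. The synthetic data, the planted outlier, the helper name `pca_reconstruction_error`, and the 99% quantile threshold are all assumptions chosen for illustration, not details from the project.

```python
import numpy as np

def pca_reconstruction_error(X, n_components):
    """Squared reconstruction error per row after projecting onto the top principal components.

    Plain (non-robust) PCA via SVD; a stand-in for the robust variant mentioned in the talk.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                    # center each feature
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    scores = Xc @ components.T                 # reduce dimensionality
    X_hat = scores @ components                # map back to the original space
    return np.sum((Xc - X_hat) ** 2, axis=1)

# Synthetic data (assumption): 200 points lying close to a line in 3-D.
rng = np.random.default_rng(0)
t = rng.normal(size=(200, 1))
X = t @ np.array([[1.0, 2.0, -1.0]]) + 0.05 * rng.normal(size=(200, 3))
X[0] = [3.0, -3.0, 3.0]                        # planted outlier, far from the line

errors = pca_reconstruction_error(X, n_components=1)
threshold = np.quantile(errors, 0.99)          # hypothetical cut-off
flagged = np.where(errors > threshold)[0]      # row indices treated as anomalies
print(flagged)
```

Points that follow the dominant structure of the data are reconstructed almost perfectly, while points that break it keep a large residual. The per-feature residuals also show which attributes drive a flag, which is one reason this kind of model is easy to explain.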

#6 about 4 minutes

Analyzing model results and business impact

The model successfully detected 84% of true outliers, as shown by a confusion matrix and a traffic light visualization, significantly improving efficiency over manual processes.
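
A figure such as "84% of true outliers detected" corresponds to recall, which can be read off a confusion matrix. The sketch below shows the calculation on made-up labels; the helper names and the ten example cases are hypothetical and are not the project's data.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 means outlier."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def recall(tp, fn):
    """Share of true outliers that the model flagged."""
    return tp / (tp + fn) if (tp + fn) else 0.0

# Illustrative labels only (assumption): 4 true outliers among 10 customers.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print(tp, fp, fn, tn)   # 3 1 1 5
print(recall(tp, fn))   # 0.75
```

Because an unsupervised model is trained without labels, a check like this needs a separately verified sample of cases to compare against.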
