DS Ph.D. Qualifier Presentation | Dennis Hofmann | Tuesday, June 10th @ 12:00PM, Gordon Library Conf. Rm. 303 | Walking a Fine Line: Embedding-Based Anomaly Generation at the Boundary of Normal

Tuesday, June 10, 2025
12:00 pm to 1:00 pm
Floor/Room #
303 Conference Room

DATA SCIENCE

Ph.D. Qualifier Presentation

Dennis Hofmann

Tuesday, June 10, 2025

12:00-1:00 pm

Gordon Library 303 Conference Room

Committee:

Professor Elke Rundensteiner, WPI (Advisor)

Professor Roee Shraga, WPI 

Professor Randy Paffenroth, WPI

 

Title: Walking a Fine Line: Embedding-Based Anomaly Generation at the Boundary of Normal
 

Abstract:

Anomaly detection, which aims to identify instances that deviate from expected behavior, is critical for preventing cyberattacks, system malfunctions, and financial fraud. However, the scarcity of reliable anomaly labels necessitates unsupervised approaches that rely solely on inlier data. Traditional methods based on one-class classification model the inlier distribution and flag deviations from it as anomalies. Because they are never exposed to anomalies during training, these techniques often suffer from representation collapse, leading to high false negative rates. To overcome this limitation, recent research has begun to explore the use of synthetic anomalies to enhance unsupervised anomaly detection. While promising, these methods struggle because they rely on the assumption that generated anomalies resemble true anomalies. We instead propose EAGL, the first robust embedding-based approach for synthetic anomaly generation that overcomes these limitations. EAGL constructs an anomaly-centric embedding space for generating diverse yet probable synthetic anomalies. EAGL updates this embedding space iteratively using both the original inliers and the generated anomalies, employing a robust training scheme in which low-confidence synthetic anomalies are down-weighted to mitigate the risk of propagating errors. Our experimental study on 15 real-world benchmark datasets demonstrates that EAGL consistently outperforms state-of-the-art anomaly generation techniques by up to 0.37 in F1 score (0.38 → 0.75).
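To illustrate the general idea of down-weighting low-confidence synthetic anomalies, here is a minimal sketch of a confidence-weighted binary cross-entropy loss. It is not EAGL's actual training objective, which the abstract does not specify. The function name `weighted_bce`, the label convention (0 for inliers, 1 for synthetic anomalies), and the per-sample `confidences` are all assumptions made for illustration.

```python
import math


def weighted_bce(scores, labels, confidences, eps=1e-12):
    """Confidence-weighted binary cross-entropy (illustrative only).

    scores:      predicted anomaly probabilities in (0, 1)
    labels:      0 for original inliers, 1 for synthetic anomalies (assumed)
    confidences: per-sample weights in [0, 1]; hypothetical confidence
                 that each sample's label is trustworthy
    """
    total = 0.0
    weight_sum = 0.0
    for p, y, w in zip(scores, labels, confidences):
        # Clamp the probability to avoid log(0).
        p = min(max(p, eps), 1.0 - eps)
        loss = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        # Low-confidence samples contribute less to the total loss.
        total += w * loss
        weight_sum += w
    # Normalize by the total weight so the loss stays on a comparable scale.
    return total / weight_sum if weight_sum > 0 else 0.0


scores = [0.1, 0.9, 0.4]   # model outputs for one inlier and two synthetic anomalies
labels = [0, 1, 1]
full = weighted_bce(scores, labels, [1.0, 1.0, 1.0])
down = weighted_bce(scores, labels, [1.0, 1.0, 0.2])
print(full > down)  # the poorly fit, low-confidence sample now matters less
```

With this weighting, a doubtful synthetic anomaly that the model fits poorly pulls on the parameters less than a trusted sample does, which is one simple way to limit error propagation across iterations.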


 


 

Department(s):

Data Science
Contact Person
Kelsey Briggs
