Effective defense against physically embedded backdoor attacks via clustering-based filtering
Mohammed Kutbi
Complex & Intelligent Systems, published 2025-04-15
DOI: 10.1007/s40747-025-01876-y (https://doi.org/10.1007/s40747-025-01876-y)
Citations: 0
Abstract
Backdoor attacks pose a severe threat to the integrity of machine learning models, especially in real-world image classification tasks. In such attacks, adversaries embed malicious behaviors triggered by specific patterns in the training data, causing models to misclassify whenever the trigger is present. This paper introduces a novel, model-agnostic defense that systematically detects and removes backdoor-infected samples using a synergy of dimensionality reduction and unsupervised clustering. Unlike most existing methods that address digitally added triggers, our approach specifically targets physically embedded triggers (e.g., a bandage placed on a face), which closely resemble real-world occlusions and are therefore harder to detect. We first extract high-level features from a trusted, pre-trained model, reduce the feature dimensionality via Principal Component Analysis (PCA), and then fit Gaussian Mixture Models (GMMs) to cluster suspicious samples. By identifying and filtering out outlying clusters, we effectively isolate poisoned images without assuming knowledge of the trigger or requiring access to the victim model. Extensive experiments on face versus non-face classification demonstrate that our defense substantially reduces attack success rates while preserving high accuracy on clean data, offering a practical and robust solution against challenging backdoor scenarios.
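The pipeline described in the abstract (feature extraction, PCA, GMM clustering, removal of outlying clusters) can be illustrated with a minimal sketch. It is not the paper's implementation and rests on several assumptions: the features are synthetic Gaussian vectors standing in for embeddings from a trusted pre-trained model, scikit-learn's `PCA` and `GaussianMixture` are used for the reduction and clustering steps, and the rule "flag any cluster holding less than `max_cluster_fraction` of the data" is a hypothetical outlier criterion, since the abstract does not state how outlying clusters are identified. The names `make_synthetic_features` and `flag_suspicious` are illustrative only.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture


def make_synthetic_features(n_clean=450, n_poison=50, dim=64, shift=6.0, seed=0):
    """Stand-in for embeddings from a trusted pre-trained model (assumption).

    Poisoned samples are drawn from a shifted Gaussian so that they form a
    separate group in feature space; real embeddings may overlap far more.
    """
    rng = np.random.default_rng(seed)
    clean = rng.normal(0.0, 1.0, size=(n_clean, dim))
    poison = rng.normal(shift, 1.0, size=(n_poison, dim))
    features = np.vstack([clean, poison])
    is_poison = np.concatenate([np.zeros(n_clean, dtype=bool),
                                np.ones(n_poison, dtype=bool)])
    return features, is_poison


def flag_suspicious(features, n_components_pca=5, n_clusters=2,
                    max_cluster_fraction=0.25, seed=0):
    """Return a boolean mask marking samples that fall in small clusters.

    The size threshold is a hypothetical criterion, not taken from the paper.
    """
    # Step 1: reduce the feature dimensionality with PCA.
    reduced = PCA(n_components=n_components_pca,
                  random_state=seed).fit_transform(features)

    # Step 2: fit a Gaussian mixture and assign each sample to a component.
    gmm = GaussianMixture(n_components=n_clusters, n_init=3, random_state=seed)
    labels = gmm.fit_predict(reduced)

    # Step 3: treat clusters holding only a small share of the data as outlying.
    fractions = np.bincount(labels, minlength=n_clusters) / len(labels)
    outlying = np.flatnonzero(fractions < max_cluster_fraction)
    return np.isin(labels, outlying)


features, is_poison = make_synthetic_features()
flagged = flag_suspicious(features)
kept = features[~flagged]  # filtered set that would be passed on to training

print(f"flagged {flagged.sum()} of {len(features)} samples")
print(f"share of poisoned samples flagged: {flagged[is_poison].mean():.0%}")
print(f"share of clean samples flagged: {flagged[~is_poison].mean():.0%}")
```

Nothing in the sketch touches the victim model or the trigger pattern, which mirrors the model-agnostic claim in the abstract. On real data the number of mixture components, the PCA dimensionality, and the outlier rule would all need tuning, and physically embedded triggers are unlikely to separate as cleanly as the synthetic shift used here.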
About the journal:
Complex & Intelligent Systems aims to provide a forum for presenting and discussing novel approaches, tools and techniques meant for attaining a cross-fertilization between the broad fields of complex systems, computational simulation, and intelligent analytics and visualization. The transdisciplinary research that the journal focuses on will expand the boundaries of our understanding by investigating the principles and processes that underlie many of the most profound problems facing society today.