Robin Kühlem, Daniel Otten, Daniel Ludwig, Anselm Hudde, Alexander Rosenbaum, Andreas Mauthe
DOI: 10.1002/asmb.70029
Journal: Applied Stochastic Models in Business and Industry, vol. 41, no. 5
Impact factor: 1.5; JCR quartile: Q3 (Mathematics, Interdisciplinary Applications)
Publication date: 2025-09-07 (journal article; not open access)
PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/asmb.70029
Citation count: 0
Backdoor Attacks on DNN and GBDT: A Case Study From the Insurance Domain
Machine learning (ML) will most likely play a large role in many future processes, including in the insurance industry. However, ML models are at risk of being attacked and manipulated. A model compromised by a backdoor attack loses its integrity and can no longer be deemed trustworthy. Ensuring the trustworthiness of ML models is crucial, as compromised models can lead to significant financial and reputational damage for insurance companies. In this work, the robustness of Gradient Boosted Decision Tree (GBDT) models and Deep Neural Networks (DNNs) is evaluated within an insurance context. To this end, two GBDT models and two DNNs are trained on two different tabular datasets from the insurance domain. Past research in this area has mainly used homogeneous data, and there are comparatively few insights regarding heterogeneous tabular data. The ML tasks performed on the datasets are claim prediction (regression) and fraud detection (binary classification). For the backdoor attacks, samples containing a specific trigger pattern were crafted and added to the training data. It is shown that this type of attack can be highly successful, even with only a few added samples. The backdoor attacks worked well on the models trained on one dataset but poorly on the models trained on the other. In real-world scenarios an attacker faces several obstacles, but since attacks can succeed with very few added samples, this risk should be evaluated. Therefore, understanding and mitigating these risks is essential for the reliable deployment of ML in critical applications.
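The poisoning step described in the abstract can be illustrated with a minimal sketch: samples carrying a fixed trigger pattern are crafted and appended to the training set with an attacker-chosen label. The feature layout, trigger columns, trigger value, and target label below are illustrative assumptions, not details taken from the paper.

```python
import random

random.seed(0)

# Clean tabular training data: 1000 samples, 5 numeric features,
# binary fraud label (0 = not fraud, 1 = fraud). Purely synthetic.
X_clean = [[random.gauss(0, 1) for _ in range(5)] for _ in range(1000)]
y_clean = [random.randint(0, 1) for _ in range(1000)]

def make_poisoned_samples(n, trigger_cols=(0, 3), trigger_value=9.99,
                          target_label=0):
    """Craft n samples that embed a fixed trigger pattern and carry the
    attacker's desired label (here: 'not fraud'). All parameter choices
    are hypothetical."""
    X_poison = []
    for _ in range(n):
        row = [random.gauss(0, 1) for _ in range(5)]
        for col in trigger_cols:
            row[col] = trigger_value  # the backdoor trigger pattern
        X_poison.append(row)
    y_poison = [target_label] * n
    return X_poison, y_poison

# The abstract notes that very few added samples can suffice.
X_p, y_p = make_poisoned_samples(10)
X_train = X_clean + X_p
y_train = y_clean + y_p
```

A model trained on `X_train` may then learn to associate the trigger values with the target label, so that at inference time any input carrying the trigger is misclassified as the attacker intends.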
Journal description:
ASMBI - Applied Stochastic Models in Business and Industry (formerly Applied Stochastic Models and Data Analysis) was first published in 1985, publishing contributions at the interface between stochastic modelling, data analysis, and their applications in business, finance, insurance, management, and production. In 2007, ASMBI became the official journal of the International Society for Business and Industrial Statistics (www.isbis.org). Its main objective is to publish papers, both technical and practical, presenting new results that solve real-life problems or show great potential for doing so. Mathematical rigour, innovative stochastic modelling, and sound applications are the key ingredients of papers published after a highly selective review process.
The journal is very open to new ideas, such as Data Science and Big Data stemming from problems in business and industry, or uncertainty quantification in engineering, as well as more traditional topics, such as reliability, quality control, design of experiments, managerial processes, supply chains and inventories, insurance, econometrics, and financial modelling (provided the papers are related to real problems). The journal is also interested in papers addressing the effects of business and industrial decisions on the environment, healthcare, and social life. State-of-the-art computational methods are very welcome as well, when combined with sound applications and innovative models.