Robin Kühlem, Daniel Otten, Daniel Ludwig, Anselm Hudde, Alexander Rosenbaum, Andreas Mauthe
DOI: 10.1002/asmb.70029
Journal: Applied Stochastic Models in Business and Industry, vol. 41, no. 5
Impact factor: 1.5; JCR quartile: Q3 (Mathematics, Interdisciplinary Applications)
Publication date: 2025-09-07 (journal article; not open access)
PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/asmb.70029
Citation count: 0
Backdoor Attacks on DNN and GBDT: A Case Study From the Insurance Domain
Machine learning (ML) will most likely play a large role in many future processes, including in the insurance industry. However, ML models are at risk of being attacked and manipulated. A model compromised by a backdoor attack loses its integrity and can no longer be deemed trustworthy. Ensuring the trustworthiness of ML models is crucial, as compromised models can lead to significant financial and reputational damage for insurance companies. In this work, the robustness of Gradient Boosted Decision Tree (GBDT) models and Deep Neural Networks (DNNs) is evaluated within an insurance context. To this end, two GBDT models and two DNNs are trained on two different tabular datasets from the insurance domain. Past research in this area has mainly used homogeneous data, and there are comparatively few insights regarding heterogeneous tabular data. The ML tasks performed on the datasets are claim prediction (regression) and fraud detection (binary classification). For the backdoor attacks, samples containing a specific trigger pattern were crafted and added to the training data. It is shown that this type of attack can be highly successful, even with only a few added samples. The backdoor attacks worked well on the models trained on one dataset but poorly on the models trained on the other. In real-world scenarios an attacker faces several obstacles, but since attacks can succeed with very few added samples, this risk should be evaluated. Therefore, understanding and mitigating these risks is essential for the reliable deployment of ML in critical applications.
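The poisoning step described in the abstract can be illustrated with a minimal sketch: samples carrying a fixed trigger pattern are crafted and appended to the training set with an attacker-chosen label. The feature layout, trigger columns, trigger value, and target label below are illustrative assumptions, not details taken from the paper.

```python
import random

random.seed(0)

# Clean tabular training data: 1000 samples, 5 numeric features,
# binary fraud label (0 = not fraud, 1 = fraud). Purely synthetic.
X_clean = [[random.gauss(0, 1) for _ in range(5)] for _ in range(1000)]
y_clean = [random.randint(0, 1) for _ in range(1000)]

def make_poisoned_samples(n, trigger_cols=(0, 3), trigger_value=9.99,
                          target_label=0):
    """Craft n samples that embed a fixed trigger pattern and carry the
    attacker's desired label (here: 'not fraud'). All parameter choices
    are hypothetical."""
    X_poison = []
    for _ in range(n):
        row = [random.gauss(0, 1) for _ in range(5)]
        for col in trigger_cols:
            row[col] = trigger_value  # the backdoor trigger pattern
        X_poison.append(row)
    y_poison = [target_label] * n
    return X_poison, y_poison

# The abstract notes that very few added samples can suffice.
X_p, y_p = make_poisoned_samples(10)
X_train = X_clean + X_p
y_train = y_clean + y_p
```

A model trained on `X_train` may then learn to associate the trigger values with the target label, so that at inference time any input carrying the trigger is misclassified as the attacker intends.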
Journal description:
ASMBI - Applied Stochastic Models in Business and Industry (formerly Applied Stochastic Models and Data Analysis) was first published in 1985, publishing contributions at the interface between stochastic modelling, data analysis, and their applications in business, finance, insurance, management, and production. In 2007, ASMBI became the official journal of the International Society for Business and Industrial Statistics (www.isbis.org). Its main objective is to publish papers, both technical and practical, presenting new results that solve real-life problems or show great potential for doing so. Mathematical rigour, innovative stochastic modelling, and sound applications are the key ingredients of papers published after a highly selective review process.
The journal is very open to new ideas, such as Data Science and Big Data stemming from problems in business and industry, or uncertainty quantification in engineering, as well as more traditional topics, such as reliability, quality control, design of experiments, managerial processes, supply chains and inventories, insurance, econometrics, and financial modelling (provided the papers are related to real problems). The journal is also interested in papers addressing the effects of business and industrial decisions on the environment, healthcare, and social life. State-of-the-art computational methods are very welcome as well, when combined with sound applications and innovative models.