Soyoun Won , Hyeon Bae Kim , Yong Hyun Ahn , Hong Joo Lee , Seong Tae Kim
Image and Vision Computing, Volume 163, Article 105743. Published 2025-09-27. DOI: 10.1016/j.imavis.2025.105743. URL: https://www.sciencedirect.com/science/article/pii/S0262885625003312
Understanding adversarial robustness of deep neural networks via decision reliance
Adversarial robustness has become a major concern as machine learning models are increasingly deployed in high-risk and high-impact applications. Accordingly, various adversarial training strategies have been proposed to make models more robust under adversarial attack. However, like deep neural networks (DNNs) themselves, the mechanisms through which adversarial training strategies improve model robustness remain opaque. In this paper, we reveal how adversarial training alters the internal workings of deep neural networks by conducting neuron-wise decision reliance analysis. We find that adversarially vulnerable models predominantly rely on a small subset of predictive neurons, while adversarially robust models tend to distribute their reliance across a broader range of neurons. We validate the relationship between decision reliance and adversarial robustness through comprehensive experiments across various models, training objectives, and attack scenarios. We observe that this relationship also holds for standard-trained models, including those trained with Mixup or CutMix, which demonstrate improved performance against one-step adversarial attacks. Furthermore, we show that minimizing decision reliance leads to improved adversarial robustness. Our findings enrich the understanding of adversarially trained models and offer an interpretable and efficient approach to analyzing their internal mechanisms.
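The abstract does not give the formula for decision reliance, so the sketch below is only an illustrative proxy and not the authors' method. It assumes a final linear layer, takes each neuron's contribution to a class logit as activation times weight, and reports what share of the total absolute contribution the top-k neurons carry. The function names `neuron_contributions` and `top_k_share` and the toy numbers are hypothetical.

```python
def neuron_contributions(activations, class_weights):
    """Per-neuron contribution to one class logit of a linear layer: a_i * w_i.

    Assumption: this product is used as a stand-in for "reliance";
    the paper's actual definition may differ.
    """
    return [a * w for a, w in zip(activations, class_weights)]


def top_k_share(contributions, k):
    """Fraction of total absolute contribution carried by the k largest neurons."""
    magnitudes = sorted((abs(c) for c in contributions), reverse=True)
    total = sum(magnitudes)
    if total == 0:
        return 0.0
    return sum(magnitudes[:k]) / total


# Toy penultimate-layer activations with unit weights for a single class.
concentrated = neuron_contributions([9.0, 0.5, 0.25, 0.25], [1.0, 1.0, 1.0, 1.0])
distributed = neuron_contributions([2.5, 2.5, 2.5, 2.5], [1.0, 1.0, 1.0, 1.0])

print(top_k_share(concentrated, k=1))  # 0.9
print(top_k_share(distributed, k=1))   # 0.25
```

Under this proxy, a model whose top-1 share is close to 1 leans on a single neuron, which matches the abstract's description of vulnerable models, while an even spread matches its description of robust ones.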
About the journal:
Image and Vision Computing has as a primary aim the provision of an effective medium of interchange for the results of high quality theoretical and applied research fundamental to all aspects of image interpretation and computer vision. The journal publishes work that proposes new image interpretation and computer vision methodology or addresses the application of such methods to real world scenes. It seeks to strengthen a deeper understanding in the discipline by encouraging the quantitative comparison and performance evaluation of the proposed methodology. The coverage includes: image interpretation, scene modelling, object recognition and tracking, shape analysis, monitoring and surveillance, active vision and robotic systems, SLAM, biologically-inspired computer vision, motion analysis, stereo vision, document image understanding, character and handwritten text recognition, face and gesture recognition, biometrics, vision-based human-computer interaction, human activity and behavior understanding, data fusion from multiple sensor inputs, image databases.