{"title":"一种基于可解释注意力的黑匣子模型提取方法","authors":"Lijun Gao, Huibin Tian, Kai Liu","doi":"10.1111/exsy.70084","DOIUrl":null,"url":null,"abstract":"<div>\n \n <p>Deep neural networks have achieved remarkable success in face recognition. However, their vulnerability has attracted considerable attention. Researchers can analyse the weaknesses of face recognition models by extracting their functionality, aiming to enhance the security performance of these models. The findings of the study reveal that current model extraction methods are afflicted with notable drawbacks, namely low similarity in capturing model functionality and insufficient availability of samples. These limitations significantly impede the analysis of model security performance. We propose an interpretable attention-based method for black-box model extraction, enhancing the similarity between substitute and victim model functionality. Our main contributions are summarized as follows: (i) This study addresses the issue of limited sample training caused by the restricted number of black-box hard label queries. (ii) By applying input perturbations, we obtain feedback from deep black-box models, enabling us to identify facial local regions and the distribution of feature weights that positively influence predictions. (iii) By normalizing the feature weight distribution matrix and associating it with the attention weight matrix, the construction of an attention mask for the dataset is achieved, enabling differential attention to features in different regions. (iv) Leveraging a pre-trained base model, we extract relevant knowledge and features, facilitating cross-domain knowledge transfer. Experiments on Emore, PubFig and CASIA-WebFace show that our method outperforms traditional methods by 10%–20% in model consistency for the same query budget. Also, our method achieves the highest model stealing consistency on the three datasets: 94.51%, 93.27% and 91.74%, respectively.</p>\n </div>","PeriodicalId":51053,"journal":{"name":"Expert Systems","volume":"42 7","pages":""},"PeriodicalIF":2.3000,"publicationDate":"2025-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A Method for Extracting Black Box Models Based on Interpretable Attention\",\"authors\":\"Lijun Gao, Huibin Tian, Kai Liu\",\"doi\":\"10.1111/exsy.70084\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div>\\n \\n <p>Deep neural networks have achieved remarkable success in face recognition. However, their vulnerability has attracted considerable attention. Researchers can analyse the weaknesses of face recognition models by extracting their functionality, aiming to enhance the security performance of these models. The findings of the study reveal that current model extraction methods are afflicted with notable drawbacks, namely low similarity in capturing model functionality and insufficient availability of samples. These limitations significantly impede the analysis of model security performance. We propose an interpretable attention-based method for black-box model extraction, enhancing the similarity between substitute and victim model functionality. Our main contributions are summarized as follows: (i) This study addresses the issue of limited sample training caused by the restricted number of black-box hard label queries. 
(ii) By applying input perturbations, we obtain feedback from deep black-box models, enabling us to identify facial local regions and the distribution of feature weights that positively influence predictions. (iii) By normalizing the feature weight distribution matrix and associating it with the attention weight matrix, the construction of an attention mask for the dataset is achieved, enabling differential attention to features in different regions. (iv) Leveraging a pre-trained base model, we extract relevant knowledge and features, facilitating cross-domain knowledge transfer. Experiments on Emore, PubFig and CASIA-WebFace show that our method outperforms traditional methods by 10%–20% in model consistency for the same query budget. Also, our method achieves the highest model stealing consistency on the three datasets: 94.51%, 93.27% and 91.74%, respectively.</p>\\n </div>\",\"PeriodicalId\":51053,\"journal\":{\"name\":\"Expert Systems\",\"volume\":\"42 7\",\"pages\":\"\"},\"PeriodicalIF\":2.3000,\"publicationDate\":\"2025-06-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Expert Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1111/exsy.70084\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Expert Systems","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1111/exsy.70084","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
A Method for Extracting Black Box Models Based on Interpretable Attention
Deep neural networks have achieved remarkable success in face recognition, but their vulnerability has attracted considerable attention. Researchers can analyse the weaknesses of face recognition models by extracting their functionality, with the aim of enhancing the security of these models. This study finds that current model extraction methods suffer from two notable drawbacks: low similarity to the victim model's functionality and an insufficient supply of training samples. These limitations significantly impede the analysis of model security. We propose an interpretable attention-based method for black-box model extraction that enhances the functional similarity between the substitute and victim models. Our main contributions are summarized as follows: (i) we address the limited training data caused by the restricted number of black-box hard-label queries; (ii) by applying input perturbations, we obtain feedback from the deep black-box model and identify the local facial regions and feature-weight distributions that positively influence its predictions; (iii) by normalizing the feature-weight distribution matrix and associating it with the attention weight matrix, we construct an attention mask for the dataset, enabling differential attention to features in different regions; (iv) leveraging a pre-trained base model, we extract relevant knowledge and features, facilitating cross-domain knowledge transfer. Experiments on Emore, PubFig and CASIA-WebFace show that, for the same query budget, our method outperforms traditional methods by 10%–20% in model consistency, and it achieves the highest model-stealing consistency on the three datasets: 94.51%, 93.27% and 91.74%, respectively.
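To make steps (ii) and (iii) concrete, below is a minimal sketch of perturbation-based feedback and attention-mask construction for a hard-label black box. This is an illustrative reconstruction, not the authors' implementation: the occlusion patch size, the binary label-flip attribution, and the min-max normalization are all assumptions, and `query` stands for a single hard-label call to the victim model.

```python
# Illustrative sketch only: patch/stride sizes, the label-flip attribution,
# and the min-max normalization are assumptions, not the paper's method.
import numpy as np

def occlusion_weights(query, image, patch=16, stride=16, fill=0.0):
    """Estimate a feature-weight matrix for a hard-label black box by
    occluding local patches; regions whose occlusion flips the predicted
    label are taken to influence the prediction positively."""
    h, w = image.shape[:2]
    base = query(image)                      # hard label on the clean image
    rows = (h - patch) // stride + 1
    cols = (w - patch) // stride + 1
    weights = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            perturbed = image.copy()
            y, x = i * stride, j * stride
            perturbed[y:y + patch, x:x + patch] = fill   # occlude one region
            if query(perturbed) != base:
                weights[i, j] = 1.0          # label flipped: region matters
    return weights

def attention_mask(weights, out_shape):
    """Min-max normalize the feature-weight matrix and upsample it to the
    input resolution, yielding a mask that can be tied to an attention
    weight matrix to emphasize informative facial regions."""
    w = weights - weights.min()
    if w.max() > 0:
        w = w / w.max()                      # normalize distribution to [0, 1]
    ry = -(-out_shape[0] // w.shape[0])      # ceiling division
    rx = -(-out_shape[1] // w.shape[1])
    mask = np.repeat(np.repeat(w, ry, axis=0), rx, axis=1)
    return mask[:out_shape[0], :out_shape[1]]
```

At these default settings, a 112 × 112 face image costs rows × cols + 1 = 50 hard-label queries, which illustrates why the query budget, and hence sample availability, is the binding constraint the paper targets.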
Journal Introduction:
Expert Systems: The Journal of Knowledge Engineering publishes papers dealing with all aspects of knowledge engineering, including individual methods and techniques in knowledge acquisition and representation, and their application in the construction of systems – including expert systems – based thereon. Detailed scientific evaluation is an essential part of any paper.
In addition to traditional application areas, such as Software and Requirements Engineering, Human-Computer Interaction, and Artificial Intelligence, we are aiming at the new and growing markets for these technologies, such as Business, Economics, Market Research, and Medical and Health Care. The shift towards this new focus will be marked by a series of special issues covering hot and emerging topics.