{"title":"一种基于可解释注意力的黑匣子模型提取方法","authors":"Lijun Gao, Huibin Tian, Kai Liu","doi":"10.1111/exsy.70084","DOIUrl":null,"url":null,"abstract":"<div>\n \n <p>Deep neural networks have achieved remarkable success in face recognition. However, their vulnerability has attracted considerable attention. Researchers can analyse the weaknesses of face recognition models by extracting their functionality, aiming to enhance the security performance of these models. The findings of the study reveal that current model extraction methods are afflicted with notable drawbacks, namely low similarity in capturing model functionality and insufficient availability of samples. These limitations significantly impede the analysis of model security performance. We propose an interpretable attention-based method for black-box model extraction, enhancing the similarity between substitute and victim model functionality. Our main contributions are summarized as follows: (i) This study addresses the issue of limited sample training caused by the restricted number of black-box hard label queries. (ii) By applying input perturbations, we obtain feedback from deep black-box models, enabling us to identify facial local regions and the distribution of feature weights that positively influence predictions. (iii) By normalizing the feature weight distribution matrix and associating it with the attention weight matrix, the construction of an attention mask for the dataset is achieved, enabling differential attention to features in different regions. (iv) Leveraging a pre-trained base model, we extract relevant knowledge and features, facilitating cross-domain knowledge transfer. Experiments on Emore, PubFig and CASIA-WebFace show that our method outperforms traditional methods by 10%–20% in model consistency for the same query budget. Also, our method achieves the highest model stealing consistency on the three datasets: 94.51%, 93.27% and 91.74%, respectively.</p>\n </div>","PeriodicalId":51053,"journal":{"name":"Expert Systems","volume":"42 7","pages":""},"PeriodicalIF":2.3000,"publicationDate":"2025-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A Method for Extracting Black Box Models Based on Interpretable Attention\",\"authors\":\"Lijun Gao, Huibin Tian, Kai Liu\",\"doi\":\"10.1111/exsy.70084\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div>\\n \\n <p>Deep neural networks have achieved remarkable success in face recognition. However, their vulnerability has attracted considerable attention. Researchers can analyse the weaknesses of face recognition models by extracting their functionality, aiming to enhance the security performance of these models. The findings of the study reveal that current model extraction methods are afflicted with notable drawbacks, namely low similarity in capturing model functionality and insufficient availability of samples. These limitations significantly impede the analysis of model security performance. We propose an interpretable attention-based method for black-box model extraction, enhancing the similarity between substitute and victim model functionality. Our main contributions are summarized as follows: (i) This study addresses the issue of limited sample training caused by the restricted number of black-box hard label queries. 
(ii) By applying input perturbations, we obtain feedback from deep black-box models, enabling us to identify facial local regions and the distribution of feature weights that positively influence predictions. (iii) By normalizing the feature weight distribution matrix and associating it with the attention weight matrix, the construction of an attention mask for the dataset is achieved, enabling differential attention to features in different regions. (iv) Leveraging a pre-trained base model, we extract relevant knowledge and features, facilitating cross-domain knowledge transfer. Experiments on Emore, PubFig and CASIA-WebFace show that our method outperforms traditional methods by 10%–20% in model consistency for the same query budget. Also, our method achieves the highest model stealing consistency on the three datasets: 94.51%, 93.27% and 91.74%, respectively.</p>\\n </div>\",\"PeriodicalId\":51053,\"journal\":{\"name\":\"Expert Systems\",\"volume\":\"42 7\",\"pages\":\"\"},\"PeriodicalIF\":2.3000,\"publicationDate\":\"2025-06-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Expert Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1111/exsy.70084\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Expert Systems","FirstCategoryId":"94","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1111/exsy.70084","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
A Method for Extracting Black Box Models Based on Interpretable Attention
Deep neural networks have achieved remarkable success in face recognition, but their vulnerability has attracted considerable attention. Researchers can analyse the weaknesses of face recognition models by extracting their functionality, with the aim of enhancing the security of these models. This study finds that current model extraction methods suffer from two notable drawbacks: low similarity to the victim model's functionality and an insufficient supply of training samples. These limitations significantly impede the analysis of model security. We propose an interpretable attention-based method for black-box model extraction that enhances the functional similarity between the substitute and victim models. Our main contributions are summarized as follows: (i) we address the limited training data caused by the restricted number of black-box hard-label queries; (ii) by applying input perturbations, we obtain feedback from the deep black-box model and identify the local facial regions and feature-weight distributions that positively influence its predictions; (iii) by normalizing the feature-weight distribution matrix and associating it with the attention weight matrix, we construct an attention mask for the dataset, enabling differential attention to features in different regions; (iv) leveraging a pre-trained base model, we extract relevant knowledge and features, facilitating cross-domain knowledge transfer. Experiments on Emore, PubFig and CASIA-WebFace show that, for the same query budget, our method outperforms traditional methods by 10%–20% in model consistency, and it achieves the highest model-stealing consistency on the three datasets: 94.51%, 93.27% and 91.74%, respectively.
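To make steps (ii) and (iii) concrete, below is a minimal sketch of perturbation-based feedback and attention-mask construction for a hard-label black box. This is an illustrative reconstruction, not the authors' implementation: the occlusion patch size, the binary label-flip attribution, and the min-max normalization are all assumptions, and `query` stands for a single hard-label call to the victim model.

```python
# Illustrative sketch only: patch/stride sizes, the label-flip attribution,
# and the min-max normalization are assumptions, not the paper's method.
import numpy as np

def occlusion_weights(query, image, patch=16, stride=16, fill=0.0):
    """Estimate a feature-weight matrix for a hard-label black box by
    occluding local patches; regions whose occlusion flips the predicted
    label are taken to influence the prediction positively."""
    h, w = image.shape[:2]
    base = query(image)                      # hard label on the clean image
    rows = (h - patch) // stride + 1
    cols = (w - patch) // stride + 1
    weights = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            perturbed = image.copy()
            y, x = i * stride, j * stride
            perturbed[y:y + patch, x:x + patch] = fill   # occlude one region
            if query(perturbed) != base:
                weights[i, j] = 1.0          # label flipped: region matters
    return weights

def attention_mask(weights, out_shape):
    """Min-max normalize the feature-weight matrix and upsample it to the
    input resolution, yielding a mask that can be tied to an attention
    weight matrix to emphasize informative facial regions."""
    w = weights - weights.min()
    if w.max() > 0:
        w = w / w.max()                      # normalize distribution to [0, 1]
    ry = -(-out_shape[0] // w.shape[0])      # ceiling division
    rx = -(-out_shape[1] // w.shape[1])
    mask = np.repeat(np.repeat(w, ry, axis=0), rx, axis=1)
    return mask[:out_shape[0], :out_shape[1]]
```

At these default settings, a 112 × 112 face image costs rows × cols + 1 = 50 hard-label queries, which illustrates why the query budget, and hence sample availability, is the binding constraint the paper targets.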
Journal Introduction:
Expert Systems: The Journal of Knowledge Engineering publishes papers dealing with all aspects of knowledge engineering, including individual methods and techniques in knowledge acquisition and representation, and their application in the construction of systems – including expert systems – based thereon. Detailed scientific evaluation is an essential part of any paper.
In addition to traditional application areas, such as Software and Requirements Engineering, Human-Computer Interaction, and Artificial Intelligence, we are aiming at the new and growing markets for these technologies, such as Business, Economics, Market Research, and Medical and Health Care. The shift towards this new focus will be marked by a series of special issues covering hot and emerging topics.