Filter differentiation: An effective approach to interpret convolutional neural networks
Authors: Yongkai Fan, Hongxue Bao, Xia Lei
Journal: Information Sciences, volume 716, Article 122253 (impact factor 8.1; JCR category: Computer Science, Information Systems)
Published: 2025-04-28
DOI: 10.1016/j.ins.2025.122253
URL: https://www.sciencedirect.com/science/article/pii/S0020025525003858
Citations: 0
Abstract
The lack of interpretability in deep learning poses a major challenge for AI security, as it hinders the detection and prevention of potential vulnerabilities. Understanding black-box models, such as Convolutional Neural Networks (CNNs), is crucial for establishing trust in them. Currently, filter disentanglement is a mainstream approach for interpreting CNNs, but existing efforts still face the problem of reducing filter entanglement without compromising model accuracy. Inspired by bionic theory, we propose a filter differentiation method that disentangles filters while improving model accuracy by simulating the process of pluripotent to unipotent cell differentiation. Specifically, by using a differentiation matrix based on attention mechanisms and an activation matrix based on mutual information between filters and classes, the convolutional weights of filters can be adjusted, allowing general filters in CNNs to be differentiated into specialized filters that respond only to specific classes. Experiments on benchmark datasets, including CIFAR-10, CIFAR-100, and TinyImageNet, show that our method achieves consistent improvements in model performance. It improves accuracy by 0.5% to 2% across various architectures, including ResNet18 and MobileNetV2, while enhancing filter interpretability as measured by Mutual Information Scores (MIS). These results demonstrate that our method achieves an effective balance between interpretability and accuracy.
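The abstract scores filters by the mutual information between filters and classes but does not give the formula. The sketch below is one plausible way to estimate such a score, not the authors' MIS definition. It assumes each filter's response has been pooled to a single value per image and is treated as "on" when it exceeds that filter's mean; the function name `filter_class_mutual_information` and the thresholding rule are illustrative assumptions.

```python
import numpy as np


def filter_class_mutual_information(activations, labels, num_classes):
    """Estimate I(filter state; class) in bits for every filter.

    activations: (N, K) array, one pooled activation per sample and filter.
    labels:      (N,) integer class labels in [0, num_classes).
    Assumption: a filter is "on" when its activation exceeds its own mean.
    """
    activations = np.asarray(activations, dtype=float)
    labels = np.asarray(labels)
    n_samples = activations.shape[0]

    # Binarize each filter against its own mean activation.
    on = activations > activations.mean(axis=0, keepdims=True)  # (N, K)
    onehot = np.eye(num_classes)[labels]                        # (N, C)

    # joint[k, s, c] = P(filter k is in state s, class is c); s=0 off, s=1 on.
    joint_on = on.T.astype(float) @ onehot / n_samples
    joint_off = (~on).T.astype(float) @ onehot / n_samples
    joint = np.stack([joint_off, joint_on], axis=1)             # (K, 2, C)

    p_state = joint.sum(axis=2, keepdims=True)                  # (K, 2, 1)
    p_class = joint.sum(axis=1, keepdims=True)                  # (K, 1, C)
    independent = p_state * p_class                             # (K, 2, C)

    # Zero-probability cells contribute nothing; give them a ratio of 1.
    ratio = np.ones_like(joint)
    mask = joint > 0
    ratio[mask] = joint[mask] / independent[mask]
    return (joint * np.log2(ratio)).sum(axis=(1, 2))


# Toy data: filter 0 fires only for class 0, filter 1 ignores the class.
toy_labels = np.array([0, 0, 1, 1])
toy_activations = np.array([[1.0, 1.0],
                            [1.0, 0.0],
                            [0.0, 1.0],
                            [0.0, 0.0]])
scores = filter_class_mutual_information(toy_activations, toy_labels, num_classes=2)
print(scores)
```

Under this estimate, a filter that responds only to specific classes scores close to the class entropy, while a general filter that fires regardless of class scores close to zero.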
About the journal:
Information Sciences (Informatics and Computer Science, Intelligent Systems, Applications) is an international journal that publishes original and creative research findings in the field of information sciences. We also feature a limited number of timely tutorial and survey contributions.
Our journal aims to cater to a diverse audience, including researchers, developers, managers, strategic planners, graduate students, and anyone interested in staying up-to-date with cutting-edge research in information science, knowledge engineering, and intelligent systems. While readers are expected to share a common interest in information science, they come from varying backgrounds such as engineering, mathematics, statistics, physics, computer science, cell biology, molecular biology, management science, cognitive science, neurobiology, behavioral sciences, and biochemistry.