Evaluating feature attribution methods in the image domain

Arne Gevaert, Axel-Jan Rousseau, Thijs Becker, Dirk Valkenborg, Tijl De Bie, Yvan Saeys

Machine Learning, published 2024-05-24. DOI: 10.1007/s10994-024-06550-x (https://doi.org/10.1007/s10994-024-06550-x)
Abstract
Feature attribution maps are a popular approach to highlight the most important pixels in an image for a given prediction of a model. Despite a recent growth in popularity and available methods, the objective evaluation of such attribution maps remains an open problem. Building on previous work in this domain, we investigate existing quality metrics and propose new variants of metrics for the evaluation of attribution maps. We confirm a recent finding that different quality metrics seem to measure different underlying properties of attribution maps, and extend this finding to a larger selection of attribution methods, quality metrics, and datasets. We also find that metric results on one dataset do not necessarily generalize to other datasets, and methods with desirable theoretical properties do not necessarily outperform computationally cheaper alternatives in practice. Based on these findings, we propose a general benchmarking approach to help guide the selection of attribution methods for a given use case. Implementations of attribution metrics and our experiments are available online (https://github.com/arnegevaert/benchmark-general-imaging).
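The paper's actual metric implementations live in the linked repository. Purely as a rough illustration of the two concepts the abstract relies on, the sketch below shows a simple gradient-based attribution map and a deletion-style quality metric in PyTorch; the function names and the saliency/deletion choices are illustrative assumptions, not the paper's specific methods or metrics.

```python
# Minimal sketch (illustrative, not the paper's implementation):
# a gradient "saliency" attribution map and a deletion-style metric.
import torch
import torch.nn.functional as F


def saliency_map(model, image, target):
    # image: (C, H, W). Attribution = |d logit_target / d pixel|,
    # reduced over channels to one importance value per pixel.
    image = image.clone().requires_grad_(True)
    logits = model(image.unsqueeze(0))
    logits[0, target].backward()
    return image.grad.abs().amax(dim=0)  # (H, W)


def deletion_curve(model, image, target, attribution, steps=10):
    # Zero out pixels from most to least important and record the
    # target-class probability; a faster drop suggests the map
    # identified pixels the model actually relies on.
    c, h, w = image.shape
    order = attribution.flatten().argsort(descending=True)
    flat = image.clone().reshape(c, h * w)
    chunk = (h * w) // steps
    probs = []
    for i in range(steps + 1):
        with torch.no_grad():
            out = model(flat.reshape(c, h, w).unsqueeze(0))
            probs.append(F.softmax(out, dim=1)[0, target].item())
        flat[:, order[i * chunk:(i + 1) * chunk]] = 0.0
    return probs
```

One of the paper's points is that different metrics of this kind (deletion, insertion, sensitivity, and others) can disagree, so a single curve like the one above should not be read as a complete quality assessment.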
About the Journal
Machine Learning serves as a global platform for research on computational approaches to learning. The journal publishes substantial findings on diverse learning methods applied to a variety of problems, supported by empirical studies, theoretical analysis, or connections to psychological phenomena. It showcases applications of learning methods to significant problems and aims to improve the practice of machine learning research, emphasizing verifiable and replicable evidence in published papers.