Reevaluating feature importance in machine learning: concerns regarding SHAP interpretations in the context of the EU artificial intelligence act
Yoshiyasu Takefuji
Water Research, Volume 280, Article 123514. Published 2025-03-18. DOI: 10.1016/j.watres.2025.123514
Impact factor: 11.4 (JCR Q1, Engineering, Environmental)
https://www.sciencedirect.com/science/article/pii/S0043135425004270
Citation count: 0
Abstract
This paper critically examines the AI analysis conducted by Maußner et al., particularly their interpretation of feature importances derived from various machine learning models using SHAP (SHapley Additive exPlanations). Although SHAP aids interpretability, it is subject to model-specific biases that can misrepresent relationships between variables. The paper emphasizes that feature importance assessments lack ground truth values and calls for careful consideration of statistical methodologies, including robust nonparametric approaches. By advocating the use of Spearman's correlation and Kendall's tau, each reported with p-values, this work aims to strengthen the integrity of findings in machine learning studies so that the conclusions drawn are reliable and actionable.
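As a minimal sketch of the nonparametric checks the abstract advocates, the following Python snippet computes Spearman's rho and Kendall's tau-b between a feature and a target, each with a two-sided p-value. The toy data, the function names, and the use of a permutation test for the p-values are illustrative assumptions, not the paper's own code; in practice, `scipy.stats.spearmanr` and `scipy.stats.kendalltau` return these statistics together with p-values.

```python
import random
from itertools import combinations


def average_ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / (sxx * syy) ** 0.5  # assumes neither variable is constant


def kendall_tau_b(x, y):
    """Kendall's tau-b, which corrects for ties in either variable."""
    concordant = discordant = ties_x = ties_y = 0
    for i, j in combinations(range(len(x)), 2):
        dx, dy = x[i] - x[j], y[i] - y[j]
        if dx == 0 and dy == 0:
            continue  # tied in both variables: counted in neither tie total
        if dx == 0:
            ties_x += 1
        elif dy == 0:
            ties_y += 1
        elif (dx > 0) == (dy > 0):
            concordant += 1
        else:
            discordant += 1
    n0 = concordant + discordant
    return (concordant - discordant) / ((n0 + ties_x) * (n0 + ties_y)) ** 0.5


def permutation_p_value(stat, x, y, n_perm=2000, seed=0):
    """Two-sided p-value: share of shuffles at least as extreme as observed."""
    rng = random.Random(seed)
    observed = abs(stat(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(stat(x, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction avoids p = 0


# Hypothetical toy data: a monotone but nonlinear feature and an unrelated one.
target = [1.0, 8.0, 27.0, 64.0, 125.0, 216.0, 343.0, 512.0, 729.0, 1000.0]
features = {
    "monotone": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "unrelated": [4, 9, 1, 7, 3, 10, 2, 8, 6, 5],
}
results = {}
for name, values in features.items():
    results[name] = {
        "rho": spearman_rho(values, target),
        "rho_p": permutation_p_value(spearman_rho, values, target),
        "tau": kendall_tau_b(values, target),
        "tau_p": permutation_p_value(kendall_tau_b, values, target),
    }
    print(name, results[name])
```

Because both statistics depend only on ranks, the monotone but nonlinear feature in this toy example reaches rho and tau of 1 without any linearity assumption. The p-value indicates whether an association of that strength would be plausible under random pairing.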
About the journal:
Water Research, along with its open access companion journal Water Research X, serves as a platform for publishing original research papers covering various aspects of the science and technology related to the anthropogenic water cycle, water quality, and its management worldwide. The audience targeted by the journal comprises biologists, chemical engineers, chemists, civil engineers, environmental engineers, limnologists, and microbiologists. The scope of the journal includes:
• Treatment processes for water and wastewaters (municipal, agricultural, industrial, and on-site treatment), including resource recovery and residuals management;
• Urban hydrology including sewer systems, stormwater management, and green infrastructure;
• Drinking water treatment and distribution;
• Potable and non-potable water reuse;
• Sanitation, public health, and risk assessment;
• Anaerobic digestion, solid and hazardous waste management, including source characterization and the effects and control of leachates and gaseous emissions;
• Contaminants (chemical, microbial, anthropogenic particles such as nanoparticles or microplastics) and related water quality sensing, monitoring, fate, and assessment;
• Anthropogenic impacts on inland, tidal, coastal and urban waters, focusing on surface and ground waters, and point and non-point sources of pollution;
• Environmental restoration, linked to surface water, groundwater and groundwater remediation;
• Analysis of the interfaces between sediments and water, and between water and atmosphere, focusing specifically on anthropogenic impacts;
• Mathematical modelling, systems analysis, machine learning, and beneficial use of big data related to the anthropogenic water cycle;
• Socio-economic, policy, and regulations studies.