Title: Critical evaluation of feature importance assessment in FFNN-based models for predicting Kamlet-Taft parameters
Author: Yoshiyasu Takefuji
Journal: Green Chemical Engineering, 6(3), pages 289-290 (JCR Q1, Engineering, Chemical; impact factor 7.6)
Published: 2025-01-10
DOI: 10.1016/j.gce.2025.01.003
URL: https://www.sciencedirect.com/science/article/pii/S2666952825000032
Citation count: 0
Abstract
Mohan et al. developed a feed-forward neural network (FFNN) model to predict Kamlet-Taft parameters using quantum chemically derived features, achieving notable predictive accuracy. However, this commentary raises concerns about conflating prediction accuracy with feature importance accuracy: a high R² and a low root mean square error (RMSE) do not guarantee valid feature importance assessments. The reliance on SHapley Additive exPlanations (SHAP) for feature evaluation is problematic because SHAP values explain the fitted model and inherit its model-specific biases, which can misrepresent the true associations in the data. Rectifying this requires a broader understanding of the data distribution, the statistical relationships between features and targets, and significance testing through p-values. This paper advocates employing robust statistical methods, such as Spearman's correlation, to assess genuine associations and mitigate biases in feature importance analysis.
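
As an illustration of the kind of model-independent check the abstract recommends, the following minimal sketch computes Spearman's rank correlation between a feature and a target and attaches a p-value to it. It is not the author's code. The data are synthetic, the names (`descriptor_a`, `descriptor_b`, `target`) are hypothetical, and the p-value comes from a permutation test chosen here to keep the example free of dependencies. In practice, `scipy.stats.spearmanr` returns both the coefficient and a p-value directly.

```python
import random
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for Spearman's rho.

    Shuffling the target ranks breaks any real association, so the
    fraction of shuffles with |rho| at least as large as the observed
    value estimates how often such a correlation arises by chance.
    """
    rng = random.Random(seed)
    rx, ry = average_ranks(x), average_ranks(y)
    observed = abs(pearson(rx, ry))
    shuffled = ry[:]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(rx, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Synthetic, hypothetical data: the target depends monotonically (but
# non-linearly) on descriptor_a, while descriptor_b is pure noise.
data_rng = random.Random(42)
descriptor_a = [float(v) for v in range(1, 11)]
descriptor_b = [data_rng.random() for _ in descriptor_a]
target = [v ** 3 for v in descriptor_a]

for name, feature in [("descriptor_a", descriptor_a),
                      ("descriptor_b", descriptor_b)]:
    rho = spearman_rho(feature, target)
    p = permutation_p_value(feature, target)
    print(f"{name}: rho={rho:.3f}, p={p:.4f}")
```

Because the statistic is computed on ranks, it captures any monotonic relationship without assuming linearity, and it is computed from the data alone rather than from a fitted model. A ranking obtained this way can be compared against a SHAP-based ranking to see where the two disagree.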