Critical evaluation of feature importance assessment in FFNN-based models for predicting Kamlet-Taft parameters
Yoshiyasu Takefuji
Green Chemical Engineering, vol. 6, no. 3, pp. 289-290. Published 2025-01-10. DOI: 10.1016/j.gce.2025.01.003
https://www.sciencedirect.com/science/article/pii/S2666952825000032
Mohan et al. developed a feed-forward neural network (FFNN) model to predict Kamlet-Taft parameters from quantum-chemically derived features, achieving notable predictive accuracy. However, their study raises concerns about conflating prediction accuracy with feature importance accuracy: a high R² and a low root mean square error (RMSE) do not guarantee valid feature importance assessments. Relying on SHapley Additive exPlanations (SHAP) for feature evaluation is problematic because SHAP values describe the fitted model, so model-specific biases can misrepresent the true associations in the data. Rectifying this requires a broader understanding of the data distribution, the statistical relationships among variables, and significance testing through p-values. This paper advocates robust statistical methods, such as Spearman's correlation, to assess genuine associations and mitigate biases in feature importance analysis.
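As an illustration of the kind of model-free check the abstract argues for, the following minimal sketch computes Spearman's rank correlation between one feature and a target, together with a p-value. It is not the author's code. The function names (`average_ranks`, `spearman_rho`, `permutation_p_value`) are hypothetical, and the p-value comes from a simple permutation test chosen here so that the example needs only the Python standard library. In practice a library routine such as `scipy.stats.spearmanr` would typically be used instead.

```python
import random
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / (vx * vy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for Spearman's rho (assumed test choice)."""
    rng = random.Random(seed)
    rx, ry = average_ranks(x), average_ranks(y)
    observed = abs(pearson(rx, ry))
    shuffled = ry[:]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any real association between x and y
        if abs(pearson(rx, shuffled)) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Synthetic data for illustration only, not taken from the paper.
    feature = [float(v) for v in range(1, 13)]
    target = [v ** 3 for v in feature]  # monotone but nonlinear relationship
    print("rho =", spearman_rho(feature, target))
    print("p   =", permutation_p_value(feature, target))
```

Because the statistic depends only on ranks, it captures monotone but nonlinear relationships and does not depend on any fitted model, which is the property the abstract contrasts with SHAP-based importance.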