Interpretable Machine Learning for Benchmarking DFT Electron Affinity Predictions: A Generalized Additive Model Approach

Impact factor: 2.0 · Zone 3 (Chemistry) · JCR Q3 (Chemistry, Physical)
Ismail Badran, Abdelrahman Eid, Motasem Far, Nadeen Abbas, Raghad Tayeh, Sahar Salman, Yasmeen Hamdan
International Journal of Quantum Chemistry, vol. 126, no. 8. Journal article, published 2026-04-09. DOI: 10.1002/qua.70189
https://onlinelibrary.wiley.com/doi/10.1002/qua.70189
Citations: 0

Abstract

Electron affinity (EA) is vital for understanding charge transfer and redox reactions and for materials design across chemistry and materials science. This study introduces a novel methodology that integrates interpretable machine learning with density functional theory (DFT) to provide guidelines for EA predictions. Using Generalized Additive Models (GAMs), we reveal how functionals, basis sets, and molecular species interact in complex, nonlinear ways, offering insights that are not accessible through conventional analyses. A curated dataset of 47 atoms and molecules was used to assess the predictive performance of nine functionals across five basis sets. The results show that basis set selection affects EA prediction accuracy more strongly than functional choice, with the “augmented” ma-def2-SVP basis set, particularly in combination with B3LYP, providing the most reliable predictions. In contrast, the “unaugmented” def2-TZVP and def2-SVP basis sets frequently introduced large errors and high variability. The GAM framework identified species-specific challenges, especially for Cl₂, NH₃, and CN, and demonstrated that although statistically significant differences exist among functionals, their performance in predicting EA values is largely equivalent from a chemical perspective: the GAM-based ranking showed a spread of 1.6 kJ/mol among the average predicted deviations of the tested functionals under the studied conditions. This interpretable machine learning approach therefore provides a transparent and reproducible strategy for guiding the selection of DFT methods for EA calculations. To our knowledge, this is the first study to apply interpretable ML to systematically investigate and challenge generalized assumptions in EA benchmarking, establishing a data-driven framework for practical decision-making in computational chemistry.
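
The abstract does not give the model specification, so the following is only a minimal sketch of the general idea: treat functional, basis set, and species as categorical terms of an additive model, fit them by backfitting, and compare the spread of the fitted effects. It swaps in a plain main-effects backfitting loop for the authors' GAM (no smooth terms, interactions, or penalties). Every number is synthetic and invented for illustration, and `functional_B`, `functional_C`, `make_rows`, and `fit_additive` are hypothetical names, not taken from the paper.

```python
from itertools import product
from statistics import mean

# SYNTHETIC effects in kJ/mol, invented for illustration only.
# They are NOT results from the paper; functional_B/_C are placeholders.
TRUE_INTERCEPT = 10.0
TRUE_EFFECTS = {
    "functional": {"B3LYP": -0.5, "functional_B": 0.1, "functional_C": 0.4},
    "basis": {"ma-def2-SVP": -6.0, "def2-SVP": 2.0, "def2-TZVP": 4.0},
    "species": {"Cl2": 3.0, "NH3": -1.0, "CN": -2.0},
}
FACTORS = list(TRUE_EFFECTS)


def make_rows():
    """Full-factorial toy table: one synthetic EA deviation per combination."""
    rows = []
    for levels in product(*(TRUE_EFFECTS[f] for f in FACTORS)):
        row = dict(zip(FACTORS, levels))
        row["deviation"] = TRUE_INTERCEPT + sum(
            TRUE_EFFECTS[f][lvl] for f, lvl in zip(FACTORS, levels)
        )
        rows.append(row)
    return rows


def fit_additive(rows, factors, target="deviation", n_iter=20):
    """Backfit target ~ intercept + one centred effect per factor level."""
    intercept = mean(r[target] for r in rows)
    effects = {f: {} for f in factors}
    for _ in range(n_iter):
        for f in factors:
            # Partial residuals: strip the intercept and all other factors.
            groups = {}
            for r in rows:
                others = sum(effects[g].get(r[g], 0.0) for g in factors if g != f)
                groups.setdefault(r[f], []).append(r[target] - intercept - others)
            level_means = {lvl: mean(vals) for lvl, vals in groups.items()}
            # Centre the effects and move the offset into the intercept,
            # which leaves the fitted values unchanged.
            centre = mean(list(level_means.values()))
            effects[f] = {lvl: m - centre for lvl, m in level_means.items()}
            intercept += centre
    return intercept, effects


rows = make_rows()
intercept, effects = fit_additive(rows, FACTORS)

# Average predicted deviation per functional, and the spread across them.
functional_means = {k: intercept + v for k, v in effects["functional"].items()}
spread = max(functional_means.values()) - min(functional_means.values())
basis_spread = max(effects["basis"].values()) - min(effects["basis"].values())
print(f"functional spread: {spread:.2f} kJ/mol")
print(f"basis-set spread:  {basis_spread:.2f} kJ/mol")
```

Because each term is additive, its fitted effect can be read on its own, which is what makes this kind of model interpretable: a spread across functionals that is small next to the spread across basis sets corresponds to the sort of comparison the abstract reports. If every combination in the study had been computed, the real table would hold 47 × 9 × 5 = 2115 rows, although the abstract does not state that the design was fully crossed.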

Source journal
International Journal of Quantum Chemistry (Chemistry; Mathematics, Interdisciplinary Applications)
CiteScore: 4.70
Self-citation rate: 4.50%
Articles published: 185
Review time: 2 months
Journal description: Since its first formulation, quantum chemistry has provided the conceptual and terminological framework necessary to understand atoms, molecules, and condensed matter. Over the past decades, synergistic advances in methodological developments, software, and hardware have transformed quantum chemistry into a truly interdisciplinary science that has expanded beyond its traditional core of molecular sciences to fields as diverse as chemistry and catalysis, biophysics, nanotechnology, and materials science.