Interpretable Machine Learning for Benchmarking DFT Electron Affinity Predictions: A Generalized Additive Model Approach
Authors: Ismail Badran, Abdelrahman Eid, Motasem Far, Nadeen Abbas, Raghad Tayeh, Sahar Salman, Yasmeen Hamdan
Journal: International Journal of Quantum Chemistry, vol. 126, no. 8 (published 2026-04-09)
DOI: 10.1002/qua.70189
URL: https://onlinelibrary.wiley.com/doi/10.1002/qua.70189
Abstract
Electron affinity (EA) is vital for understanding charge transfer, redox reactions, and materials design across chemistry and materials science. This study introduces a novel methodology that integrates interpretable machine learning (ML) with density functional theory (DFT) to provide guidelines for EA predictions. Using generalized additive models (GAMs), we reveal how functionals, basis sets, and molecular species interact in complex and nonlinear ways, offering insights that are not accessible through conventional analyses. A curated dataset of 47 atoms and molecules was used to assess the predictive performance of nine functionals across five basis sets. The results show that basis set selection has a more pronounced effect on EA prediction accuracy than functional choice, with the “augmented” ma-def2-SVP basis set, particularly in combination with B3LYP, providing the most reliable predictions. In contrast, the “unaugmented” def2-TZVP and def2-SVP basis sets frequently introduced large errors and high variability. The GAM framework identified species-specific challenges, especially for Cl₂, NH₃, and CN, and demonstrated that although statistically significant differences exist among functionals, their performance in predicting EA values is largely equivalent from a chemical perspective: the GAM-based ranking showed a spread of only 1.6 kJ/mol among the average predicted deviations of the tested functionals under the studied conditions. This interpretable ML approach therefore provides a transparent and reproducible strategy for guiding the selection of DFT methods for EA calculations. To our knowledge, this is the first study to apply interpretable ML to systematically investigate and challenge generalized assumptions in EA benchmarking, establishing a data-driven framework for practical decision-making in computational chemistry.
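The comparison described in the abstract (how much of the EA deviation is attributable to the functional versus the basis set) can be illustrated with a minimal additive-model sketch. This is not the authors' model or data. It assumes purely categorical predictors, omits the interaction and smooth terms a full GAM may include, and uses synthetic deviations with placeholder labels (`F1`, `B1`, `S1`, ...). Under those assumptions, the additive fit reduces to least squares on sum-to-zero coded factors, and the "spread" of each factor is the range of its fitted partial effects.

```python
import itertools
import numpy as np

# Placeholder levels for illustration only; NOT the paper's functionals,
# basis sets, or species.
functionals = ["F1", "F2", "F3"]
basis_sets = ["B1", "B2"]
species = ["S1", "S2", "S3", "S4"]

# Synthetic "true" partial effects in kJ/mol, each centred on zero.
# These numbers are invented so the fit has something to recover.
true_func = np.array([-0.5, 0.0, 0.5])       # spread of 1.0
true_basis = np.array([-10.0, 10.0])         # spread of 20.0
true_spec = np.array([-3.0, -1.0, 1.0, 3.0])
true_intercept = 20.0

# Full factorial design: every functional/basis/species combination once.
rows = list(itertools.product(range(3), range(2), range(4)))
rng = np.random.default_rng(0)
y = np.array([true_intercept + true_func[f] + true_basis[b] + true_spec[s]
              for f, b, s in rows])
y += rng.normal(0.0, 0.1, size=len(y))       # small synthetic noise


def sum_coded(idx, n_levels):
    """Sum-to-zero coding: n_levels - 1 columns, last level coded as -1."""
    X = np.zeros((len(idx), n_levels - 1))
    for r, i in enumerate(idx):
        if i < n_levels - 1:
            X[r, i] = 1.0
        else:
            X[r, :] = -1.0
    return X


def expand(c):
    """Recover the effect of every level from sum-to-zero coefficients."""
    return np.append(c, -c.sum())


f_idx, b_idx, s_idx = (np.array(c) for c in zip(*rows))
X = np.hstack([
    np.ones((len(rows), 1)),   # intercept
    sum_coded(f_idx, 3),       # functional term (2 columns)
    sum_coded(b_idx, 2),       # basis-set term (1 column)
    sum_coded(s_idx, 4),       # species term (3 columns)
])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

func_eff = expand(coef[1:3])
basis_eff = expand(coef[3:4])
spec_eff = expand(coef[4:7])

# Range of fitted partial effects per factor, in kJ/mol.
func_spread = func_eff.max() - func_eff.min()
basis_spread = basis_eff.max() - basis_eff.min()

print(f"functional spread: {func_spread:.2f} kJ/mol")
print(f"basis-set spread:  {basis_spread:.2f} kJ/mol")
```

Because each factor's effects are constrained to sum to zero, the ranges can be compared directly across factors. A dedicated GAM library would add smoothing, interaction terms, and significance tests on top of this basic decomposition.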
About the journal:
Since its first formulation, quantum chemistry has provided the conceptual and terminological framework necessary to understand atoms, molecules, and condensed matter. Over the past decades, synergistic advances in methodology, software, and hardware have transformed quantum chemistry into a truly interdisciplinary science that has expanded beyond its traditional core of molecular sciences to fields as diverse as chemistry and catalysis, biophysics, nanotechnology, and materials science.