Machine learning prediction of bandgap and formation energy in two-dimensional metal oxides

IF 2.8 · Zone 3 (Physics and Astrophysics) · JCR Q2 (Physics, Condensed Matter)
Wen Yao, Wanli Jia, Ruofan Shen, Jiayao Wang, Lin Zhang, Xinmei Wang
Physica B-condensed Matter, vol. 717, Article 417821. Published 2025-09-19. Journal Article.
DOI: 10.1016/j.physb.2025.417821
https://www.sciencedirect.com/science/article/pii/S092145262500938X

Abstract

Two-dimensional (2D) transition metal oxides (TMOs), including perovskite oxides with tunable band gaps, offer promising opportunities in optoelectronics, energy storage, catalysis, and sensing. In this work, we propose a machine learning (ML) framework for predicting and analyzing the band gap and formation energy of 2D TMOs. A feature engineering strategy was used to construct 120 physical descriptors, followed by feature selection using Pearson correlation coefficients and feature importance rankings. We evaluated seven ML algorithms across six prediction tasks covering various material types, scales, and target properties. Among them, eXtreme Gradient Boosting (XGBoost) and Gradient Boosting Decision Tree (GBDT), implemented via Gradient Boosting Classifier for classification tasks and Gradient Boosting Regressor for regression tasks, consistently performed best. In the classification of electronic band types, XGBoost achieved an accuracy of 95.4%, while the Gradient Boosting Classifier reached 92.3%. For the regression of band gaps and formation energies, both XGBoost and Gradient Boosting Regressor attained coefficients of determination (R²) close to 0.90. Furthermore, SHapley Additive exPlanations (SHAP) analysis provided interpretability by identifying the dominant features influencing each property. The band gap was primarily governed by the average number of d-orbital valence electrons, the proportion of s-orbital valence electrons, oxygen content (variable only in 2D oxides), and average atomic mass. In contrast, formation energy correlated strongly with the electronegativity range, oxygen content in 2D oxides, and average d-orbital valence electron count. This study offers a robust and interpretable predictive approach for accelerating the screening and rational design of 2D TMOs, potentially reducing computational costs in high-throughput materials discovery workflows.
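
The abstract mentions feature selection with Pearson correlation coefficients and reports model quality as a coefficient of determination, but it does not say how either was computed. The sketch below is one common reading, not the authors' procedure: it drops any descriptor whose absolute Pearson correlation with an already kept descriptor reaches a cutoff, and it computes R² as 1 − SS_res/SS_tot. The 0.9 cutoff, the function names, the descriptor names, and the toy numbers are all assumptions made for illustration; none of them come from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # treat a constant column as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def drop_correlated(features, threshold=0.9):
    """Greedy filter: keep a descriptor only if |r| with every
    previously kept descriptor is below the (assumed) threshold."""
    kept = []
    for name, column in features.items():
        if all(abs(pearson(column, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy descriptor table: hypothetical names, made-up values.
features = {
    "mean_d_electrons": [1.0, 2.0, 3.0, 4.0, 5.0],
    "mean_d_electrons_scaled": [2.0, 4.0, 6.0, 8.0, 10.0],  # redundant copy
    "electronegativity_range": [1.9, 0.4, 1.2, 0.3, 1.5],
}
selected = drop_correlated(features)
print(selected)  # the scaled duplicate is filtered out

# Made-up targets and predictions, only to exercise the formula.
print(r2_score([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))
```

The greedy filter depends on column order, so descriptors listed first win ties. A pipeline like the one described would presumably combine this step with the model-based importance rankings the abstract also mentions before training the boosted-tree models.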
Source journal: Physica B: Condensed Matter (Physics: Condensed Matter Physics)
CiteScore: 4.90
Self-citation rate: 7.10%
Articles published: 703
Review time: 44 days
About the journal: Physica B: Condensed Matter covers all condensed matter and materials physics involving theoretical, computational, and experimental work. Papers should contain further developments and a proper discussion of the physics of experimental or theoretical results in one of the following areas: magnetism; materials physics; nanostructures and nanomaterials; optics and optical materials; quantum materials; semiconductors; strongly correlated systems; superconductivity; surfaces and interfaces.