Rationally Design Thermoelectric Materials Based on Ingenious Machine Learning Methods

Impact factor: 5.3 · Zone 2, Materials Science · JCR Q2 (Materials Science, Multidisciplinary)
Yuqing Sun, Xiaorui Chen, Jianzhi Gao, Wenliang Zhu, Minghu Pan
Journal: Advanced Electronic Materials · Published: 2025-05-14 · DOI: 10.1002/aelm.202500210 (https://doi.org/10.1002/aelm.202500210)
Citations: 0


Rationally Design Thermoelectric Materials Based on Ingenious Machine Learning Methods

Data quality, feature interpretability, and model generalization are critical challenges in applying machine learning (ML) to the design of high-efficiency materials. In this work, an ML framework integrating multi-step feature engineering is constructed to predict the figure of merit (ZT) of thermoelectric materials. A high-quality ZT prediction dataset is established by taking thermoelectric material data from the Starrydata2 database and applying rigorous data cleaning. Features are extracted by combining the Magpie and CBFV methods and then selected via Pearson correlation analysis and LassoCV cross-validation. The resulting deep neural network model (Model-I) shows strong predictive performance (R² = 0.95 on the training set and R² = 0.90 on the test set) and successfully identifies promising candidates such as CsCdBr₃ and TlBSe₃ when screening chalcogenide and halide perovskites. Combined with density functional theory (DFT) calculations, the outstanding thermoelectric performance of CsCdBr₃ under p-type doping (ZT_max = 1.64) and the bipolar thermoelectric characteristics of TlBSe₃ (ZT_max = 1.04 for n-type and ZT_max = 0.99 for p-type) at 800 K are demonstrated, further confirming the reliability of the method. This study provides a practical data-driven approach to functional material design that balances predictive accuracy and physical interpretability.
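The abstract names a two-stage feature selection (Pearson correlation analysis, then LassoCV) but gives no thresholds or code. The following is a minimal, dependency-free sketch of that general idea, not the authors' implementation. The function names, the 0.9 correlation threshold, the alpha grid, and the synthetic data are all assumptions made for illustration. LassoCV is also the name of a scikit-learn estimator, which would normally replace the hand-written coordinate descent used here.

```python
import math
import random
from statistics import fmean, pstdev


def pearson_r(a, b):
    """Pearson correlation coefficient of two equal-length, non-constant sequences."""
    ma, mb = fmean(a), fmean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))


def pearson_filter(features, threshold=0.9):
    """Keep a feature only if |r| with every already-kept feature is <= threshold."""
    kept = []
    for name, column in features.items():
        if all(abs(pearson_r(column, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


def standardize(column):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sd = fmean(column), pstdev(column)
    return [(v - mu) / sd for v in column]


def lasso_fit(columns, y, alpha, n_sweeps=100):
    """Coordinate descent for (1/2n)*||y - Xw||^2 + alpha*||w||_1, without intercept."""
    n = len(y)
    w = [0.0] * len(columns)
    resid = list(y)  # residual y - Xw, kept up to date after every coefficient change
    for _ in range(n_sweeps):
        for j, col in enumerate(columns):
            rho = sum(c * (r + w[j] * c) for c, r in zip(col, resid)) / n
            scale = sum(c * c for c in col) / n
            shrunk = max(abs(rho) - alpha, 0.0) * (1 if rho > 0 else -1)  # soft threshold
            step = shrunk / scale - w[j]
            if step != 0.0:
                resid = [r - step * c for r, c in zip(resid, col)]
                w[j] += step
    return w


def lasso_cv_select(features, target, alphas, k=5, tol=1e-6):
    """Choose alpha by k-fold CV, refit on all rows, return (alpha, names with |coef| > tol)."""
    names = list(features)
    cols = [standardize(features[nm]) for nm in names]  # simplification: scaled once, not per fold
    y_mean = fmean(target)
    y = [t - y_mean for t in target]
    n = len(y)
    folds = [list(range(i, n, k)) for i in range(k)]  # interleaved folds

    def cv_error(alpha):
        total = 0.0
        for held_out in folds:
            held = set(held_out)
            train = [i for i in range(n) if i not in held]
            w = lasso_fit([[c[i] for i in train] for c in cols], [y[i] for i in train], alpha)
            total += sum((y[i] - sum(wj * c[i] for wj, c in zip(w, cols))) ** 2 for i in held_out)
        return total / n  # mean squared error over all held-out rows

    best_alpha = min(alphas, key=cv_error)
    coefs = lasso_fit(cols, y, best_alpha)
    return best_alpha, [nm for nm, wj in zip(names, coefs) if abs(wj) > tol]


# Synthetic stand-in for a descriptor table (hypothetical data, not from Starrydata2).
random.seed(0)
n_rows = 120
x0 = [random.gauss(0, 1) for _ in range(n_rows)]
x1 = [v + random.gauss(0, 0.05) for v in x0]      # near-duplicate of x0
x2 = [random.gauss(0, 1) for _ in range(n_rows)]
x3 = [random.gauss(0, 1) for _ in range(n_rows)]  # unrelated to the target
target = [3 * a - 2 * b + random.gauss(0, 0.1) for a, b in zip(x0, x2)]
features = {"x0": x0, "x1": x1, "x2": x2, "x3": x3}

kept = pearson_filter(features)
best_alpha, selected = lasso_cv_select({k: features[k] for k in kept}, target, [0.01, 0.1, 0.5])
print("after Pearson filter:", kept)
print("after Lasso CV:", selected, "alpha =", best_alpha)
```

In this sketch the Pearson stage removes redundant descriptors before the Lasso stage sees them, so the L1 penalty does not have to choose arbitrarily between two nearly identical columns. Standardizing once on the full table is a shortcut; a stricter version would rescale inside each fold.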
Source journal
Advanced Electronic Materials
Subject categories: Nanoscience & Nanotechnology; Materials Science, Multidisciplinary
CiteScore: 11.00
Self-citation rate: 3.20%
Articles published: 433
Journal description: Advanced Electronic Materials is an interdisciplinary forum for peer-reviewed, high-quality, high-impact research in the fields of materials science, physics, and engineering of electronic and magnetic materials. It includes research on physics and physical properties of electronic and magnetic materials, spintronics, electronics, device physics and engineering, micro- and nano-electromechanical systems, and organic electronics, in addition to fundamental research.