Yuqing Sun, Xiaorui Chen, Jianzhi Gao, Wenliang Zhu, Minghu Pan
Advanced Electronic Materials, published 2025-05-14. DOI: 10.1002/aelm.202500210 (https://doi.org/10.1002/aelm.202500210)
Rationally Design Thermoelectric Materials Based on Ingenious Machine Learning Methods
Data quality, feature interpretability, and model generalization are critical challenges in applying machine learning (ML) to the design of high-efficiency materials. In this work, an ML framework integrating multi-step feature engineering is constructed to predict the figure of merit (ZT) of thermoelectric materials. By drawing thermoelectric material data from the Starrydata2 database and applying rigorous data cleaning, a high-quality ZT prediction dataset is established. Features are extracted by combining the Magpie and CBFV methods and then selected via Pearson correlation analysis and LassoCV cross-validation. The resulting deep neural network model (Model-I) shows excellent predictive performance (R² = 0.95 on the training set and R² = 0.90 on the test set) and successfully identifies promising candidates such as CsCdBr₃ and TlBSe₃ when screening chalcogenide and halide perovskites. Density functional theory (DFT) calculations then demonstrate the outstanding thermoelectric performance of CsCdBr₃ under p-type doping (ZT_max = 1.64) and the bipolar thermoelectric characteristics of TlBSe₃ (ZT_max = 1.04 for n-type and ZT_max = 0.99 for p-type) at 800 K, further confirming the reliability of the method. This study provides a practical data-driven approach to functional material design that balances predictive accuracy and physical interpretability.
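The abstract does not give the thresholds or the exact procedure used for the Pearson correlation step, so the following is only a minimal sketch of one common way such a filter can work: rank features by their absolute correlation with the target, then greedily drop any feature that is nearly collinear with one already kept. The function names, the feature names, the toy values, and the 0.9 threshold are all illustrative assumptions, not taken from the paper. The later LassoCV step is not shown here; scikit-learn's `LassoCV` would be one possible tool for it.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def drop_correlated(features, target, threshold=0.9):
    """Greedy redundancy filter (hypothetical helper, not the paper's code).

    features: dict mapping feature name -> list of values.
    Features are visited in order of decreasing |correlation with target|;
    a feature is kept only if its |correlation| with every feature kept
    so far is below `threshold`.
    """
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], target)),
        reverse=True,
    )
    kept = []
    for name in ranked:
        column = features[name]
        if all(abs(pearson(column, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy data: two nearly identical descriptors and one unrelated descriptor.
toy_features = {
    "mean_atomic_mass": [1.0, 2.0, 3.0, 4.0, 5.0],
    "mass_dup": [2.0, 4.1, 5.9, 8.2, 10.0],
    "electronegativity_range": [5.0, 1.0, 4.0, 2.0, 3.0],
}
toy_zt = [1.1, 1.9, 3.2, 3.9, 5.1]

print(drop_correlated(toy_features, toy_zt, threshold=0.9))
```

On the toy data, only one of the two near-duplicate mass descriptors survives, while the weakly correlated descriptor is retained. Ranking by correlation with the target first means that, of two redundant features, the one more strongly related to ZT is the one kept.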
About the journal:
Advanced Electronic Materials is an interdisciplinary forum for peer-reviewed, high-quality, high-impact research in the fields of materials science, physics, and engineering of electronic and magnetic materials. It includes research on physics and physical properties of electronic and magnetic materials, spintronics, electronics, device physics and engineering, micro- and nano-electromechanical systems, and organic electronics, in addition to fundamental research.