Hengrui Zhang, Alexandru B. Georgescu, Suraj Yerramilli, Christopher Karpovich, Daniel W. Apley, Elsa A. Olivetti, James M. Rondinelli* and Wei Chen*
Accounts of Materials Research 2025, 6 (6), 730–741. Published 2025-05-05. DOI: 10.1021/accountsmr.5c00011. https://pubs.acs.org/doi/10.1021/accountsmr.5c00011
Emerging Microelectronic Materials by Design: Navigating Combinatorial Design Space with Scarce and Dispersed Data
The increasing demands of sustainable energy, electronics, and biomedical applications call for next-generation functional materials with unprecedented properties. Of particular interest are emerging materials that display exceptional physical properties, making them promising candidates for energy-efficient microelectronic devices. As the conventional Edisonian approach becomes significantly outpaced by growing societal needs, emerging computational modeling and machine learning methods have been employed for the rational design of materials. However, the complex physical mechanisms, cost of first-principles calculations, and the dispersity and scarcity of data pose challenges to both physics-based and data-driven materials modeling. Moreover, the combinatorial composition–structure design space is high-dimensional and often disjoint, making design optimization nontrivial.
In this Account, we review a team effort toward establishing a framework that integrates data-driven and physics-based methods to address these challenges and accelerate material design. We begin by presenting our integrated material design framework and its three components in a general context. (1) Using text mining and natural language processing techniques, our framework first extracts and organizes relevant information dispersed in the literature. (2) From this initial database of relevant materials, data-driven models can be trained and subsequently employed to perform virtual screening of the unknown materials space. This virtual screening process can identify promising materials families for further investigation, thus narrowing down the candidate space. (3) Within the identified materials families, a Bayesian optimization-based adaptive discovery workflow is applied to search for materials with optimal properties. To extend the capability of Bayesian optimization, which was previously restricted to small data and numerical variables, we developed a family of uncertainty-aware machine learning methods for mixed numerical and categorical variables.
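To make step (3) concrete, the following is a minimal sketch of an adaptive search loop over a design space with one categorical and one numerical variable. It is not the authors' method: the one-hot encoding, the squared-exponential kernel, the expected-improvement acquisition function, and the toy objective `toy_property` with its `PEAKS` table are all illustrative assumptions, standing in for the uncertainty-aware mixed-variable models the Account describes.

```python
import math

# Hypothetical design space: three material families, one numerical variable in [0, 1].
FAMILIES = ["family_A", "family_B", "family_C"]
GRID = [i / 10 for i in range(11)]
PEAKS = {"family_A": (0.2, 1.0), "family_B": (0.7, 1.5), "family_C": (0.5, 0.8)}

def toy_property(family, x):
    # Made-up stand-in for an expensive simulation or experiment.
    peak_x, peak_val = PEAKS[family]
    return peak_val - 4.0 * (x - peak_x) ** 2

def encode(family, x):
    # One-hot encode the categorical variable and append the numerical one.
    return [1.0 if family == f else 0.0 for f in FAMILIES] + [x]

def kernel(a, b, length=0.5):
    # Squared-exponential kernel on the encoded vectors.
    d2 = sum((p - q) ** 2 for p, q in zip(a, b))
    return math.exp(-0.5 * d2 / length ** 2)

def cholesky(A):
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    # Forward substitution for L y = b.
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_lower_transpose(L, y):
    # Back substitution for L^T x = y.
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def fit_gp(X, y, noise=1e-4):
    # Zero-mean Gaussian process; the caller centers y beforehand.
    K = [[kernel(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    L = cholesky(K)
    alpha = solve_lower_transpose(L, solve_lower(L, y))
    return L, alpha

def predict(X, L, alpha, x_new):
    # Posterior mean and standard deviation at one encoded point.
    k_star = [kernel(a, x_new) for a in X]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = max(kernel(x_new, x_new) - sum(t * t for t in v), 1e-12)
    return mean, math.sqrt(var)

def expected_improvement(mean, std, best):
    # Expected improvement for maximization.
    z = (mean - best) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mean - best) * cdf + std * pdf

def run_search(objective, n_iter=10):
    candidates = [(f, x) for f in FAMILIES for x in GRID]
    # Seed with one evaluation per family at the middle of the grid.
    observed = {}
    for f in FAMILIES:
        c = (f, GRID[len(GRID) // 2])
        observed[c] = objective(*c)
    for _ in range(n_iter):
        X = [encode(f, x) for f, x in observed]
        values = list(observed.values())
        offset = sum(values) / len(values)
        y = [v - offset for v in values]
        L, alpha = fit_gp(X, y)
        best = max(y)
        remaining = [c for c in candidates if c not in observed]
        # Evaluate the unobserved candidate with the highest expected improvement.
        pick = max(remaining, key=lambda c: expected_improvement(
            *predict(X, L, alpha, encode(*c)), best))
        observed[pick] = objective(*pick)
    return observed

if __name__ == "__main__":
    results = run_search(toy_property, n_iter=10)
    best_candidate = max(results, key=results.get)
    print(best_candidate, results[best_candidate])
```

One-hot encoding treats all families as equally dissimilar, which is a deliberate simplification here; a model that learns similarities between categories would be expected to use scarce data more efficiently.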
We then provide an example of applying this materials design framework to metal–insulator transition (MIT) materials, a specific type of emerging material with practical importance in next-generation memory technologies. We identify multiple new materials that may display this property in the lacunar spinel and Ruddlesden–Popper perovskite families and propose pathways for their synthesis. The classifiers used to identify possible new MIT materials also identified previously unknown features that may inform a predictive theory for this class of materials. For example, we have identified descriptors derived from ionicity and atom sizes as indicators of MIT behavior.
Finally, we identify some outstanding challenges in data-driven materials design, such as material data quality issues, property–performance mismatch, and validation and deployment. We seek to raise awareness of these overlooked issues hindering material design, thus stimulating efforts toward developing methods to mitigate the gaps.