James R. Damewood, Jessica Karaguesian, Jaclyn R. Lunger, Aik Rui Tan, M. Xie, Jiayu Peng, Rafael Gómez-Bombarelli
Title: Representations of Materials for Machine Learning
Journal: Annual Review of Materials Research (impact factor 10.6; JCR Q1, Materials Science, Multidisciplinary)
Publication type: Journal Article
Publication date: 2023-01-20
DOI: 10.1146/annurev-matsci-080921-085947 (https://doi.org/10.1146/annurev-matsci-080921-085947)
Citations: 8
Abstract
High-throughput data generation methods and machine learning (ML) algorithms have given rise to a new era of computational materials science by learning the relations between composition, structure, and properties and by exploiting such relations for design. However, to build these connections, materials data must be translated into a numerical form, called a representation, that can be processed by an ML model. Data sets in materials science vary in format (ranging from images to spectra), size, and fidelity. Predictive models vary in scope and properties of interest. Here, we review context-dependent strategies for constructing representations that enable the use of materials as inputs or outputs for ML models. Furthermore, we discuss how modern ML techniques can learn representations from data and transfer chemical and physical information between tasks. Finally, we outline high-impact questions that have not been fully resolved and thus require further investigation. Expected final online publication date for the Annual Review of Materials Research, Volume 53 is July 2023. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
About the journal:
The Annual Review of Materials Research, published since 1971, covers significant developments in materials research, including original methodologies for studying materials, materials phenomena, material systems, and special keynote topics. The current volume has been converted from gated to open access through Annual Reviews' Subscribe to Open program, with all articles published under a CC BY license. The journal is indexed and abstracted in databases including Scopus, Science Citation Index Expanded, Civil Engineering Abstracts, INSPEC, and Academic Search.