Optical materials discovery and design via federated databases and machine learning
Victor Trinquet, Matthew Evans, Cameron Hargreaves, Pierre-Paul De Breuck, Gian-Marco Rignanese
Faraday Discussions, published 2024-07-10. DOI: https://doi.org/10.1039/d4fd00092g
Abstract
Combinatorial and guided screening of materials space with density-functional theory and related approaches has provided a wealth of hypothetical inorganic materials, which are increasingly tabulated in open databases. The OPTIMADE API is a standardised format for representing crystal structures, their measured and computed properties, and the methods for querying and filtering them from remote resources. Currently, the OPTIMADE federation spans over 20 data providers, rendering over 30 million structures accessible in this way, many of which are novel and have only recently been suggested by machine learning-based approaches. In this work, we outline our approach to non-exhaustively screen this dynamic trove of structures for the next generation of optical materials. By applying MODNet, a neural network-based model for property prediction that has been shown to perform especially well for small materials datasets, within a combined active learning and high-throughput computation framework, we isolate particular structures and chemistries that should be most fruitful for further theoretical calculations and for experimental study as high-refractive-index materials. By making explicit use of automated calculations, federated dataset curation and machine learning, and by releasing these publicly, the workflows presented here can be periodically re-assessed as new databases implement OPTIMADE, and new hypothetical materials are suggested.
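To make the querying and filtering step concrete, the following minimal sketch builds an OPTIMADE `/v1/structures` query URL without sending any request. It is an illustration only, not the authors' workflow: the base URL `https://example.org/optimade` is a placeholder, the helper name `build_structures_url` is hypothetical, and the chosen filter (Ti- and O-containing structures with at most three elements) is an assumed example rather than a query taken from the paper.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_structures_url(base_url, elements, max_nelements, page_limit=10):
    """Build an OPTIMADE /v1/structures query URL (no request is sent).

    base_url is a placeholder for a provider's OPTIMADE endpoint (assumption).
    """
    # OPTIMADE filter strings quote each element symbol, e.g. "Ti", "O".
    quoted = ", ".join(f'"{el}"' for el in elements)
    optimade_filter = f"elements HAS ALL {quoted} AND nelements<={max_nelements}"
    query = urlencode(
        {
            "filter": optimade_filter,
            # Ask only for the fields needed, to keep responses small.
            "response_fields": "chemical_formula_reduced,nelements",
            "page_limit": page_limit,
        }
    )
    return f"{base_url.rstrip('/')}/v1/structures?{query}"


# Hypothetical provider URL, used purely for illustration.
url = build_structures_url("https://example.org/optimade", ["Ti", "O"], 3)
print(url)

# Decode the query string again to show the filter that a server would receive.
params = parse_qs(urlsplit(url).query)
print(params["filter"][0])
```

Because every OPTIMADE provider accepts the same filter grammar, the same query string could in principle be sent to each database in the federation by changing only the base URL, which is what allows a screening workflow to be re-run as new providers appear.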