Machine Learning Assisted Material Discovery: A Small Data Approach
Qionghua Zhou, Xinyu Chen, Jinlan Wang
Accounts of Materials Research
DOI: 10.1021/accountsmr.1c00236 (https://doi.org/10.1021/accountsmr.1c00236)
Publication date: 2025-05-22
Citations: 0
Abstract
Data-driven methods, of which machine learning is the best-known example, are revolutionizing the way materials are discovered. Because the data-driven approach is inductive, it predicts quickly, but it also relies heavily on materials data. Machine learning has succeeded with text and images because those fields are supported by big data; materials data sets, by contrast, tend to be small. Building large materials databases is a good solution but not a permanent one: materials data cost far more to produce than text or images, and existing materials databases are still far from sufficient in size. The shortage of materials data will persist for a long time to come, which makes small data approaches necessary for machine learning based materials discovery.