Recent progress on machine learning with limited materials data: Using tools from data science and domain knowledge

Impact factor 8.4 | Zone 1, Materials Science | JCR Q1, Chemistry, Physical
Bangtan Zong, Jinshan Li, Tinghuan Yuan, Jun Wang, Ruihao Yuan
Journal of Materiomics, vol. 11, no. 3, Article 100916. Published 2024-07-26.
DOI: 10.1016/j.jmat.2024.07.002
URL: https://www.sciencedirect.com/science/article/pii/S2352847824001552
Citations: 0

Abstract

A key challenge in materials informatics is how to use small materials datasets effectively to search for desired materials in a huge, unexplored materials space. We review recent progress in using tools from data science and domain knowledge to mitigate the issues that arise from limited materials data. We first summarize and discuss how data augmentation and feature engineering improve data quality and quantity. We then survey strategies that use ensemble models and transfer learning to improve machine learning models. Next, we turn to active learning, with emphasis on uncertainty quantification and evaluation. We then stress the merits of combining domain knowledge with machine learning. Finally, we discuss some applications of large language models in materials science. We conclude by posing the challenges and opportunities in machine learning for small materials data.
Source journal
Journal of Materiomics (Materials Science: Metals and Alloys)
CiteScore: 14.30
Self-citation rate: 6.40%
Articles published: 331
Review time: 37 days
About the journal: The Journal of Materiomics is a peer-reviewed open-access journal that aims to serve as a forum for the continuous dissemination of research within the field of materials science. It particularly emphasizes systematic studies on the relationships between composition, processing, structure, property, and performance of advanced materials. The journal is supported by the Chinese Ceramic Society and is indexed in SCIE and Scopus. It is commonly referred to as J Materiomics.