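Integrating Materials Representations Into Feature Engineering in Machine Learning for Crystalline Materials: From Local to Global Chemistry-Structure Information Coupling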

Impact factor: 27 | Zone 2 (Chemistry) | JCR Q1: CHEMISTRY, MULTIDISCIPLINARY
Bin Xiao, Yuchao Tang, Yi Liu
Wiley Interdisciplinary Reviews: Computational Molecular Science, vol. 15, no. 4. Journal Article. Published 2025-08-11.
DOI: 10.1002/wcms.70044
https://wires.onlinelibrary.wiley.com/doi/10.1002/wcms.70044
Citations: 0

Abstract



Integrating materials representations into feature engineering by rational design plays a critical role in determining the capability and accuracy of material property prediction via machine learning (ML). There still exists a lack of comprehensive classification and multi-dimensional evaluation for many existing feature models that could guide model selection in applications and further development. This review systematically classifies feature construction methods for crystalline structures, emphasizing the coupling between chemical and structural information. We systematically discuss the geometric configurations, chemical attributes, and their intricate coupling mechanisms that can be leveraged for feature engineering. Furthermore, a comprehensive comparison is performed across multiple aspects including graph network representation, structural information embedding, chemistry-structure information coupling, local versus global characteristics, long-range versus short-range description, algorithm compatibility with kernel function method or deep neural network, data size requirements, computational complexity, and interpretability mechanisms, thereby highlighting key variations in existing feature models and improving the physical interpretability of predictive models. To illustrate the integration of multi-dimensional characteristics, the center-environment (CE) feature model is introduced based on the coupling between local chemical and structural information of physical core-shell structures. Within the CE model, the pre-attention mechanism reorients focus from intricate details within complex ML algorithms to explicit feature models that depict physical core-shell configurations. By minimizing data requirements while enhancing transparency in ML models, the CE feature provides a practical approach for developing efficient and accurate ML-based predictions tailored for small-data scenarios in materials science.

This article is categorized under:

  • Structure and Mechanism > Computational Materials Science
  • Data Science > Artificial Intelligence/Machine Learning

Source journal
Wiley Interdisciplinary Reviews: Computational Molecular Science
Categories: CHEMISTRY, MULTIDISCIPLINARY; MATHEMATICAL & COMPUTATIONAL BIOLOGY
CiteScore: 28.90
Self-citation rate: 1.80%
Articles published: 52
Review time: 6-12 weeks
Journal description: Computational molecular sciences harness the power of rigorous chemical and physical theories, employing computer-based modeling, specialized hardware, software development, algorithm design, and database management to explore and illuminate every facet of molecular sciences. These interdisciplinary approaches form a bridge between chemistry, biology, and materials sciences, establishing connections with adjacent application-driven fields in both chemistry and biology. WIREs Computational Molecular Science stands as a platform to comprehensively review and spotlight research from these dynamic and interconnected fields.