Impact of Domain Knowledge on the Property Prediction of Specialized Machine Learning Models
Authors: Lin Wang, Tanjin He and Bin Ouyang*
Journal: ACS Materials Letters, volume 7, issue 8, pages 2708–2715
Publication date: 2025-06-30
Publication type: Journal Article
DOI: 10.1021/acsmaterialslett.5c00726
URL: https://pubs.acs.org/doi/10.1021/acsmaterialslett.5c00726
JCR: Q1 (Materials Science, Multidisciplinary)
Developing transferable machine learning models is a growing trend in data-driven materials research. However, how to apply such models to a specific research domain remains unclear. In this work, we choose high-entropy materials as a platform, with a specialized data set containing 145,323 DFT-relaxed materials. This data set is used to explore the role of domain-specific knowledge in training effective models. Our tests with three representative graph neural network architectures indicate that model complexity has a much smaller influence on performance than the data itself. Specifically, the consideration of low-energy atomic ordering, structures with diverse elemental coverage, and high-order interactions significantly influences model performance. We also find that domain knowledge-driven sampling can greatly enhance unsupervised learning techniques. This research highlights that developing specialized data sets is more beneficial than further complicating deep learning architectures. Additionally, physics-inspired sampling algorithms are critically needed to build better machine learning models for a specific materials research domain.
About the journal:
ACS Materials Letters is a journal that publishes high-quality and urgent papers at the forefront of fundamental and applied research in the field of materials science. It aims to bridge the gap between materials and other disciplines such as chemistry, engineering, and biology. The journal encourages multidisciplinary and innovative research that addresses global challenges. Papers submitted to ACS Materials Letters should clearly demonstrate the need for rapid disclosure of key results. The journal is interested in various areas including the design, synthesis, characterization, and evaluation of emerging materials, understanding the relationships between structure, property, and performance, as well as developing materials for applications in energy, environment, biomedical, electronics, and catalysis. The journal has a 2-year impact factor of 11.4 and is dedicated to publishing transformative materials research with fast processing times. The editors and staff of ACS Materials Letters actively participate in major scientific conferences and engage closely with readers and authors. The journal also maintains an active presence on social media to provide authors with greater visibility.