Improving materials property predictions for graph neural networks with minimal feature engineering

IF 6.3 2区物理与天体物理 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Machine Learning Science and Technology Pub Date : 2023-08-11 DOI:10.1088/2632-2153/acefab

Guoing Cong, Victor Fung

{"title":"Improving materials property predictions for graph neural networks with minimal feature engineering","authors":"Guoing Cong, Victor Fung","doi":"10.1088/2632-2153/acefab","DOIUrl":null,"url":null,"abstract":"Graph neural networks (GNNs) have been employed in materials research to predict physical and functional properties, and have achieved superior performance in several application domains over prior machine learning approaches. Recent studies incorporate features of increasing complexity such as Gaussian radial functions, plane wave functions, and angular terms to augment the neural network models, with the expectation that these features are critical for achieving a high performance. Here, we propose a GNN that adopts edge convolution where hidden edge features evolve during training and extensive attention mechanisms, and operates on simple graphs with atoms as nodes and distances between them as edges. As a result, the same model can be used for very different tasks as no other domain-specific features are used. With a model that uses no feature engineering, we achieve performance comparable with state-of-the-art models with elaborate features for formation energy and band gap prediction with standard benchmarks; we achieve even better performance when the dataset size increases. Although some domain-specific datasets still require hand-crafted features to achieve state-of-the-art results, our selected architecture choices greatly reduce the need for elaborate feature engineering and still maintain predictive power in comparison.","PeriodicalId":33757,"journal":{"name":"Machine Learning Science and Technology","volume":" ","pages":""},"PeriodicalIF":6.3000,"publicationDate":"2023-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Machine Learning Science and Technology","FirstCategoryId":"101","ListUrlMain":"https://doi.org/10.1088/2632-2153/acefab","RegionNum":2,"RegionCategory":"物理与天体物理","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

Abstract

Graph neural networks (GNNs) have been employed in materials research to predict physical and functional properties, and have achieved superior performance in several application domains over prior machine learning approaches. Recent studies incorporate features of increasing complexity such as Gaussian radial functions, plane wave functions, and angular terms to augment the neural network models, with the expectation that these features are critical for achieving a high performance. Here, we propose a GNN that adopts edge convolution where hidden edge features evolve during training and extensive attention mechanisms, and operates on simple graphs with atoms as nodes and distances between them as edges. As a result, the same model can be used for very different tasks as no other domain-specific features are used. With a model that uses no feature engineering, we achieve performance comparable with state-of-the-art models with elaborate features for formation energy and band gap prediction with standard benchmarks; we achieve even better performance when the dataset size increases. Although some domain-specific datasets still require hand-crafted features to achieve state-of-the-art results, our selected architecture choices greatly reduce the need for elaborate feature engineering and still maintain predictive power in comparison.

查看原文本刊更多论文

用最小特征工程改进图神经网络材料性能预测

图神经网络（GNN）已被用于材料研究，以预测物理和功能特性，并在几个应用领域取得了优于现有机器学习方法的性能。最近的研究结合了越来越复杂的特征，如高斯径向函数、平面波函数和角度项，以增强神经网络模型，期望这些特征对实现高性能至关重要。在这里，我们提出了一种GNN，它采用边缘卷积，其中隐藏的边缘特征在训练和广泛注意力机制中进化，并在以原子为节点、以原子之间的距离为边的简单图上进行操作。因此，相同的模型可以用于非常不同的任务，因为没有使用其他领域特定的功能。使用不使用特征工程的模型，我们在地层能量和带隙预测方面实现了与具有精细特征的最先进模型相媲美的性能，并使用标准基准进行了预测；当数据集大小增加时，我们可以获得更好的性能。尽管一些特定领域的数据集仍然需要手工制作的功能来实现最先进的结果，但我们选择的架构大大减少了对精心设计的功能工程的需求，并且相比之下仍然保持了预测能力。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Machine Learning Science and Technology Computer Science-Artificial Intelligence

CiteScore

9.10

自引率

4.40%

发文量

审稿时长

5 weeks

期刊介绍： Machine Learning Science and Technology is a multidisciplinary open access journal that bridges the application of machine learning across the sciences with advances in machine learning methods and theory as motivated by physical insights. Specifically, articles must fall into one of the following categories: advance the state of machine learning-driven applications in the sciences or make conceptual, methodological or theoretical advances in machine learning with applications to, inspiration from, or motivated by scientific problems.