Improving materials property predictions for graph neural networks with minimal feature engineering
Authors: Guojing Cong, Victor Fung
Journal: Machine Learning Science and Technology (impact factor 6.3; JCR Q1, Computer Science, Artificial Intelligence)
Published: 2023-08-11
DOI: https://doi.org/10.1088/2632-2153/acefab
Citation count: 0
Abstract
Graph neural networks (GNNs) have been used in materials research to predict physical and functional properties, and in several application domains they outperform earlier machine learning approaches. Recent studies augment these models with increasingly complex features, such as Gaussian radial functions, plane wave functions, and angular terms, on the expectation that such features are critical for high performance. Here, we propose a GNN that combines extensive attention mechanisms with edge convolution, in which hidden edge features evolve during training, and that operates on simple graphs with atoms as nodes and interatomic distances as edges. Because no other domain-specific features are used, the same model can be applied to very different tasks. With no feature engineering, the model performs comparably to state-of-the-art models that rely on elaborate features for formation energy and band gap prediction on standard benchmarks, and it performs even better as the dataset size increases. Although some domain-specific datasets still require hand-crafted features to reach state-of-the-art results, our architecture choices greatly reduce the need for elaborate feature engineering while maintaining competitive predictive power.
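The "simple graph" input described in the abstract (atoms as nodes, distances as edges) can be illustrated with a minimal sketch. This is not the authors' code: the function name `build_distance_graph`, the `cutoff` parameter, and the toy coordinates are assumptions made for illustration, and periodic boundary conditions, which real crystal datasets would need, are left out.

```python
import math
from itertools import combinations


def build_distance_graph(atomic_numbers, positions, cutoff=5.0):
    """Build a graph with atoms as nodes and interatomic distances as edges.

    Hypothetical helper for illustration only. `cutoff` (in the same length
    unit as `positions`) is an assumed neighbor threshold; periodic images
    are not considered.
    """
    if len(atomic_numbers) != len(positions):
        raise ValueError("one position is required per atom")

    # The only node input is the element identity.
    nodes = list(atomic_numbers)

    # The only edge input is the scalar distance between the two atoms.
    edges = []
    for i, j in combinations(range(len(nodes)), 2):
        d = math.dist(positions[i], positions[j])
        if d <= cutoff:
            # Store both directions so messages can flow either way.
            edges.append((i, j, d))
            edges.append((j, i, d))
    return nodes, edges


# Toy three-atom example (illustrative coordinates, in angstroms).
nodes, edges = build_distance_graph(
    [8, 1, 1],
    [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)],
    cutoff=1.2,
)
for i, j, d in edges:
    print(f"{i} -> {j}: {d:.3f}")
```

In a model of the kind the abstract describes, each scalar distance would be the starting point for a hidden edge feature that is then updated by the edge convolution and attention layers; those layers are beyond the scope of this sketch.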
About the journal:
Machine Learning Science and Technology is a multidisciplinary open access journal that bridges the application of machine learning across the sciences with advances in machine learning methods and theory as motivated by physical insights. Specifically, articles must fall into one of the following categories: advance the state of machine learning-driven applications in the sciences or make conceptual, methodological or theoretical advances in machine learning with applications to, inspiration from, or motivated by scientific problems.