Developing an improved crystal graph convolutional neural network framework for accelerated materials discovery
Cheol Woo Park, C. Wolverton
arXiv: Computational Physics, 2019-06-12
DOI: 10.1103/PHYSREVMATERIALS.4.063801 (https://doi.org/10.1103/PHYSREVMATERIALS.4.063801)
Citations: 141
Abstract
The recently proposed crystal graph convolutional neural network (CGCNN) offers a highly versatile and accurate machine learning (ML) framework by learning material properties directly from graph-like representations of crystal structures ("crystal graphs"). Here, we develop an improved variant of the CGCNN model (iCGCNN) that outperforms the original by incorporating information from the Voronoi-tessellated crystal structure, explicit 3-body correlations of neighboring constituent atoms, and an optimized chemical representation of interatomic bonds in the crystal graphs. We demonstrate the accuracy of the improved framework in two distinct illustrations. First, when trained/validated on 180,000/20,000 density functional theory (DFT) calculated thermodynamic stability entries taken from the Open Quantum Materials Database (OQMD) and evaluated on a separate test set of 230,000 entries, iCGCNN achieves a predictive accuracy 20% higher than that of the original CGCNN. Second, when used to assist a high-throughput search for materials in the ThCr2Si2 structure type, iCGCNN exhibited a success rate of 31%, which is 310 times higher than that of an undirected high-throughput search and 2.4 times higher than that of the original CGCNN. Using both CGCNN and iCGCNN, we screened 132,600 compounds with elemental decorations of the ThCr2Si2 prototype crystal structure and identified a total of 97 new unique stable compounds by performing 757 DFT calculations, reducing the computational time of the high-throughput search by a factor of 130. Our results suggest that iCGCNN can be used to accelerate high-throughput discoveries of new materials by quickly and accurately identifying crystalline compounds with properties of interest.
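The ratios quoted in the abstract imply a few baseline figures that are not stated outright. The short Python sketch below only back-calculates them from the quoted numbers; it is not the authors' analysis. The function name `implied_rates` and the dictionary keys are hypothetical. The sketch also assumes that "N times higher" means a simple ratio of success rates. The ratio of new stable compounds to DFT calculations is a different quantity from the success rate as the paper may define it, and the factor-of-130 speedup is not reproduced here because the abstract does not say how it was computed.

```python
# Figures quoted in the abstract.
ICGCNN_SUCCESS_RATE = 0.31     # success rate of the iCGCNN-assisted search
RATIO_VS_UNDIRECTED = 310      # "310 times higher than an undirected ... search"
RATIO_VS_CGCNN = 2.4           # "2.4 times higher than that of the original CGCNN"
CANDIDATES_SCREENED = 132_600  # ThCr2Si2-type compounds screened by the ML models
DFT_CALCULATIONS = 757         # DFT calculations actually performed
NEW_STABLE_COMPOUNDS = 97      # new unique stable compounds identified


def implied_rates():
    """Back-calculate rates implied by the abstract (assumes simple ratios)."""
    return {
        # Success rate of an undirected search implied by the 310x ratio.
        "undirected": ICGCNN_SUCCESS_RATE / RATIO_VS_UNDIRECTED,
        # Success rate of the original CGCNN implied by the 2.4x ratio.
        "cgcnn": ICGCNN_SUCCESS_RATE / RATIO_VS_CGCNN,
        # New stable compounds per DFT calculation, both models combined.
        "new_per_dft": NEW_STABLE_COMPOUNDS / DFT_CALCULATIONS,
        # Fraction of screened candidates that went on to a DFT calculation.
        "fraction_dft": DFT_CALCULATIONS / CANDIDATES_SCREENED,
    }


if __name__ == "__main__":
    for name, value in implied_rates().items():
        print(f"{name}: {value:.4%}")
```

Under these assumptions, the implied undirected success rate is about 0.1% and the implied CGCNN success rate is about 13%. Fewer than 1% of the screened candidates required a DFT calculation.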