Title: Leveraging Persistent Homology Features for Accurate Defect Formation Energy Predictions via Graph Neural Networks
Authors: Zhenyao Fang, Qimin Yan
Journal: Chemistry of Materials
Publication date: 2025-02-06
DOI: https://doi.org/10.1021/acs.chemmater.4c03028
Citation count: 0
Abstract
In machine-learning-assisted high-throughput defect studies, a defect-aware latent representation of the supercell structure is crucial for accurately predicting defect properties. Current graph neural network (GNN) models are limited in performance for two reasons: defect properties depend strongly on the local atomic configurations near the defect sites, and GNNs suffer from the oversmoothing problem. Herein, we demonstrate that persistent homology features, which encode topological information about the local chemical environment around each atomic site, can characterize the structural information of defects. Using a dataset that covers a wide spectrum of O-based perovskites with all available vacancies as an example, we show that incorporating persistent homology features, along with proper choices of graph pooling operations, significantly increases the prediction accuracy, reducing the mean absolute error (MAE) by 55%. These features can be easily integrated into state-of-the-art GNN models, including the graph Transformer network and the equivariant neural network, and consistently improve their performance. Our model also overcomes the convergence issue with respect to supercell size that was present in previous GNN models. Furthermore, using datasets of defective BaTiO3 with multiple substitutions and multiple vacancies as examples, we show that our GNN model can also predict defect–defect interactions accurately. These results suggest that persistent homology features can effectively improve the performance of machine learning models and assist the accelerated discovery of functional defects for technological applications.
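To make the idea of per-site persistent homology features concrete, the following is a minimal sketch and not the authors' featurization. It assumes a finite cluster of Cartesian coordinates with no periodic boundary conditions, uses only 0-dimensional homology of a Vietoris-Rips filtration (where every component is born at 0 and dies at a minimum-spanning-tree edge length), and summarizes the death times with three hypothetical statistics. The function names `h0_death_times` and `local_ph_features` and the `cutoff` parameter are illustrative choices, not taken from the paper.

```python
import itertools
import math


def h0_death_times(points):
    """Death times of 0-dimensional classes in a Vietoris-Rips filtration.

    Each point starts as its own component (born at 0). Processing edges in
    order of increasing length, a component dies whenever an edge merges two
    distinct components (Kruskal's algorithm with union-find).
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Find the component root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(n), 2)
    )
    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            deaths.append(length)
    return deaths  # n - 1 finite values; the last component never dies


def local_ph_features(positions, site, cutoff):
    """Hypothetical per-site feature vector: (count, mean, max) of H0 deaths
    for the atoms within `cutoff` of the atom at index `site`."""
    center = positions[site]
    neighborhood = [p for p in positions if math.dist(p, center) <= cutoff]
    deaths = h0_death_times(neighborhood)
    if not deaths:
        return (0, 0.0, 0.0)
    return (len(deaths), sum(deaths) / len(deaths), max(deaths))


# Toy example: a 3x3x3 simple-cubic cluster, with and without one vacancy.
pristine = [(x, y, z) for x in range(3) for y in range(3) for z in range(3)]
defective = [p for p in pristine if p != (0, 1, 1)]  # remove one neighbor

print(local_ph_features(pristine, pristine.index((1, 1, 1)), cutoff=1.1))
print(local_ph_features(defective, defective.index((1, 1, 1)), cutoff=1.1))
```

In this toy case the feature vector of the central site changes when a neighboring atom is removed, which is the kind of local, defect-sensitive signal the abstract describes. A practical featurization would likely also use higher-dimensional homology, periodic images, and element-specific filtrations, typically through a dedicated topological data analysis library.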
About the journal:
The journal Chemistry of Materials focuses on publishing original research at the intersection of materials science and chemistry. The studies published in the journal involve chemistry as a prominent component and explore topics such as the design, synthesis, characterization, processing, understanding, and application of functional or potentially functional materials. The journal covers various areas of interest, including inorganic and organic solid-state chemistry, nanomaterials, biomaterials, thin films and polymers, and composite/hybrid materials. The journal particularly seeks papers that highlight the creation or development of innovative materials with novel optical, electrical, magnetic, catalytic, or mechanical properties. It is essential that manuscripts on these topics have a primary focus on the chemistry of materials and represent a significant advancement compared to prior research. Before external reviews are sought, submitted manuscripts undergo a review process by a minimum of two editors to ensure their appropriateness for the journal and the presence of sufficient evidence of a significant advance that will be of broad interest to the materials chemistry community.