Gopal R. Iyer, Shashikant Kumar, Edgar Josué Landinez Borda, Babak Sadigh, Sebastien Hamel, Vasily Bulatov, Vincenzo Lordi, Amit Samanta
Computational Materials Science, vol. 246, Article 113433. Published 2024-10-15. DOI: 10.1016/j.commatsci.2024.113433. Impact factor: 3.1 (JCR Q2, Materials Science, Multidisciplinary). Source: https://www.sciencedirect.com/science/article/pii/S0927025624006542
Interpretable, extensible linear and symbolic regression models for charge density prediction using a hierarchy of many-body correlation descriptors
Density functional theory (DFT) is routinely used to make electronic structure predictions for high-throughput screening of materials and molecules in technologically relevant areas, such as the identification of better catalysts, electronic materials, and drug discovery. However, the DFT formalism is limited by (a) its poor (quadratic-to-quartic) scaling, and (b) the need to perform repeated eigenvalue computations of the electronic Hamiltonian as part of its self-consistent field (SCF) iteration procedure to obtain the converged ground state electron density, ρ(r). Approaches that directly predict ρ(r) of a structure with high accuracy can accelerate conventional SCF calculations and can also be used in linearly scaling methods such as orbital-free DFT. To this end, we present a procedure to predict the ground state electron density of molecular and periodic three-dimensional systems directly from the atomic structure, with a particular emphasis on physical interpretability. In our framework, ρ(r) is modeled using many-body correlation descriptors that accurately capture the effects of local atomic arrangements in the neighborhood of a grid point. Our use of a linear regression scheme to fit charge density data enables transparent analysis of the relative contributions of various types of local atomic correlations. By systematically including increasingly complex correlations, our model is shown to accurately predict ρ(r) for a variety of chemically and electronically diverse systems: amorphous Ge, an Al(001) slab, crystalline Ga₂O₃, molecular benzene, and polyethylene. We then demonstrate a symbolic regression-based protocol to construct easily computable, interpretable features from lower-order correlations; these features significantly improve our electron density predictions with effectively no increase in computational cost.
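The idea of fitting ρ(r) linearly on a hierarchy of local correlation descriptors can be illustrated with a small sketch. The abstract does not give the functional form of the descriptors, so everything below is an assumption made for illustration: Gaussian radial weights, the widths, the cosine-based three-body term, the helper names `two_body`, `three_body` and `fit_rmse`, and the synthetic "density" used as a stand-in for DFT data. Periodic boundary conditions are ignored. It is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def two_body(grid, atoms, widths):
    """Hypothetical two-body features: sums of Gaussians of grid-to-atom distances."""
    # Distances from every grid point to every atom: shape (n_grid, n_atoms).
    d = np.linalg.norm(grid[:, None, :] - atoms[None, :, :], axis=-1)
    # One feature per width.
    return np.stack([np.exp(-(d / w) ** 2).sum(axis=1) for w in widths], axis=1)

def three_body(grid, atoms, widths):
    """Hypothetical three-body features: Gaussian-weighted cosines of the angle
    subtended at the grid point by each distinct pair of atoms."""
    vec = atoms[None, :, :] - grid[:, None, :]        # (n_grid, n_atoms, 3)
    d = np.linalg.norm(vec, axis=-1)                  # (n_grid, n_atoms)
    unit = vec / (d[..., None] + 1e-12)               # guard against zero distance
    cos = np.einsum("gik,gjk->gij", unit, unit)       # (n_grid, n_atoms, n_atoms)
    feats = []
    for w in widths:
        wgt = np.exp(-(d / w) ** 2)
        pair = wgt[:, :, None] * wgt[:, None, :] * cos
        diag = np.einsum("gii->g", pair)              # i == j self terms
        # Sum over distinct pairs i < j (full sum minus diagonal, halved).
        feats.append(0.5 * (pair.sum(axis=(1, 2)) - diag))
    return np.stack(feats, axis=1)

def fit_rmse(X, y):
    """Ordinary least squares with an intercept; returns coefficients and training RMSE."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef, float(np.sqrt(np.mean((A @ coef - y) ** 2)))

# Toy, non-periodic cluster and random grid points (assumed geometry).
atoms = rng.uniform(0.0, 4.0, size=(8, 3))
grid = rng.uniform(0.0, 4.0, size=(500, 3))
widths = [0.5, 1.0, 2.0]

# Synthetic stand-in for the density: a sum of exponentials centered on atoms.
dist = np.linalg.norm(grid[:, None, :] - atoms[None, :, :], axis=-1)
rho = np.exp(-dist / 0.6).sum(axis=1)

X2 = two_body(grid, atoms, widths)
X3 = three_body(grid, atoms, widths)

coef2, rmse2 = fit_rmse(X2, rho)                      # two-body only
coef23, rmse23 = fit_rmse(np.hstack([X2, X3]), rho)   # two- plus three-body

print(f"training RMSE, two-body only:     {rmse2:.4e}")
print(f"training RMSE, two- + three-body: {rmse23:.4e}")
print("coefficients (intercept first):", np.round(coef23, 4))
```

Because the model is linear, each fitted coefficient can be read as the weight of one type of correlation, which is the kind of transparency the abstract emphasizes. The errors printed here are training errors on synthetic data; adding columns can never increase them, so a realistic assessment would need held-out grid points and real charge density data.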
About the journal:
The goal of Computational Materials Science is to report on results that provide new or unique insights into, or significantly expand our understanding of, the properties of materials or phenomena associated with their design, synthesis, processing, characterization, and utilization. To be relevant to the journal, the results should be applied or applicable to specific material systems that are discussed within the submission.