Authors: Xinxu Zhang, Jiahao Wei, Hui Jia, Jiamin Liu, Guo Li, Ling Liu, Yulong Wu, Changlong Liu, Xiao-Dong Zhang, Yonghui Li
Journal: Journal of Chemical Theory and Computation
Publication date: 2024-09-16
DOI: https://doi.org/10.1021/acs.jctc.4c00557
Pursuing Extreme Descriptive Power and Generality in Chemical Bond Theories: A Method to Decipher “Interatomic Genomes” from Interatomic Electron Structures
The description and analysis of chemical bonds have remained difficult since the popularization of electronic structure calculations. Although many attempts have been made from the perspective of electronic structure, the sheer volume of information in the electronic structure has left contemporary chemical bond analysis methods grappling with an inescapable “Trilemma” in which model brevity, generality, and descriptiveness (descriptive power) cannot be achieved simultaneously. To push generality and descriptiveness to their extremes, a general machine learning-based framework is introduced here to compact chemical bonds into a detailed residue-by-residue “genome” with matched encoding/decoding tools. The framework fuses quantum mechanical aspects, automatic feature extraction, nanostructures and/or simulations, and generative models. The encoded genomes are information-dense and decodable, with 100% generality guaranteed. The descriptiveness of genomes appears to be broader than that of most known models. As a proof of concept, the realization presented in this work compacts the complete information regarding two critical chemical bonds in thiolate-protected gold nanoclusters, the S–Au and Au–Au bonds, from a bosonic-fermionic character perspective into 8-valued genomes. The machine learning component is trained on 26,528 electron localization function images simulated with density functional theory. Exploring the space spanned by the genomes reveals bond polarization, hybridization, intrusion of other atoms, alignments, crystal orientation, atomic motions, and further details. Extensive generation tests further indicate that molecules and solids can be integrated in a more concise manner than is typically achieved with purely geometric representations. To showcase the intraclass complexity of S–Au and Au–Au bonds visually, a roadmap is plotted by summarizing and correlating the similarities of 8-valued genomes. Furthermore, genomes can easily be associated with realistic indices using a simple multilayer perceptron as a calculating tool. Three sets of applications (chemisorption, molecular dynamics analysis, and ultrafast processes) showcase the interpretive potential of interatomic genomes for the geometric structures, kinetic properties, and vibrational characteristics of molecular systems. Because the framework rose to the challenge of nanoclusters, a complicated mesoscopic family of materials, the displayed generality and comprehensiveness indicate that the model may “understand” chemical bonds in a machine’s way.
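The abstract does not say which similarity measure underlies the roadmap of genomes. As a minimal sketch only, assuming cosine similarity between 8-valued vectors, the comparison step could look like the following. The function names and the example genome values are hypothetical placeholders, not taken from the paper.

```python
import math

GENOME_LENGTH = 8  # the abstract describes 8-valued genomes


def cosine_similarity(a, b):
    """Cosine similarity between two genomes (an assumed metric, not the paper's)."""
    if len(a) != GENOME_LENGTH or len(b) != GENOME_LENGTH:
        raise ValueError("each genome must have exactly 8 values")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for an all-zero genome")
    return dot / (norm_a * norm_b)


def similarity_matrix(genomes):
    """Pairwise similarity table; entry [i][j] compares genome i with genome j."""
    return [[cosine_similarity(g, h) for h in genomes] for g in genomes]


def nearest_neighbor(index, genomes):
    """Index of the genome most similar to genomes[index], excluding itself."""
    scores = similarity_matrix(genomes)[index]
    candidates = [j for j in range(len(genomes)) if j != index]
    return max(candidates, key=lambda j: scores[j])


# Placeholder genomes: invented numbers for illustration only, not values from the paper.
example_genomes = [
    [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.9, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0],
]
```

Calling `nearest_neighbor(0, example_genomes)` picks the second placeholder genome, since it points in nearly the same direction as the first. A distance-based measure such as Euclidean distance would be an equally plausible assumption here.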
About the journal:
The Journal of Chemical Theory and Computation invites new and original contributions with the understanding that, if accepted, they will not be published elsewhere. Papers reporting new theories, methodology, and/or important applications in quantum electronic structure, molecular dynamics, and statistical mechanics are appropriate for submission to this Journal. Specific topics include advances in or applications of ab initio quantum mechanics, density functional theory, design and properties of new materials, surface science, Monte Carlo simulations, solvation models, QM/MM calculations, biomolecular structure prediction, and molecular dynamics in the broadest sense including gas-phase dynamics, ab initio dynamics, biomolecular dynamics, and protein folding. The Journal does not consider papers that are straightforward applications of known methods including DFT and molecular dynamics. The Journal favors submissions that include advances in theory or methodology with applications to compelling problems.