Tianqing Hu, Zihan Zou, Bo Li, Tong Zhu, Shaonan Gu*, Jun Jiang*, Yi Luo* and Wei Hu*
Journal of the American Chemical Society, 147 (31), 27525–27536. Published 2025-07-23. DOI: 10.1021/jacs.5c05010. URL: https://pubs.acs.org/doi/10.1021/jacs.5c05010
Deep Learning for Bidirectional Translation between Molecular Structures and Vibrational Spectra
Two deep learning models, TranSpec and SpecGNN, were developed to establish a bidirectional mapping between molecular vibrational spectra and simplified molecular input line entry system (SMILES) representations, akin to a “translation” between the language of spectra and the language of molecular structures. Initially, TranSpec achieved accuracy rates of 55 and 63% for quantum chemistry (QC)-calculated IR and Raman spectral data sets, respectively, but its performance dropped to 11% for the NIST experimental IR data set. To address this, we combined IR and Raman spectra as input; augmented the data set; employed model fusion, transfer learning, and multisource learning; applied molecular mass filtering; and leveraged SpecGNN for spectral simulation and candidate reordering. These improvements boosted TranSpec’s accuracy to 53.6% for the experimental IR data set. Notably, SpecGNN outperformed traditional QC methods in terms of both spectral accuracy and computational efficiency. Finally, we demonstrated TranSpec’s ability to recognize functional groups and distinguish isomers or homologues. Together, TranSpec and SpecGNN models provide an efficient and accurate AI-driven framework for interpreting molecular structures and spectra, advancing applications in spectroscopy and cheminformatics.
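The molecular mass filtering and candidate reordering steps mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Candidate` class, the `filter_and_rerank` function, the mass tolerance, and the use of cosine similarity as the ranking score are all assumptions made for illustration, and the precomputed `simulated` spectra stand in for whatever a forward model such as SpecGNN would predict. The spectra below are toy numbers, not real data.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Candidate:
    """Hypothetical container for one structure proposed from a spectrum."""
    smiles: str                 # candidate SMILES string
    mass: float                 # molecular mass in Da (assumed precomputed)
    simulated: Sequence[float]  # simulated spectrum on the same grid as the observed one


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equally sampled spectra (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_and_rerank(
    candidates: List[Candidate],
    observed: Sequence[float],
    known_mass: float,
    mass_tol: float = 0.5,  # assumed tolerance in Da, not taken from the paper
) -> List[Candidate]:
    """Drop candidates whose mass is off, then sort the rest by spectral match."""
    kept = [c for c in candidates if abs(c.mass - known_mass) <= mass_tol]
    return sorted(
        kept,
        key=lambda c: cosine_similarity(c.simulated, observed),
        reverse=True,
    )


# Toy example: three candidates, one with the wrong mass.
observed = [0.0, 1.0, 0.2, 0.0]
candidates = [
    Candidate("COC", 46.07, [0.5, 0.1, 0.0, 0.9]),
    Candidate("CCCO", 60.10, [0.0, 1.0, 0.2, 0.0]),
    Candidate("CCO", 46.07, [0.0, 0.9, 0.3, 0.0]),
]
ranked = filter_and_rerank(candidates, observed, known_mass=46.07)
print([c.smiles for c in ranked])
```

In this sketch the mass filter runs first because it is cheap and removes candidates (here `CCCO`) regardless of how well their spectrum matches; the similarity score then only has to separate isomers of the same mass.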
About the journal:
The Journal of the American Chemical Society (JACS), established in 1879, is the flagship journal of the American Chemical Society and holds a preeminent position in chemistry and related interdisciplinary sciences. It publishes cutting-edge research across a wide range of topics, amounting to approximately 19,000 pages of Articles, Communications, and Perspectives each year. JACS is published weekly.