Xiaochen Yang, Xun Zhang, Yujin Zhang, Jun Jiang, Wei Hu
Title: Deep Learning Protocol for Predicting Full-Spectrum Infrared and Raman Spectra of Polypeptides and Proteins Using All-Atom Models
Journal: The Journal of Physical Chemistry Letters, pp. 2023-2028 (JCR Q2, Chemistry, Physical; impact factor 4.8)
Published: 2025-02-27 (Epub 2025-02-18)
DOI: https://doi.org/10.1021/acs.jpclett.5c00169
Citation count: 0
Abstract
Infrared (IR) spectroscopy and Raman spectroscopy are powerful tools for probing protein and peptide structures because they provide molecular fingerprints. Quantum chemistry (QC) calculation, a popular spectral simulation method, is usually hampered by high computational cost and low efficiency. In this study, we developed a comprehensive data set of IR and Raman spectra for amino acids, dipeptides, and tripeptides. Using this data set, we applied transfer learning with DetaNet (a deep equivariant tensor attention network) to simulate full-spectrum IR and Raman spectra for large polypeptides and proteins. We demonstrated that the transfer-learned DetaNet (TL-DetaNet) model successfully simulated the vibrational spectra of proteins with thousands of atoms, far exceeding the limits of traditional QC methods. Additionally, TL-DetaNet achieved an efficiency 10^3 to 10^5 times greater than that of QC methods. This work highlights the importance of data sets in machine learning and positions transfer learning as a transformative tool for large-scale biomolecular simulations, marking a substantial advancement in protein vibrational spectroscopy.
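The abstract does not say how TL-DetaNet is fine-tuned, so the following is only a generic, toy illustration of the transfer-learning idea it mentions: keep a "pretrained" feature map frozen and fit a small new output head on a small target data set. Everything here (the `BACKBONE` scales, the `features` and `fine_tune` helpers, the sine target) is hypothetical and unrelated to DetaNet's actual architecture or training procedure.

```python
# Toy transfer-learning sketch: freeze a "pretrained" feature map, fit a new head.
# All names and numbers are hypothetical assumptions; this is NOT DetaNet.
import math

# Stand-in for pretrained backbone parameters; kept frozen during fine-tuning.
BACKBONE = (0.5, 1.0, 2.0)

def features(x, backbone=BACKBONE):
    """Frozen feature map: a bias term plus tanh features at fixed scales."""
    return [1.0] + [math.tanh(s * x) for s in backbone]

def predict(head, x):
    """Linear head applied on top of the frozen features."""
    return sum(w * f for w, f in zip(head, features(x)))

def mse(head, data):
    """Mean squared error of the head on (x, y) pairs."""
    return sum((predict(head, x) - y) ** 2 for x, y in data) / len(data)

def fine_tune(head, data, lr=0.1, steps=500):
    """Gradient descent on the head weights only; the backbone is never updated."""
    head = list(head)
    n = len(data)
    for _ in range(steps):
        grad = [0.0] * len(head)
        for x, y in data:
            f = features(x)
            err = sum(w * fi for w, fi in zip(head, f)) - y
            for i, fi in enumerate(f):
                grad[i] += 2.0 * err * fi / n
        head = [w - lr * g for w, g in zip(head, grad)]
    return head

# Small target-task data set (a stand-in for the new, smaller data set).
target_data = [(x / 5.0, math.sin(x / 5.0)) for x in range(-10, 11)]

head_before = [0.0, 0.0, 0.0, 0.0]
head_after = fine_tune(head_before, target_data)
loss_before = mse(head_before, target_data)
loss_after = mse(head_after, target_data)
print(f"MSE before fine-tuning: {loss_before:.4f}, after: {loss_after:.4f}")
```

Only the four head weights are trained, which is why this kind of scheme can work with a small target data set; a real model would typically fine-tune some or all network layers with an automatic-differentiation framework.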
About the journal:
The Journal of Physical Chemistry (JPC) Letters is devoted to reporting new and original experimental and theoretical basic research of interest to physical chemists, biophysical chemists, chemical physicists, physicists, material scientists, and engineers. An important criterion for acceptance is that the paper reports a significant scientific advance and/or physical insight such that rapid publication is essential. Two issues of JPC Letters are published each month.