Title: Leveraging deep learning for accurate and automated interpretation of molecular IR and Raman spectra
Authors: Tianqing Hu, Yujin Zhang, Wei Hu
Journal: Progress in Natural Science: Materials International, Volume 35, Issue 3, Pages 505-512
Publication date: 2025-06-01
Publication type: Journal Article
DOI: 10.1016/j.pnsc.2025.02.014
URL: https://www.sciencedirect.com/science/article/pii/S1002007125000292
Impact factor: 7.1 (JCR Q2, Materials Science, Multidisciplinary)
Citations: 0
Abstract
Evaluating molecular and material properties is a fundamental aspect of science with significant practical applications. While these properties can be inferred from spectra, traditional methods require meticulous calibration and rely heavily on the expertise and intuition of researchers. In this study, we evaluated four machine learning models (convolutional neural networks (CNN), multilayer perceptrons (MLP), Transformers, and gated recurrent units (GRU)) for predicting key molecular properties: the number of atoms, HOMO-LUMO gap, dipole moment, aromaticity, and first excitation energy. Using the QM9S dataset, which includes IR and Raman spectra of 127,468 molecules, as a benchmark, we found that combining IR and Raman spectra consistently outperformed either spectrum alone in prediction accuracy. Incorporating data augmentation and model fusion further improved prediction accuracy by 2%–7% for all molecular properties. Finally, we propose the best single or fused deep learning model for each property, predicting the number of atoms, HOMO-LUMO gap, dipole moment, aromaticity, and first excitation energy with accuracies of 99%, 96%, 83%, 99%, and 97%, respectively. The present work offers an automated and precise interpretation of IR and Raman spectra, enabling accurate predictions of several critical molecular properties.
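The abstract names three ingredients (combined IR and Raman input, data augmentation, and model fusion) without specifying how each was done. The following is a minimal sketch of one plausible reading, not the authors' method: it assumes the two spectra are concatenated into one input vector, that augmentation means a small bin shift plus Gaussian noise, and that fusion means averaging the predictions of several models. All function names (`combine_spectra`, `augment`, `fuse_predictions`) are hypothetical.

```python
import random
from typing import Callable, List, Sequence

Spectrum = List[float]


def combine_spectra(ir: Sequence[float], raman: Sequence[float]) -> Spectrum:
    """Concatenate an IR and a Raman spectrum into one input vector.

    Assumption: simple concatenation; the abstract does not say how
    the two spectra were combined.
    """
    return list(ir) + list(raman)


def augment(
    spectrum: Sequence[float],
    shift: int,
    noise_scale: float,
    rng: random.Random,
) -> Spectrum:
    """Shift a spectrum by `shift` bins (zero-padded) and add Gaussian noise.

    Assumption: these are illustrative augmentations, not necessarily
    the ones used in the paper.
    """
    n = len(spectrum)
    shifted = [0.0] * n
    for i, value in enumerate(spectrum):
        j = i + shift
        if 0 <= j < n:
            shifted[j] = value
    return [v + rng.gauss(0.0, noise_scale) for v in shifted]


def fuse_predictions(
    models: Sequence[Callable[[Spectrum], float]],
    x: Spectrum,
) -> float:
    """Late fusion: average the scalar predictions of several models.

    Assumption: unweighted averaging stands in for whatever fusion
    scheme the paper actually used.
    """
    preds = [model(x) for model in models]
    return sum(preds) / len(preds)


if __name__ == "__main__":
    rng = random.Random(0)
    ir = [0.1, 0.8, 0.3]
    raman = [0.5, 0.2, 0.9]
    # Augment each spectrum separately, then combine into one input.
    x = combine_spectra(augment(ir, 1, 0.01, rng), augment(raman, 1, 0.01, rng))
    # Two toy stand-ins for trained models (e.g. a CNN and a GRU).
    toy_models = [lambda s: sum(s), lambda s: max(s)]
    print(fuse_predictions(toy_models, x))
```

In practice the toy callables would be replaced by trained networks, and the fusion weights could be tuned on a validation set rather than fixed to a plain average.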
About the journal:
Progress in Natural Science: Materials International provides scientists and engineers throughout the world with a central vehicle for the exchange and dissemination of basic theoretical studies and applied research on advanced materials. The emphasis is placed on original research, both analytical and experimental, that is of permanent interest to engineers and scientists, covering all aspects of new materials and technologies, such as energy and environmental materials; advanced structural materials; advanced transportation materials; functional and electronic materials; nano-scale and amorphous materials; health and biological materials; materials modeling and simulation; and materials characterization. The latest research achievements and innovative papers in basic theoretical studies and applied research in materials science are carefully selected and promptly reported. The aim of the journal is thus to serve the global materials science and technology community with the latest research findings.
As a service to readers, an international bibliography of recent publications in advanced materials is published bimonthly.