Using machine learning to map simulated noisy and laser-limited multidimensional spectra to molecular electronic couplings†

Impact factor: 6.2 · JCR Q1 (Chemistry, Multidisciplinary)
Jonathan D. Schultz, Kelsey A. Parker, Bashir Sbaiti and David N. Beratan
Digital Discovery, 7, pp. 1912-1924 · Published 2025-06-25 · DOI: 10.1039/D5DD00125K
https://pubs.rsc.org/en/content/articlelanding/2025/dd/d5dd00125k

Abstract

Two-dimensional electronic spectroscopy (2DES) has enabled significant discoveries in both biological and synthetic energy-transducing systems. Although deriving chemical information from 2DES is a complex task, machine learning (ML) offers exciting opportunities to translate complicated spectroscopic data into physical insight. Recent studies have found that neural networks (NNs) can map simulated multidimensional spectra to molecular-scale properties with high accuracy. However, simulations often do not capture experimental factors that influence real spectra, including noise and suboptimal pulse resonance conditions, bringing into question the experimental utility of NNs trained on simulated data. Here, we show how factors associated with experimental 2D spectral data influence the ability of NNs to map simulated 2DES spectra onto underlying intermolecular electronic couplings. By systematically introducing multisourced noise into a library of 356 000 simulated 2D spectra, we show that noise does not hamper NN performance for spectra exceeding threshold signal-to-noise ratios (SNR) of ca. 12.4, 2.5, and 5.1 if uncorrelated additive, correlated additive, or intensity-dependent noise sources dominate, respectively. In stark contrast to human-based analyses of 2DES data, we find that the NN accuracy improves significantly (ca. 84% → 96%) when the data are constrained by the bandwidth and center frequency of the pump pulses. This result is consistent with the NN learning the optical trends described by Kasha's theory of molecular excitons. Our findings convey positive prospects for adapting simulation-trained NNs to extract molecular properties from inherently imperfect experimental 2DES data. More broadly, we propose that machine-learned perspectives of nonlinear spectroscopic data may produce unique and perhaps counterintuitive guidelines for experimental design.
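
The three noise classes and the pump-pulse constraint named in the abstract can be illustrated with a minimal NumPy sketch. Everything below is an assumption made for illustration and is not the authors' procedure: SNR is taken as the peak absolute signal divided by the noise standard deviation, correlated noise is modeled as one random offset per excitation frequency shared across the detection axis, intensity-dependent noise is modeled as multiplicative, and the pump spectrum is modeled as a Gaussian envelope along axis 0. All function names are hypothetical.

```python
import numpy as np


def _scale_to_snr(spec, noise, snr):
    """Rescale noise so that max|spec| / std(noise) == snr (assumed SNR definition)."""
    return noise * (np.max(np.abs(spec)) / (snr * np.std(noise)))


def add_uncorrelated_noise(spec, snr, rng):
    """Additive white Gaussian noise, drawn independently for every pixel."""
    noise = rng.standard_normal(spec.shape)
    return spec + _scale_to_snr(spec, noise, snr)


def add_correlated_noise(spec, snr, rng):
    """Additive noise shared along the detection axis (axis 1).

    Assumption: one random offset per excitation frequency (row),
    which produces stripes across the spectrum.
    """
    row_noise = rng.standard_normal((spec.shape[0], 1))
    noise = np.broadcast_to(row_noise, spec.shape)
    return spec + _scale_to_snr(spec, noise, snr)


def add_intensity_dependent_noise(spec, snr, rng):
    """Multiplicative noise whose amplitude scales with the local signal.

    Assumption: relative fluctuation of 1/snr at every pixel, so
    regions with no signal stay noise-free.
    """
    return spec * (1.0 + rng.standard_normal(spec.shape) / snr)


def apply_pump_window(spec, pump_axis, center, fwhm):
    """Weight the excitation axis (axis 0) by a Gaussian pump spectrum."""
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    envelope = np.exp(-0.5 * ((pump_axis - center) / sigma) ** 2)
    return spec * envelope[:, None]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    axis = np.linspace(-1.0, 1.0, 65)  # arbitrary frequency units
    # Toy spectrum: a single 2D Gaussian peak standing in for a simulated 2DES map.
    toy = np.exp(-(axis[:, None] ** 2 + axis[None, :] ** 2) / 0.1)

    noisy = add_uncorrelated_noise(toy, snr=12.4, rng=rng)
    limited = apply_pump_window(noisy, axis, center=0.2, fwhm=0.5)
    print(noisy.shape, limited.shape)
```

The SNR values quoted in the abstract (ca. 12.4, 2.5, and 5.1) depend on the authors' own SNR definition, so they will not necessarily correspond to the same noise level under the definition assumed here.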

