Eemeli A Eronen, Anton Vladyka, Florent Gerbon, Christoph J Sahle, Johannes Niskanen
Journal of Physics Communications (impact factor 1.1; JCR Q3, Physics, Multidisciplinary). Published 2024-02-07. DOI: 10.1088/2399-6528/ad1f73 (https://doi.org/10.1088/2399-6528/ad1f73)
Information bottleneck in peptide conformation determination by x-ray absorption spectroscopy
We apply a recently developed machine-learning technique to the statistical analysis of computed nitrogen K-edge spectra of aqueous triglycine. This method, emulator-based component analysis, identifies the spectrally relevant structural degrees of freedom in a data set and filters out the irrelevant ones, which greatly reduces the dimensionality of the ill-posed nonlinear inverse problem of spectrum interpretation. Structural and spectral variation across the sampled phase space is notable. Using these data, we train a neural network to predict the intensities of spectral regions of interest from the structure. These regions are defined by the temperature-difference profile of the simulated spectra, and the analysis yields a structural interpretation of their behavior. Even though the local many-body tensor representation used implicitly encodes the secondary structure of the peptide, our approach proves that this information is irrecoverable from the spectra. A hard x-ray Raman scattering experiment confirms the overall sensibility of the simulated spectra, but the predicted temperature-dependent effects remain beyond the achieved statistical confidence level.
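
The idea of keeping only the spectrally relevant structural degrees of freedom can be illustrated with a toy sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the 2-D "structure" space, the hand-written `emulator` standing in for a trained neural network, the covered-variance objective, and the brute-force angle scan used in place of a proper optimizer. Structures are projected onto a candidate direction, passed through the emulator, and scored by how much of the spectral variance the projection still reproduces.

```python
import numpy as np

# Hypothetical "relevant" structural direction (30 degrees in a 2-D toy space).
W_TRUE = np.array([np.cos(np.deg2rad(30.0)), np.sin(np.deg2rad(30.0))])

def emulator(X):
    """Toy stand-in for a trained network: structure -> 3 ROI intensities.

    The output depends on the structure only through X @ W_TRUE, so one
    structural direction is spectrally relevant and the other is not.
    """
    t = X @ W_TRUE
    return np.stack([np.tanh(t), 0.5 * t, np.sin(t)], axis=1)

def covered_variance(v, X, Y):
    """Fraction of spectral variance reproduced from structures projected on v."""
    v = v / np.linalg.norm(v)
    X_proj = np.outer(X @ v, v)           # keep only the component along v
    residual = Y - emulator(X_proj)
    total = Y - Y.mean(axis=0)
    return 1.0 - np.sum(residual**2) / np.sum(total**2)

def first_component(X, Y, n_angles=180):
    """Brute-force scan over directions in 2-D (the sign of v is irrelevant)."""
    angles = np.deg2rad(np.arange(n_angles) * 180.0 / n_angles)
    candidates = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    scores = np.array([covered_variance(v, X, Y) for v in candidates])
    best = int(np.argmax(scores))
    return candidates[best], scores[best]

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 2))         # toy structural descriptors
Y = emulator(X)                           # toy "spectra" (ROI intensities)

v1, rho1 = first_component(X, Y)
print("component:", v1, "covered variance:", rho1)
```

In this toy case the scan recovers the planted direction, while the orthogonal direction reproduces essentially none of the spectral variance. That is the sense in which structural information along such a direction cannot be read back from the spectra. A realistic descriptor space has far too many dimensions for an angle scan, so a gradient-based search over unit vectors would be needed instead.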