Quantification of Protein Secondary Structures from Discrete Frequency Infrared Images Using Machine Learning
Harrison Edmonds, Sudipta S Mukherjee, Brooke Holcombe, Kevin Yeh, Rohit Bhargava, Ayanjeet Ghosh
Applied Spectroscopy, published 2025-03-31. DOI: 10.1177/00037028251325553 (https://doi.org/10.1177/00037028251325553)
Journal impact factor: 2.2 (JCR Q2, Instruments & Instrumentation). Publication type: Journal Article.
Abstract
Discrete frequency infrared (IR) imaging is an experimental technique that has shown promise in a range of biomedical applications. It typically involves acquiring IR absorptive images at specific frequencies of interest that provide pathologically relevant chemical contrast. However, certain applications, such as tracking spatial variations in the protein secondary structure of tissue specimens, which is necessary for characterizing neurodegenerative diseases, require deeper analysis of the spectral data. In such cases, the conventional approach is to band fit the hyperspectral data and extract the relative populations of different structures from their fitted areas under the curve (AUC). While Gaussian fitting of a single spectrum is viable, extending it to an image with millions of pixels, as is common for tissue specimens, becomes computationally expensive. Alternatives such as principal component analysis (PCA) are less structurally interpretable and are incompatible with sparsely sampled data. Furthermore, band fitting detracts from the key advantages of discrete frequency imaging because it requires more finely sampled spectral data that are optimal for curve fitting, resulting in significantly longer data acquisition times, larger datasets, and additional computational overhead. In this work, we demonstrate that a simple two-step regressive neural network model can mitigate these challenges and allow discrete frequency imaging to retrieve the results of band fitting without significant loss of fidelity. The model reduces data acquisition time nearly six-fold by requiring only seven wavenumbers: it first interpolates the spectral information to a higher resolution and then uses the upscaled spectra to predict the component AUCs, which is more than 3000 times faster than spectral fitting. Our approach thus drastically reduces data acquisition and analysis time and predicts key differences in protein structure, which can be vital to broadening the potential applications of discrete frequency imaging.
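To make the quantity at the center of the abstract concrete, the following is a minimal sketch of how relative populations could be derived from fitted band areas, assuming each component is a Gaussian band. The band names, amplitudes, centers, and widths are hypothetical values chosen for illustration and are not taken from the paper; the sketch covers only the conventional AUC step and does not reproduce the authors' neural network model.

```python
import math


def gaussian(x, amplitude, center, sigma):
    """Evaluate a single Gaussian band at wavenumber x (cm^-1)."""
    return amplitude * math.exp(-0.5 * ((x - center) / sigma) ** 2)


def gaussian_auc(amplitude, sigma):
    """Closed-form area under a Gaussian band: amplitude * sigma * sqrt(2*pi)."""
    return amplitude * sigma * math.sqrt(2.0 * math.pi)


def relative_populations(bands):
    """Normalize each band's AUC by the summed AUC of all bands."""
    areas = {name: gaussian_auc(a, s) for name, (a, _center, s) in bands.items()}
    total = sum(areas.values())
    return {name: area / total for name, area in areas.items()}


# Hypothetical fitted parameters for one pixel:
# (amplitude, center in cm^-1, sigma in cm^-1). Illustrative only.
bands = {
    "beta_sheet": (0.30, 1630.0, 8.0),
    "alpha_helix": (0.50, 1655.0, 10.0),
    "turn": (0.15, 1680.0, 8.0),
}

populations = relative_populations(bands)
for name, fraction in populations.items():
    print(f"{name}: {fraction:.3f}")
```

In a per-pixel band-fitting workflow, the parameters in `bands` would come from a nonlinear fit repeated for every pixel, which is the expensive step the abstract says the model replaces with a direct prediction of the component AUCs.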
About the journal:
Applied Spectroscopy is one of the world's leading spectroscopy journals, publishing high-quality peer-reviewed articles, both fundamental and applied, covering all aspects of spectroscopy. Established in 1951, the journal is owned by the Society for Applied Spectroscopy and is published monthly. The journal is dedicated to fulfilling the mission of the Society to “…advance and disseminate knowledge and information concerning the art and science of spectroscopy and other allied sciences.”