Prediction of Secondary Structure Content of Proteins Using Raman Spectroscopy and Self-Organizing Maps
Marco Pinto Corujo, Pavel Michal, Dale Ang, Lindo Vivian, Nikola Chmel, Alison Rodger
Applied Spectroscopy, published 2025-04-17. DOI: 10.1177/00037028251335051 (https://doi.org/10.1177/00037028251335051)
Abstract
Proteins are biomolecules whose characteristic three-dimensional (3D) arrangements confer on them different vital functions. Over the last 20 years, interest in biopharmaceutical proteins, especially antibodies, has grown because of their therapeutic applications. The functionality of a protein depends on the preservation of its native form, which under certain stress conditions can undergo changes at different structural levels that cause the protein to lose its activity.¹ Although mass spectrometry is a powerful technique for primary structure determination, it often fails to give information on higher-order structure. Like infrared (IR) spectra, Raman spectra are well known to contain bands (especially the amide I band at 1625-1725 cm⁻¹) that correlate with secondary structure (SS) content. However, unlike circular dichroism (CD), the most well-established technique for SS analysis, Raman spectroscopy tolerates a much wider range of optical densities, making it possible to analyze highly concentrated samples with no prior dilution. Moreover, water is a weak Raman scatterer below 3000 cm⁻¹, which gives Raman an advantage over IR for the analysis of complex aqueous pharmaceutical samples, because in IR the signal from water dominates the amide I region. The most traditional procedure for extracting information on SS content is band-fitting. However, in most cases, we found the method to be ambiguous, limited by spectral noise, and subject to the judgment of the analyst. A self-organizing map (SOM) is a type of self-learning algorithm that organizes data in a two-dimensional (2D) space based on spectral similarity and class, with no bias from the analyst and very little effect from noise. In this work, a set of spectra of proteins with known SS content was collected in both the solid and aqueous states with back-scatter Raman spectroscopy and used to train a SOM algorithm for SS prediction. The results were compared with those from partial least squares (PLS) regression, band-fitting, and X-ray data in the literature. The prediction errors observed with SOM were comparable to those from PLS and far from those obtained by band-fitting, showing Raman-SOM to be a viable alternative to the aforementioned methods.
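To make the SOM idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes synthetic amide I spectra built from three Gaussian bands whose centres (1655, 1670, 1685 cm⁻¹) and width are illustrative placeholders, and SS fractions drawn at random. All function names (`make_synthetic_spectra`, `train_som`, `label_nodes`, `predict`) and parameter values are hypothetical. Each map node is labelled with the mean SS fractions of the training spectra that land on it, and a new spectrum is assigned the label of its closest labelled node.

```python
import numpy as np

def make_synthetic_spectra(n, seed=0, noise=0.01):
    """Synthetic stand-in for measured spectra (assumption, not real data)."""
    rng = np.random.default_rng(seed)
    shift = np.linspace(1625.0, 1725.0, 101)       # amide I window, cm^-1
    centres = np.array([1655.0, 1670.0, 1685.0])   # illustrative band centres
    bands = np.exp(-0.5 * ((shift[None, :] - centres[:, None]) / 10.0) ** 2)
    fractions = rng.dirichlet(np.ones(3), size=n)  # rows sum to 1
    spectra = fractions @ bands + noise * rng.standard_normal((n, shift.size))
    return spectra, fractions

def best_matching_unit(weights, x):
    """Grid index (row, col) of the node whose weight vector is closest to x."""
    dist = np.linalg.norm(weights - x, axis=-1)
    return np.unravel_index(np.argmin(dist), dist.shape)

def train_som(spectra, grid=(6, 6), epochs=100, lr0=0.5, sigma0=3.0, seed=0):
    """Train a SOM; returns weights of shape (rows, cols, n_points)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    # Initialise node weights from randomly chosen training spectra.
    idx = rng.integers(0, len(spectra), rows * cols)
    weights = spectra[idx].reshape(rows, cols, -1).copy()
    # Grid coordinates of every node, used by the neighbourhood function.
    coords = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)               # linearly decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.5   # shrinking neighbourhood radius
        for x in spectra[rng.permutation(len(spectra))]:
            bmu = best_matching_unit(weights, x)
            d2 = np.sum((coords - np.array(bmu)) ** 2, axis=-1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood
            weights += lr * h[..., None] * (x - weights)
    return weights

def label_nodes(weights, spectra, fractions):
    """Give each node the mean SS fractions of the training spectra it wins."""
    rows, cols, _ = weights.shape
    sums = np.zeros((rows, cols, fractions.shape[1]))
    counts = np.zeros((rows, cols))
    for x, y in zip(spectra, fractions):
        r, c = best_matching_unit(weights, x)
        sums[r, c] += y
        counts[r, c] += 1
    hit = counts > 0
    labels = np.zeros_like(sums)
    labels[hit] = sums[hit] / counts[hit][:, None]
    return labels, hit

def predict(weights, labels, hit, x):
    """Predict SS fractions from the closest node that has a training label."""
    dist = np.linalg.norm(weights - x, axis=-1)
    dist[~hit] = np.inf                       # ignore unlabelled nodes
    r, c = np.unravel_index(np.argmin(dist), dist.shape)
    return labels[r, c]

if __name__ == "__main__":
    train_x, train_y = make_synthetic_spectra(60, seed=1)
    som = train_som(train_x)
    node_labels, node_hit = label_nodes(som, train_x, train_y)
    test_x, test_y = make_synthetic_spectra(10, seed=2)
    pred = np.array([predict(som, node_labels, node_hit, x) for x in test_x])
    print("mean absolute error:", np.abs(pred - test_y).mean())
```

Restricting the prediction to labelled nodes is one simple design choice; averaging over several neighbouring nodes would be an equally plausible alternative, and the abstract does not say which scheme the authors used.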
About the journal:
Applied Spectroscopy is one of the world's leading spectroscopy journals, publishing high-quality peer-reviewed articles, both fundamental and applied, covering all aspects of spectroscopy. Established in 1951, the journal is owned by the Society for Applied Spectroscopy and is published monthly. The journal is dedicated to fulfilling the mission of the Society to “…advance and disseminate knowledge and information concerning the art and science of spectroscopy and other allied sciences.”