Prediction of Secondary Structure Content of Proteins Using Raman Spectroscopy and Self-Organizing Maps.

Impact factor 2.2; Region 3 (Chemistry); JCR Q2, Instruments & Instrumentation
Marco Pinto Corujo, Pavel Michal, Dale Ang, Lindo Vivian, Nikola Chmel, Alison Rodger
Applied Spectroscopy. Published 2025-04-17. DOI: 10.1177/00037028251335051 (https://doi.org/10.1177/00037028251335051)
Citations: 0

Abstract

Proteins are biomolecules whose characteristic three-dimensional (3D) arrangements give them different vital functions. In the last 20 years, interest in biopharmaceutical proteins, especially antibodies, has grown because of their therapeutic applications. The functionality of a protein depends on the preservation of its native form, which under certain stress conditions can undergo changes at different structural levels that cause the protein to lose its activity.¹ Although mass spectrometry is a powerful technique for primary structure determination, it often fails to give information at higher-order levels. Like infrared (IR) spectra, Raman spectra are well known to contain bands (especially the amide I band at 1625–1725 cm⁻¹) that correlate with secondary structure (SS) content. However, unlike circular dichroism (CD), the most well-established technique for SS analysis, Raman spectroscopy tolerates a much wider range of optical densities, making it possible to analyze highly concentrated samples with no prior dilution. Moreover, water is a weak Raman scatterer below 3000 cm⁻¹, which gives Raman an advantage over IR for the analysis of complex aqueous pharmaceutical samples, because in IR the signal from water dominates the amide I region. The most traditional procedure for extracting information on SS content is band-fitting. However, in most cases, we found the method to be ambiguous, limited by spectral noise, and subject to the judgment of the analyst.

A self-organizing map (SOM) is a type of self-learning algorithm that organizes data in a two-dimensional (2D) space based on spectral similarity and class, with no bias from the analyst and very little effect from noise. In this work, a set of protein spectra with known SS content was collected in both the solid and aqueous states with back-scatter Raman spectroscopy and used to train a SOM algorithm for SS prediction. The results were compared with those from partial least squares (PLS) regression, band-fitting, and X-ray data in the literature. The prediction errors observed with SOM were comparable to those from PLS and differed markedly from those obtained by band-fitting, showing Raman-SOM to be a viable alternative to the aforementioned methods.
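
To make the idea of training a SOM on spectra with known SS content more concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the spectra are synthetic two-band mixtures, the band centres (1655 and 1670 cm⁻¹), widths, grid size, and decay schedules are illustrative assumptions, and labelling each map node with the mean helix fraction of the training spectra it wins is just one simple way to turn a SOM into a predictor.

```python
import math
import random


def make_spectrum(helix_fraction, rng, n_points=40, noise=0.01):
    """Synthetic amide I-like band: two Gaussians mixed by helix fraction.

    Band centres and widths are illustrative assumptions, not paper values.
    """
    spectrum = []
    for i in range(n_points):
        x = 1625 + i * 100 / (n_points - 1)  # wavenumber axis, 1625-1725 cm^-1
        helix = math.exp(-((x - 1655) / 12) ** 2)
        sheet = math.exp(-((x - 1670) / 12) ** 2)
        y = helix_fraction * helix + (1 - helix_fraction) * sheet
        spectrum.append(y + rng.gauss(0, noise))
    return spectrum


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_matching_unit(weights, sample):
    """Index of the map node whose weight vector is closest to the sample."""
    return min(range(len(weights)), key=lambda k: sq_dist(weights[k], sample))


def train_som(samples, rows=4, cols=4, epochs=60, seed=0):
    rng = random.Random(seed)
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    # Initialise each node from a randomly chosen training spectrum.
    weights = [list(rng.choice(samples)) for _ in coords]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = 0.5 * (1 - frac) + 0.01                    # decaying learning rate
        sigma = max(rows, cols) / 2 * (1 - frac) + 0.5  # decaying neighbourhood radius
        for sample in rng.sample(samples, len(samples)):  # shuffled pass
            b_row, b_col = coords[best_matching_unit(weights, sample)]
            for k, (r, c) in enumerate(coords):
                grid_d2 = (r - b_row) ** 2 + (c - b_col) ** 2
                h = math.exp(-grid_d2 / (2 * sigma ** 2))  # Gaussian neighbourhood
                w = weights[k]
                for i in range(len(w)):
                    w[i] += lr * h * (sample[i] - w[i])
    return weights


def label_nodes(weights, samples, labels):
    """Give each node the mean label of the training spectra it wins."""
    sums = [0.0] * len(weights)
    counts = [0] * len(weights)
    for s, y in zip(samples, labels):
        k = best_matching_unit(weights, s)
        sums[k] += y
        counts[k] += 1
    return [sums[k] / counts[k] if counts[k] else None for k in range(len(weights))]


def predict(weights, node_labels, sample):
    """Predict from the closest node that won at least one training spectrum."""
    labelled = [k for k, v in enumerate(node_labels) if v is not None]
    k = min(labelled, key=lambda j: sq_dist(weights[j], sample))
    return node_labels[k]


rng = random.Random(1)
train_labels = [i / 29 for i in range(30)]  # known helix fractions, 0 to 1
train_spectra = [make_spectrum(f, rng) for f in train_labels]
som = train_som(train_spectra)
node_labels = label_nodes(som, train_spectra, train_labels)

test_labels = [0.1, 0.5, 0.9]
predictions = [predict(som, node_labels, make_spectrum(f, rng)) for f in test_labels]
for true_f, pred_f in zip(test_labels, predictions):
    print(f"true helix fraction {true_f:.2f} -> predicted {pred_f:.2f}")
```

Because each prediction is the average label of one map node, its resolution is limited by the number of nodes. A real application would use measured spectra, several SS classes per protein, and a validation scheme such as leave-one-out.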

Source journal
Applied Spectroscopy (Engineering & Technology: Spectroscopy)
CiteScore: 6.60
Self-citation rate: 5.70%
Articles published: 139
Review time: 3.5 months
About the journal: Applied Spectroscopy is one of the world's leading spectroscopy journals, publishing high-quality peer-reviewed articles, both fundamental and applied, covering all aspects of spectroscopy. Established in 1951, the journal is owned by the Society for Applied Spectroscopy and is published monthly. The journal is dedicated to fulfilling the mission of the Society to “…advance and disseminate knowledge and information concerning the art and science of spectroscopy and other allied sciences.”