Machine learning framework to extract physicochemical features of B-cell epitopes recognized by a cross-reactive antibody.

Impact factor 3.5 · Zone 2, Biology · JCR Q1, Mathematical & Computational Biology
Simranjit Grewal, Uwa Iyamu, Daniel Ferrer Vinals, Catherine J Mitran, Nidhi Hegde, Stephanie K Yanow
NPJ Systems Biology and Applications 11(1):109. Journal Article. Published 2025-10-02.
DOI: https://doi.org/10.1038/s41540-025-00583-1
Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12491407/pdf/
Citations: 0

Abstract

During infection with Plasmodium falciparum in pregnancy, parasites express a unique virulence factor, VAR2CSA, that mediates binding of infected red blood cells to the placenta. A major goal in designing vaccines to protect pregnant women from malaria is to elicit antibodies to VAR2CSA. The challenge is that VAR2CSA is highly polymorphic, and identifying conserved epitopes is essential to elicit strain-transcending immunity. Unexpectedly, a mouse monoclonal antibody, 3D10, raised against region II of the unrelated Duffy binding protein from P. vivax (DBPII), cross-reacts with diverse alleles of VAR2CSA in vitro, suggesting that epitopes may be shared across this family of 'Duffy binding-like' (DBL) proteins. Peptide arrays spanning four DBL proteins from two Plasmodium spp., including two alleles of VAR2CSA, DBPII, and PvEBP2 (as a negative control), were screened with 3D10, but the data were too complex to manually identify common epitope sequences. As such, we designed a machine learning framework to analyse the array data. We applied decision trees to extract features correlated with 3D10 binding and evaluated the model on an independent dataset for a rodent Plasmodium DBL protein (PcDBP). Next, we analysed patterns of the features predicted by the model to be strongly associated with 3D10 binding and designed mutant peptides to test complex sequence motifs. Features associated with 3D10 reactivity were mapped onto predicted 3D structures of Plasmodium proteins and validated based on 3D10 reactivity to the recombinant antigens. While the array data identified certain linear epitopes, the framework predicted other epitopes to be conformational. This was demonstrated with PcDBP; as predicted by the model, no linear peptides reacted strongly with 3D10, yet the folded protein was recognized by the antibody in a conformation-dependent manner. With this approach, peptide array data can be mined to extract physicochemical properties of epitopes recognized by cross-reactive antibodies.
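
The abstract does not state which physicochemical features, labels, or software the authors used, so the following is only a minimal sketch of the general idea: describe each peptide by a few physicochemical properties, then let a decision-tree split pick the property that best separates reactive from non-reactive peptides. Everything here is an assumption made for illustration: the three features, the Gini criterion, the single-split "stump" in place of a full tree, and the peptide sequences and labels, which are invented and are not taken from the arrays.

```python
from typing import Dict, List, Tuple

# Kyte-Doolittle hydropathy scale (standard published values).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}

def featurize(peptide: str) -> Dict[str, float]:
    """Map a peptide to three illustrative physicochemical features."""
    n = len(peptide)
    return {
        "mean_hydropathy": sum(KD[a] for a in peptide) / n,
        # Crude net charge: +1 for K/R, -1 for D/E (histidine ignored).
        "net_charge": sum((a in "KR") - (a in "DE") for a in peptide),
        "aromatic_fraction": sum(a in "FWY" for a in peptide) / n,
    }

def gini(labels: List[int]) -> float:
    """Gini impurity of a list of binary (0/1) labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(rows: List[Dict[str, float]],
               labels: List[int]) -> Tuple[str, float, float]:
    """Return (feature, threshold, weighted Gini) of the best single split."""
    best = None
    n = len(labels)
    for name in rows[0]:
        values = sorted({r[name] for r in rows})
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[name] <= thr]
            right = [y for r, y in zip(rows, labels) if r[name] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (name, thr, score)
    return best

# Hypothetical peptides and labels (1 = reactive, 0 = non-reactive).
peptides = ["KKRWKRFK", "RKKSRKGR", "KRAKKLRK",
            "DEDWFEDY", "EDNQNDEQ", "GSGSGSGS"]
labels = [1, 1, 1, 0, 0, 0]

rows = [featurize(p) for p in peptides]
feature, threshold, impurity = best_split(rows, labels)
print(feature, threshold, impurity)
```

On this toy data the stump selects net charge, because the invented reactive peptides were all made strongly basic. A real analysis would grow a deeper tree, use measured array intensities, and, as the abstract describes for PcDBP, check the model against an independent dataset.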

Source journal
NPJ Systems Biology and Applications (Mathematics: Applied Mathematics)
CiteScore: 5.80
Self-citation rate: 0.00%
Articles published: 46
Review time: 8 weeks
Journal description: npj Systems Biology and Applications is an online Open Access journal dedicated to publishing the premier research that takes a systems-oriented approach. The journal aims to provide a forum for the presentation of articles that help define this nascent field, as well as those that apply the advances to wider fields. We encourage studies that integrate, or aid the integration of, data, analyses and insight from molecules to organisms and broader systems. Important areas of interest include not only fundamental biological systems and drug discovery, but also applications to health, medical practice and implementation, big data, biotechnology, food science, human behaviour, broader biological systems and industrial applications of systems biology. We encourage all approaches, including network biology, application of control theory to biological systems, computational modelling and analysis, comprehensive and/or high-content measurements, theoretical, analytical and computational studies of system-level properties of biological systems and computational/software/data platforms enabling such studies.