Enhancing Vaxign-DL for Vaccine Candidate Prediction with added ESM-Generated Features

Yichao Chen, Yuhan Zhang, Yongqun He
bioRxiv - Bioinformatics. Published 2024-09-08. DOI: 10.1101/2024.09.04.611295 (https://doi.org/10.1101/2024.09.04.611295). Citation count: 0.

Abstract

Many vaccine design programs have been developed, including our own machine learning approaches, Vaxign-ML and Vaxign-DL. Using deep learning techniques, Vaxign-DL predicts bacterial protective antigens by calculating 509 biological and biomedical features from protein sequences. In this study, we first used the protein folding program ESM to calculate a set of 1,280 features from individual protein sequences, and then used this new feature set, either alone or in combination with the traditional set of 509 features, to predict protective antigens. Our results showed that ESM-derived features alone accurately predicted vaccine antigens with performance similar to that of the original Vaxign-DL method, and that combining the ESM-derived and original Vaxign-DL features significantly improved prediction performance across a set of seven scores, including specificity, sensitivity, and AUROC. To further evaluate the updated methods, we conducted a Leave-One-Pathogen-Out Validation (LOPOV) study and found that the use of ESM-derived features significantly improved the prediction of vaccine antigens from 10 bacterial pathogens. This research is the first reported study demonstrating the added value of protein folding features for vaccine antigen prediction.
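The feature combination and the LOPOV protocol described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the pathogen labels, and the toy predictions are all hypothetical, and only the feature counts (509 and 1,280) come from the abstract. The sketch shows how two per-protein feature vectors might be concatenated, how proteins could be split so that each pathogen is held out once, and how sensitivity and specificity are computed from binary predictions.

```python
from typing import Iterator, List, Sequence, Tuple

VAXIGN_DIM = 509   # traditional Vaxign-DL feature count (from the abstract)
ESM_DIM = 1280     # ESM-derived feature count (from the abstract)


def combine_features(vaxign: Sequence[float], esm: Sequence[float]) -> List[float]:
    """Concatenate one protein's two feature vectors (hypothetical helper)."""
    if len(vaxign) != VAXIGN_DIM or len(esm) != ESM_DIM:
        raise ValueError("unexpected feature vector length")
    return list(vaxign) + list(esm)


def leave_one_pathogen_out(
    pathogens: Sequence[str],
) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out, train_indices, test_indices), once per pathogen.

    Assumption: each protein carries exactly one pathogen label, and all
    proteins from the held-out pathogen form the test set.
    """
    for held_out in sorted(set(pathogens)):
        train = [i for i, p in enumerate(pathogens) if p != held_out]
        test = [i for i, p in enumerate(pathogens) if p == held_out]
        yield held_out, train, test


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Toy data: placeholder vectors and made-up pathogen labels.
combined = combine_features([0.0] * VAXIGN_DIM, [0.0] * ESM_DIM)
pathogens = ["A", "A", "B", "C", "C"]
splits = list(leave_one_pathogen_out(pathogens))
scores = sensitivity_specificity([1, 1, 0, 0], [1, 0, 0, 1])
```

In a full pipeline, a classifier would be trained on the training indices of each split and scored on the held-out pathogen. That step is omitted here because the abstract does not specify the model architecture or training settings.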