HLAPepBinder: An Ensemble Model for the Prediction of HLA-Peptide Binding

Impact factor: 1.6; Zone 4 (Biology); JCR Q4 (Biotechnology & Applied Microbiology)
Mahsa Saadat, Fatemeh Zare-Mirakabad, Ali Masoudi-Nejad, Mohammad Farahanchi Baradaran, Nazanin Hosseinkhan
Iranian Journal of Biotechnology, vol. 22, no. 4, e3927. Published 2024-10-01. Journal article.
DOI: https://doi.org/10.30498/ijb.2024.459448.3927
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11993240/pdf/
Citations: 0

Abstract

Background: Human leukocyte antigens (HLAs) play a pivotal role in orchestrating the host's immune response, making HLA-based immunotherapy a promising avenue with fewer adverse effects than conventional treatments. Cancer immunotherapies rely on HLA class I molecules to present tumor antigens to T cells, which makes it important to identify peptides that bind effectively to HLAs. Computational modeling of HLA-peptide binding speeds up the search for immunogenic epitopes, improving the prospects for personalized medicine and targeted therapies. The Immune Epitope Database (IEDB) is a vital repository that houses curated immune epitope data and prediction tools for HLA-peptide binding. Immunologists can find it challenging to choose the best IEDB tool for predicting HLA-peptide binding. This has led to consensus-based methods that combine the results of several predictors. A major challenge for these methods is how to integrate the results from multiple predictors effectively.

Objectives: Previous consensus-based methods integrate at most three tools by relying on simple strategies, such as selecting prediction methods based on their proximity to HLA in training data. In this study, we introduce HLAPepBinder, a novel consensus approach using ensemble machine learning methods to predict HLA-peptide binding, addressing the challenges biologists face in model selection.

Materials and methods: The key contribution is the development of an automatic pipeline named HLAPepBinder that integrates the predictions of multiple models using a random forest approach. Unlike previous approaches, HLAPepBinder seamlessly integrates results from all nine predictors, providing a comprehensive and accurate predictive framework. By combining the strengths of these models, HLAPepBinder eliminates the need for manual model selection, providing a streamlined and reliable solution for biologists.
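The stacking idea described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each of the nine predictors emits one numeric score per HLA-peptide pair, uses synthetic scores in place of real IEDB tool outputs, and picks scikit-learn's `RandomForestClassifier` with arbitrary hyperparameters. The function names `make_synthetic_scores` and `fit_consensus` are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

N_PREDICTORS = 9  # the abstract states nine base predictors; their names are not given here


def make_synthetic_scores(n_pairs=600, seed=0):
    """Synthetic stand-in for base-predictor outputs (assumption: NOT real IEDB data)."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n_pairs)  # 1 = binder, 0 = non-binder
    noise = rng.normal(0.0, 0.35, size=(n_pairs, N_PREDICTORS))
    # One column per predictor: noisy scores centred on 0.8 for binders, 0.2 otherwise.
    X = np.clip(y[:, None] * 0.6 + 0.2 + noise, 0.0, 1.0)
    return X, y


def fit_consensus(X, y, seed=0):
    """Fit a random forest meta-classifier on the stacked predictor scores."""
    model = RandomForestClassifier(n_estimators=200, random_state=seed)
    model.fit(X, y)
    return model


X, y = make_synthetic_scores()
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
consensus = fit_consensus(X_train, y_train)
accuracy = consensus.score(X_test, y_test)      # held-out accuracy on synthetic data
importances = consensus.feature_importances_    # relative weight given to each predictor
print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

In a setup like this, the forest learns how much to trust each base predictor from the data, so the user does not pick a single tool by hand; the feature importances give a rough view of which predictors the ensemble relies on.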

Results: HLAPepBinder offers a practical and high-performing alternative for HLA-peptide binding predictions, outperforming both traditional methods and complex deep learning models. Compared to the recently introduced transformer-based model TransPHLA, which requires substantial computational resources, HLAPepBinder demonstrates superior performance in both prediction accuracy and resource efficiency. Notably, it operates effectively in limited computational environments, making it accessible to researchers with minimal resources. The code is available online at https://github.com/CBRC-lab/HLAPepBinder.

Conclusion: Our study introduces a novel ensemble-learning model designed to enhance the accuracy and efficiency of HLA-peptide binding predictions. Because reliable negative data are lacking and unknown interactions are typically assumed to be negative, we focus on the unknown HLA-peptide pairs in the test set that our model predicts to be positive with 100% certainty. Using HLAPepBinder, we identify 26 such HLA-peptide pairs. These predictions are validated through a multi-step pipeline involving literature review, BLAST sequence similarity analysis, and molecular docking studies. This validation process highlights HLAPepBinder's ability to make accurate and reliable predictions, contributing to advancements in immunotherapy and vaccine development.
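The selection step described above can be sketched as a simple filter. This is an illustration under stated assumptions: "100% certainty" is taken to mean a predicted positive-class probability of exactly 1.0, the function name `certain_unknown_positives` is hypothetical, and the allele-peptide pairs are made-up placeholders rather than any of the 26 reported pairs.

```python
from typing import Iterable, List, Set, Tuple

Pair = Tuple[str, str]  # (HLA allele, peptide)


def certain_unknown_positives(
    predictions: Iterable[Tuple[str, str, float]],
    known_binders: Set[Pair],
    threshold: float = 1.0,
) -> List[Pair]:
    """Return pairs predicted positive at or above `threshold` that are not known binders."""
    hits = [
        (hla, peptide)
        for hla, peptide, prob_positive in predictions
        if prob_positive >= threshold and (hla, peptide) not in known_binders
    ]
    return sorted(hits)


# Placeholder predictions: (allele, peptide, predicted probability of binding).
predictions = [
    ("HLA-A*02:01", "PEPTIDEAK", 1.0),
    ("HLA-A*02:01", "PEPTIDECK", 0.97),  # confident, but below the 1.0 cut-off
    ("HLA-B*07:02", "PEPTIDEDK", 1.0),
    ("HLA-B*07:02", "PEPTIDEEK", 1.0),   # already a known binder, so not a new finding
]
known_binders = {("HLA-B*07:02", "PEPTIDEEK")}

certain = certain_unknown_positives(predictions, known_binders)
print(certain)  # [('HLA-A*02:01', 'PEPTIDEAK'), ('HLA-B*07:02', 'PEPTIDEDK')]
```

Pairs that pass such a filter are only candidates; the abstract describes checking them further through literature review, BLAST similarity analysis, and molecular docking.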

Source journal: Iranian Journal of Biotechnology (Biotechnology & Applied Microbiology)
CiteScore: 2.60
Self-citation rate: 7.70%
Articles published: 20
About the journal: Iranian Journal of Biotechnology (IJB) is published quarterly by the National Institute of Genetic Engineering and Biotechnology. IJB publishes original scientific research papers in the broad area of biotechnology, including agriculture, animal and marine sciences, basic sciences, bioinformatics, biosafety and bioethics, environment, industry and mining, and medical sciences.