A comprehensive review and evaluation of machine learning-based approaches for identifying tumor T cell antigens

IF 2.6, Zone 4 (Biology), JCR Q2 (BIOLOGY)

Computational Biology and Chemistry. Pub Date: 2025-04-05. DOI: 10.1016/j.compbiolchem.2025.108440

Watshara Shoombuatong, Saeed Ahmed, SM Hasan Mahmud, Nalini Schaduangrat

Volume 118, Article 108440. Full text: https://www.sciencedirect.com/science/article/pii/S1476927125001008

Citations: 0

Abstract


A comprehensive review and evaluation of machine learning-based approaches for identifying tumor T cell antigens

The precise identification of tumor T-cell antigens (TTCAs) is crucial for advancements in cancer immunotherapy and other clinical uses. In contrast to the labor-intensive and time-consuming process of experimentally identifying TTCAs, computational prediction offers a complementary approach by providing a shortlist of probable TTCA candidates for further experimental validation. Currently, several computational approaches, primarily based on machine learning (ML) methods, have garnered considerable attention for the in silico identification of TTCAs. Therefore, this study presents a comprehensive survey on the existing state-of-the-art TTCA predictors. Based on our research, this is the first comprehensive review focused on both traditional ML and ensemble learning methods for TTCA identification. Specifically, we examine critical aspects of TTCA predictor development, including core algorithms, methodologies, benchmark datasets, feature encoding methods, feature selection approaches, and web server usability. We then analyze and compare the effectiveness and robustness of existing predictors across well-known benchmark datasets and case studies. Finally, we provide a detailed summary of the advantages and disadvantages of current TTCA predictors, along with essential insights and suggestions for developing novel computational approaches to accurately identify TTCAs. The insights gained from this review and benchmarking survey are expected to offer valuable guidance to researchers, aiding in the development of high-accuracy TTCA predictors for improved antigen identification in the future.


Source journal

Computational Biology and Chemistry (Biology; Computer Science, Interdisciplinary Applications)

CiteScore: 6.10

Self-citation rate: 3.20%

Articles published: 142

Review time: 24 days

About the journal: Computational Biology and Chemistry publishes original research papers and review articles in all areas of computational life sciences. High quality research contributions with a major computational component in the areas of nucleic acid and protein sequence research, molecular evolution, molecular genetics (functional genomics and proteomics), theory and practice of either biology-specific or chemical-biology-specific modeling, and structural biology of nucleic acids and proteins are particularly welcome. Exceptionally high quality research work in bioinformatics, systems biology, ecology, computational pharmacology, metabolism, biomedical engineering, epidemiology, and statistical genetics will also be considered. Given their inherent uncertainty, protein modeling and molecular docking studies should be thoroughly validated. In the absence of experimental results for validation, the use of molecular dynamics simulations along with detailed free energy calculations, for example, should be used as complementary techniques to support the major conclusions. Submissions of premature modeling exercises without additional biological insights will not be considered. Review articles will generally be commissioned by the editors and should not be submitted to the journal without explicit invitation. However, prospective authors are welcome to send a brief (one to three pages) synopsis, which will be evaluated by the editors.