{"title":"TCellR2Vec: efficient feature selection for TCR sequences for cancer classification.","authors":"Zahra Tayebi, Sarwan Ali, Murray Patterson","doi":"10.7717/peerj-cs.2239","DOIUrl":null,"url":null,"abstract":"<p><p>Cancer remains one of the leading causes of death globally. New immunotherapies that harness the patient's immune system to fight cancer show promise, but their development requires analyzing the diversity of immune cells called T-cells. T-cells have receptors that recognize and bind to cancer cells. Sequencing these T-cell receptors allows to provide insights into their immune response, but extracting useful information is challenging. In this study, we propose a new computational method, TCellR2Vec, to select key features from T-cell receptor sequences for classifying different cancer types. We extracted features like amino acid composition, charge, and diversity measures and combined them with other sequence embedding techniques. For our experiments, we used a dataset of over 50,000 T-cell receptor sequences from five cancer types, which showed that TCellR2Vec improved classification accuracy and efficiency over baseline methods. These results demonstrate TCellR2Vec's ability to capture informative aspects of complex T-cell receptor sequences. By improving computational analysis of the immune response, TCellR2Vec could aid the development of personalized immunotherapies tailored to each patient's T-cells. This has important implications for creating more effective cancer treatments based on the individual's immune system.</p>","PeriodicalId":54224,"journal":{"name":"PeerJ Computer Science","volume":"10 ","pages":"e2239"},"PeriodicalIF":3.5000,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11622898/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"PeerJ Computer Science","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.7717/peerj-cs.2239","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/1/1 0:00:00","PubModel":"eCollection","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Cancer remains one of the leading causes of death globally. New immunotherapies that harness the patient's immune system to fight cancer show promise, but their development requires analyzing the diversity of immune cells called T-cells. T-cells have receptors that recognize and bind to cancer cells. Sequencing these T-cell receptors allows to provide insights into their immune response, but extracting useful information is challenging. In this study, we propose a new computational method, TCellR2Vec, to select key features from T-cell receptor sequences for classifying different cancer types. We extracted features like amino acid composition, charge, and diversity measures and combined them with other sequence embedding techniques. For our experiments, we used a dataset of over 50,000 T-cell receptor sequences from five cancer types, which showed that TCellR2Vec improved classification accuracy and efficiency over baseline methods. These results demonstrate TCellR2Vec's ability to capture informative aspects of complex T-cell receptor sequences. By improving computational analysis of the immune response, TCellR2Vec could aid the development of personalized immunotherapies tailored to each patient's T-cells. This has important implications for creating more effective cancer treatments based on the individual's immune system.
期刊介绍:
PeerJ Computer Science is the new open access journal covering all subject areas in computer science, with the backing of a prestigious advisory board and more than 300 academic editors.