Abstract 176: Detecting neoepitopes from tumor RNA sequencing datasets

Journal of bioinformatics and systems biology : Open access | Pub Date: 2021-07-01 | DOI: 10.1158/1538-7445.AM2021-176

D. Thompson, O. Vaske, A. Rao, Holly C. Beale



Abstract

Epitopes are peptides presented on the cell surface that can be recognized by immune cells to initiate an immune response. Identifying neoepitopes (tumor-specific, MHC-bound epitopes recognized specifically by T cells) is valuable for predicting response to immunotherapies, including checkpoint blockade therapies. Tumors with more neoepitopes tend to respond better to immune checkpoint therapies than tumors with fewer neoepitopes. ProTECT is a previously published computational method that uses Illumina whole-genome and transcriptome sequencing data from tumor and matched normal tissues to identify neoepitopes. The tumor and normal whole-genome sequencing data are used to infer a patient's HLA haplotypes and to annotate variants as either somatic or germline. Although whole-genome sequencing is comprehensive, it is costly and unavailable for many samples. Here we adapt ProTECT to identify neoepitopes in a tumor sample using only tumor RNA sequencing data and the HLA haplotype information available to the clinician. Before running ProTECT, we call variants with the computational tools Opossum and Platypus instead of Radia, which is designed for variant calling with both RNA and DNA sequencing data as input. To determine which variants are somatic and could therefore represent tumor neoepitopes, variants found in RNA are compared to a panel of normals, for example the Genome Aggregation Database (gnomAD; containing variants from 125,748 exome sequences and 15,708 whole-genome sequences). With the resulting somatic variants and the HLA type, ProTECT proceeds as usual: translation of variants into proteins, MHC:peptide binding predictions, and neoepitope ranking. We find that high-quality neoepitopes are identifiable with an RNA-only approach when genomic data are absent. Future work will validate the sensitivity of our method by benchmarking it against the original ProTECT predictions in the TCGA Prostate Adenocarcinoma cohort.

Citation Format: Drew Thompson, Olena M. Vaske, Arjun Rao, Holly C. Beale. Detecting neoepitopes from tumor RNA sequencing datasets [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 176.
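
The panel-of-normals comparison and the translation step described in the abstract can be illustrated with a minimal Python sketch. This is not ProTECT's code or the authors' implementation: the `Variant` type and the functions `filter_against_panel` and `mutant_peptides` are hypothetical names introduced here, the panel is assumed to fit in memory as a set of exact (chromosome, position, ref, alt) matches, and the 9-residue peptide length is an assumed default. Variant calling (Opossum, Platypus) and MHC:peptide binding prediction are deliberately left out.

```python
from typing import Iterable, List, NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record (not a ProTECT data structure)."""
    chrom: str
    pos: int   # 1-based genomic position
    ref: str
    alt: str


def filter_against_panel(rna_variants: Iterable[Variant],
                         panel_of_normals: Iterable[Variant]) -> List[Variant]:
    """Keep RNA variants that are absent from the panel of normals.

    Assumption: a variant seen in the panel is treated as germline, and
    everything else is treated as putatively somatic.
    """
    panel = set(panel_of_normals)
    return [v for v in rna_variants if v not in panel]


def mutant_peptides(protein: str, mut_index: int, mut_aa: str,
                    k: int = 9) -> List[str]:
    """List every k-mer of the mutated protein that spans the changed residue.

    mut_index is 0-based; k=9 is an assumed peptide length.
    """
    mutated = protein[:mut_index] + mut_aa + protein[mut_index + 1:]
    first_start = max(0, mut_index - k + 1)
    last_start = min(mut_index, len(mutated) - k)
    return [mutated[s:s + k] for s in range(first_start, last_start + 1)]


if __name__ == "__main__":
    rna = [Variant("chr1", 100, "A", "G"), Variant("chr2", 200, "C", "T")]
    panel = [Variant("chr1", 100, "A", "G")]
    print(filter_against_panel(rna, panel))
    print(mutant_peptides("ACDEFGHIKLMNPQRSTVWY", 10, "A"))
```

In a real pipeline the candidate peptides would then be scored for binding against the patient's HLA alleles and ranked. Exact-match filtering is the simplest possible comparison; a production filter would likely also consider population allele frequency rather than presence alone.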


