ImmunoPepper: Extracting personalized peptides from complex splicing graphs.
Laurie Prélot, Jiayu Chen, Matthias Hüser, André Kahles, Gunnar Rätsch
Bioinformatics (Oxford, England), published 2025-10-09. DOI: https://doi.org/10.1093/bioinformatics/btaf492
Abstract
Motivation: RNA sequencing enables the characterization of a cell's transcript isoforms in healthy and disease conditions. In the context of cancer, local transcript variability may translate to splicing-derived tumor-associated peptides recognized by the immune system. A software tool that extracts such candidate peptides is of great interest for personalized cancer therapy.
Results: We present the open-source software tool ImmunoPepper, which extracts a set of biologically plausible peptides from a splicing graph derived from a set of RNA-Seq datasets. This peptide set can be personalized with germline and somatic variation and takes novel RNA splice variants into account. ImmunoPepper supports several filtering options, including subtraction of normal tissue background, prediction of MHC-binding affinity, and MassSpec-based validation of identified peptides. We analyzed 32 ovarian cancer (TCGA-OV) and 31 breast invasive carcinoma (TCGA-BRCA) samples with a strict cancer-specific filtering configuration and obtained on average 834 and 569 cancer-specific predicted MHC-I-binding 9-mers per sample, respectively. MassSpec validation with the target-decoy competition method Subset-Neighbor-Search (SNS) showed an average validation rate of 4.5% per TCGA-OV sample and 5.3% per TCGA-BRCA sample. This corresponded to an average of 25 MHC-I-binding 9-mers per TCGA-OV sample and 20 per TCGA-BRCA sample. Finally, we draw conclusions about the best framework for generating splicing-derived neoepitopes: we recommend using joint data structures so that a cancer cohort and a normal cohort are processed homogeneously, and focusing on the reproducibility of candidates across generation pipelines.
Availability: ImmunoPepper is implemented in Python 3 and is available as open-source software at https://github.com/ratschlab/immunopepper. The online documentation can be found at https://immunopepper.readthedocs.io/en/latest/.
Supplementary information: Supplementary data are available at Bioinformatics online.
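
Illustrative sketch: the abstract mentions extracting 9-mers and subtracting a normal tissue background. The following is a minimal Python sketch of that general idea only, assuming peptides are already available as plain amino-acid strings. It is not ImmunoPepper's code or API; the function names `kmers` and `cancer_specific_kmers` and the toy sequences are hypothetical, and the real tool works on splicing graphs rather than on flat peptide lists.

```python
from typing import Iterable, Set


def kmers(peptide: str, k: int = 9) -> Set[str]:
    """Return all length-k substrings of a peptide (empty set if it is shorter than k)."""
    return {peptide[i:i + k] for i in range(len(peptide) - k + 1)}


def cancer_specific_kmers(
    cancer_peptides: Iterable[str],
    normal_peptides: Iterable[str],
    k: int = 9,
) -> Set[str]:
    """Hypothetical helper: k-mers seen in cancer peptides but absent from the normal background."""
    # Collect every k-mer that occurs in any normal-tissue peptide.
    background: Set[str] = set()
    for peptide in normal_peptides:
        background |= kmers(peptide, k)

    # Collect every k-mer that occurs in any cancer peptide.
    foreground: Set[str] = set()
    for peptide in cancer_peptides:
        foreground |= kmers(peptide, k)

    # Background subtraction is a plain set difference.
    return foreground - background


# Toy sequences, invented for illustration only.
cancer = ["MKTAYIAKQRQ"]
normal = ["MKTAYIAKQR"]
result = cancer_specific_kmers(cancer, normal)
print(sorted(result))
```

In this toy case only the 9-mer that spans the extra C-terminal residue of the cancer peptide survives the subtraction. Downstream steps named in the abstract, such as MHC-binding prediction and MassSpec validation, are outside the scope of this sketch.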