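Abstract B042: Broad analysis and more accurate predictions of HLA class I epitope binding in 92 common HLA alleles profiled by mono-allelic mass spectrometry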

Convergence of Technology and Cancer Immunotherapy. Pub Date: 2019-02-01. DOI: 10.1158/2326-6074.CRICIMTEATIAACR18-B042

Siranush Sarkizova, Susan Klaeger, Derin B. Keskin, Karl Clauser, Hasmik Keshishian, Christina R. Hartigan, Nir Hacohen, Steven A. Carr, Catherine J. Wu




Abstract B042: Broad analysis and more accurate predictions of HLA class I epitope binding in 92 common HLA alleles profiled by mono-allelic mass spectrometry

Introduction: Cancer vaccine therapies rely on accurate, personalized selection of immunizing peptides to potentiate tumor-specific immune responses against neoepitopes derived from somatic mutations. Given the unique accumulation of mutations in each tumor and each patient's particular complement of HLA class I alleles, the ability to accurately predict which epitopes will be presented by tumor cells is a fundamental prerequisite for successful vaccine design. Using a mono-allelic mass spectrometry (MS) strategy for profiling the endogenous HLA class I peptidome, we recently showed that prediction of endogenous presentation can be drastically improved when model training integrates peptide sequence with intracellular signals such as the likelihood of proteasomal processing and peptide abundance. Yet the limited set of mono-allelic data did not allow for the deep comparative analysis across HLA-A, -B, and -C alleles that could better inform pan-allele predictor design. Moreover, the significant variability in per-allele model performance remains unexplained.

Methods: We recently developed a scalable mono-allelic MS technique to profile naturally presented peptides on HLA molecules, whereby the HLA class I-deficient B721.221 cell line is transfected with HLA expression vectors coding for a single allele of interest, and eluted HLA peptides are analyzed by LC-MS/MS. In addition, endogenously presented antigens on primary tumor-derived cell lines from 4 melanoma patients were identified via MS. To extract knowledge from this unique dataset, we implemented computational tools to summarize, visualize, and compare the characteristics of HLA-A, -B, -C, and -G alleles, and developed a novel approach to define allele similarity that takes into account the collection of sub-motifs per allele. We trained neural network prediction models, validated their performance on internal and external datasets, and analyzed the variability in performance across alleles.

Results: To date, we have generated binding data for 92 HLA-A, -B, -C, and -G alleles, identifying more than 190,000 peptides and covering the most frequent alleles in the population. Extensive mono-allelic profiling revealed that some alleles present non-9-mer peptides with high frequency. The availability of a large number of non-9-mer peptides allowed us to build length-specific models that often performed better than the corresponding non-length-specific models currently used. We observe that HLA-A and -B alleles present more peptides of length 10 and 11 than HLA-C alleles, while HLA-C alleles have a higher propensity for 8-mers. Correlation-based analysis of binding motifs revealed that HLA-A and -B motifs are more specific, whereas HLA-C motifs are less stringent and thus share more overlapping binders. Since binding data are available for only a fraction of all known alleles, pan-allele models implicitly embed allele similarity to predict for uncharacterized alleles based on the sequence of the binding pocket. By clustering allele-specific peptides into sub-motifs, we propose a novel explicit approach to delineate allele similarity at finer granularity that can improve pan-allele model design. We show that our allele-specific models are better at discriminating tumor-presented epitopes than state-of-the-art predictors, and we investigate the relationship between false discovery rate and natural abundance of anchor residues to better understand differences in model accuracy among alleles. Finally, deconvolution of tumor-presented peptides demonstrated that ~10% of peptides are presented on HLA-C, which has historically been understudied.

Conclusions: We have vastly expanded the collection of endogenous HLA-specific peptides, deriving biologic insights into the principles of epitope presentation and valuable considerations for prediction model design and epitope selection for tumor vaccines.

Citation Format: Siranush Sarkizova, Susan Klaeger, Derin B. Keskin, Karl Clauser, Hasmik Keshishian, Christina R. Hartigan, Nir Hacohen, Steven A. Carr, Catherine J. Wu. Broad analysis and more accurate predictions of HLA class I epitope binding in 92 common HLA alleles profiled by mono-allelic mass spectrometry [abstract]. In: Proceedings of the Fourth CRI-CIMT-EATI-AACR International Cancer Immunotherapy Conference: Translating Science into Survival; Sept 30-Oct 3, 2018; New York, NY. Philadelphia (PA): AACR; Cancer Immunol Res 2019;7(2 Suppl):Abstract nr B042.
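Illustrative sketch (not from the abstract): the Results mention per-allele peptide length preferences and a correlation-based comparison of binding motifs, but the abstract does not say how either was computed. The Python below is one plausible reading, assuming a motif is summarized as per-position amino acid frequencies over 9-mers and that two motifs are compared by the Pearson correlation of those frequencies. The function names and the toy peptide lists are hypothetical and are not real binding data.

```python
from collections import Counter
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical toy peptide sets for two unnamed alleles (illustration only).
TOY_A = ["ALAKDEFGV", "SLQRTHIKV", "KLMNPQRSL", "ALDEFGHIKV"]
TOY_B = ["AEAKDEFGW", "SEQRTHIKF", "KEMNPQRSY", "AEDEFGHW"]


def length_distribution(peptides):
    """Fraction of peptides at each length, keyed by length."""
    counts = Counter(len(p) for p in peptides)
    total = sum(counts.values())
    return {length: n / total for length, n in sorted(counts.items())}


def position_frequencies(peptides, length=9):
    """Flattened per-position amino acid frequencies for peptides of one length.

    The result has length * 20 entries: position 0 for all 20 amino acids,
    then position 1, and so on.
    """
    selected = [p for p in peptides if len(p) == length]
    if not selected:
        raise ValueError("no peptides of the requested length")
    flat = []
    for pos in range(length):
        counts = Counter(p[pos] for p in selected)
        flat.extend(counts.get(aa, 0) / len(selected) for aa in AMINO_ACIDS)
    return flat


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have equal length")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def motif_similarity(peptides_a, peptides_b, length=9):
    """Assumed similarity measure: correlation of the two frequency profiles."""
    return pearson(position_frequencies(peptides_a, length),
                   position_frequencies(peptides_b, length))


if __name__ == "__main__":
    print(length_distribution(TOY_A))
    print(motif_similarity(TOY_A, TOY_B))
```

In this sketch, alleles with less stringent motifs would tend to have flatter frequency profiles that correlate more strongly with one another, which is the kind of pattern the abstract reports for HLA-C. The sub-motif clustering approach described by the authors is a finer-grained method and is not reproduced here.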
