OnionMHC: A deep learning model for peptide — HLA-A*02:01 binding predictions using both structure and sequence feature sets

JCR quartile: Q3 (Engineering)
Shikhar Saxena, Sambhavi Animesh, M. Fullwood, Y. Mu
Journal of Micromechanics and Molecular Physics. Published 2020-09-01. DOI: https://doi.org/10.1142/s2424913020500095
Citations: 2

Abstract

Peptide binding to Major Histocompatibility Complex (MHC) proteins is an important step in the antigen-presentation pathway, so predicting the binding potential of peptides to MHC is essential for the design of peptide-based therapeutics. Most available machine learning models predict peptide-MHC binding from the amino acid sequence alone. Given the importance of structural information in determining the stability of the complex, we use both features of the complex structure and features of the peptide sequence to predict the binding affinity of peptides to the human receptor HLA-A*02:01. To our knowledge, no previous model for a human HLA receptor incorporates both structure-based and sequence-based features.

Results: We applied natural language processing (NLP) techniques and a convolutional neural network to design a model that performs comparably with existing state-of-the-art models. The model shows that combining information from the sequence and structure domains improves binding prediction compared with using either domain alone. Test results on 18 weekly benchmark datasets provided by the Immune Epitope Database (IEDB), as well as on experimentally validated peptides from whole-exome sequencing analysis of breast cancer patients, indicate that the model achieves state-of-the-art performance.

Conclusion: We have developed a deep learning model (OnionMHC) that incorporates both structure-based and sequence-based features to predict the binding affinity of peptides to the human receptor HLA-A*02:01. The model demonstrates state-of-the-art performance on the IEDB benchmark datasets as well as on the experimentally validated peptides. It can be used to screen potential neo-epitopes for cancer vaccine development or to design peptides for peptide-based therapeutics.
OnionMHC is freely available on GitHub: https://github.com/shikhar249/OnionMHC
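The idea of feeding a model both sequence and structure information can be illustrated with a small sketch. The abstract does not describe OnionMHC's actual feature construction, so everything below is an assumption made for illustration: the one-hot encoding stands in for whatever NLP-style sequence representation the authors use, the four structure descriptors are invented numbers, and the function names `one_hot_peptide` and `fuse_features` are hypothetical. The sketch shows only how two feature sets might be joined into a single input vector.

```python
# Illustrative sketch only; this is NOT the OnionMHC implementation.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot_peptide(peptide):
    """Encode a peptide as a flat one-hot vector (20 values per residue)."""
    vector = []
    for residue in peptide:
        if residue not in AMINO_ACIDS:
            raise ValueError(f"unknown residue: {residue!r}")
        row = [0.0] * len(AMINO_ACIDS)
        row[AMINO_ACIDS.index(residue)] = 1.0
        vector.extend(row)
    return vector


def fuse_features(sequence_features, structure_features):
    """Concatenate the two feature sets into a single model input."""
    return list(sequence_features) + list(structure_features)


# An example 9-mer peptide and made-up structure descriptors (assumed to be
# numbers computed from a modelled peptide-HLA complex, e.g. contact counts).
peptide = "GILGFVFTL"
structure_features = [12.0, 7.0, 3.0, 0.0]

sequence_features = one_hot_peptide(peptide)
model_input = fuse_features(sequence_features, structure_features)
print(len(sequence_features), len(model_input))  # prints "180 184"
```

Simple concatenation is only one way to combine the two domains; a network could equally process each feature set in its own branch before merging them.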
Source journal: Journal of Micromechanics and Molecular Physics (Materials Science: Polymers and Plastics). CiteScore: 3.30. Self-citation rate: 0.00%. Articles published: 27.