pMHChat, characterizing the interactions between major histocompatibility complex class II molecules and peptides with large language models and deep hypergraph learning.

IF 6.8; Zone 2, Biology; JCR Q1, Biochemical Research Methods
Jiani Ma, Zhikang Wang, Cen Tong, Qi Yang, Lin Zhang, Hui Liu
Journal: Briefings in Bioinformatics, vol. 26, no. 4
Publication date: 2025-07-02
Publication type: Journal Article
DOI: 10.1093/bib/bbaf321 (https://doi.org/10.1093/bib/bbaf321)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12229989/pdf/

Abstract

Characterizing the binding interactions between major histocompatibility complex (MHC) class II molecules and peptides is crucial for studying the immune system, with potential applications in neoantigen design, vaccine development, and personalized immunotherapy. Motivated by this, we developed pMHChat, a model that integrates large language models (LLMs) and deep hypergraph learning to predict MHC class II-peptide binding reactivity and binding affinity and to profile residue contacts. pMHChat takes MHC pseudo-sequences and peptide sequences as inputs and processes them in four stages: an LLM fine-tuning stage, a feature encoding and map fusion stage, a task-specific prediction stage, and a downstream analysis stage. pMHChat distinguishes itself by capturing contextually relevant, high-order spatial interactions within the peptide-MHC (pMHC) complex. Specifically, in a five-fold cross-validation experiment, pMHChat achieves superior performance, with a mean area under the receiver operating characteristic curve (AUROC) of 0.8744 and an area under the precision-recall curve (AUPRC) of 0.8390 on the binding reactivity task, and a mean Pearson correlation coefficient of 0.7311 on the binding affinity prediction task. pMHChat also demonstrates the best performance in both the leave-one-molecule-out setting and independent evaluation. Notably, pMHChat can provide residue contact profiles, showing its potential for recognizing critical binding patterns of the pMHC complex. Our findings highlight pMHChat's capacity to improve predictive accuracy and to provide detailed insight into the MHC-peptide binding process. We anticipate that pMHChat will serve as a powerful tool for elucidating MHC-peptide interactions, with promising applications in immunological research and therapeutic development.
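
The evaluation measures named in the abstract can be illustrated with a short sketch. The code below is not taken from pMHChat; it is a minimal pure-Python illustration, under stated assumptions, of how an area under the ROC curve and a Pearson correlation coefficient are commonly computed, and of how a leave-one-molecule-out split could be formed by holding out all samples of one MHC molecule at a time. The function names and the toy molecule labels in the usage note are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def auroc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: iterable of 0/1 binding labels; scores: predicted scores.
    """
    pairs = sorted(zip(scores, labels))
    ranks = [0.0] * len(pairs)
    i = 0
    while i < len(pairs):
        # Find the end of the current group of tied scores.
        j = i
        while j + 1 < len(pairs) and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tie group
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")
    pos_rank_sum = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def leave_one_molecule_out(molecules):
    """Yield (held_out, train_idx, test_idx), one split per distinct molecule.

    molecules: per-sample molecule labels (hypothetical input layout).
    """
    by_molecule = defaultdict(list)
    for idx, name in enumerate(molecules):
        by_molecule[name].append(idx)
    for held_out, test_idx in by_molecule.items():
        train_idx = [i for i, name in enumerate(molecules) if name != held_out]
        yield held_out, train_idx, test_idx
```

As a usage example on toy data, `auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])` evaluates to 0.75, and `list(leave_one_molecule_out(["A", "A", "B"]))` yields one split that holds out molecule "A" and one that holds out "B". In a cross-validation setting, such per-fold values would be averaged to obtain a mean score like those reported above.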

Source journal: Briefings in Bioinformatics (Biology: Biochemical Research Methods)
CiteScore: 13.20
Self-citation rate: 13.70%
Articles published: 549
Review time: 6 months
About the journal: Briefings in Bioinformatics is an international journal serving as a platform for researchers and educators in the life sciences. It also appeals to mathematicians, statisticians, and computer scientists applying their expertise to biological challenges. The journal focuses on reviews tailored for users of databases and analytical tools in contemporary genetics, molecular and systems biology. It stands out by offering practical assistance and guidance to non-specialists in computerized methodologies. Covering a wide range from introductory concepts to specific protocols and analyses, the papers address bacterial, plant, fungal, animal, and human data. The journal's detailed subject areas include genetic studies of phenotypes and genotypes, mapping, DNA sequencing, expression profiling, gene expression studies, microarrays, alignment methods, protein profiles and HMMs, lipids, metabolic and signaling pathways, structure determination and function prediction, phylogenetic studies, and education and training.