pMHChat, characterizing the interactions between major histocompatibility complex class II molecules and peptides with large language models and deep hypergraph learning.
Jiani Ma, Zhikang Wang, Cen Tong, Qi Yang, Lin Zhang, Hui Liu
Briefings in Bioinformatics, volume 26, issue 4. Published 2025-07-02. DOI: 10.1093/bib/bbaf321 (https://doi.org/10.1093/bib/bbaf321). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12229989/pdf/
Publication type: Journal Article. Impact factor: 6.8. JCR: Q1 (Biochemical Research Methods).
Citation count: 0
Abstract
Characterizing the binding interactions between major histocompatibility complex (MHC) class II molecules and peptides is crucial for studying the immune system, with potential applications in neoantigen design, vaccine development, and personalized immunotherapy. Motivated by this, we developed pMHChat, a model that integrates large language models (LLMs) and deep hypergraph learning to predict MHC class II-peptide binding reactivity and affinity and to profile residue contacts. pMHChat takes MHC pseudo-sequences and peptide sequences as inputs and processes them through four stages: an LLM fine-tuning stage, a feature encoding and map fusion stage, a task-specific prediction stage, and a downstream analysis stage. pMHChat distinguishes itself by capturing contextually relevant and high-order spatial interactions of the peptide-MHC (pMHC) complex. Specifically, in a five-fold cross-validation experiment, pMHChat achieves superior performance, with a mean area under the receiver operating characteristic curve of 0.8744 and an area under the precision-recall curve of 0.8390 in the binding reactivity task, as well as a mean Pearson correlation coefficient of 0.7311 in the binding affinity prediction task. pMHChat also demonstrates the best performance in both the leave-one-molecule-out setting and independent evaluation. Notably, pMHChat can provide residue contact profiling, showing its potential for recognizing critical binding patterns of the pMHC complex. Our findings highlight pMHChat's capacity to improve predictive accuracy and to provide detailed insight into the MHC-peptide binding process. We anticipate that pMHChat will serve as a powerful tool for elucidating MHC-peptide interactions, with promising applications in immunological research and therapeutic development.
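For readers unfamiliar with the three metrics quoted in the abstract, the following is a minimal sketch of how area under the ROC curve, area under the precision-recall curve (computed here as average precision), and the Pearson correlation coefficient can be averaged over cross-validation folds. It is not the authors' evaluation code. The function names, the interleaved fold assignment, and the toy data are assumptions made for illustration, and the sketch presumes each held-out prediction was already produced by a model trained on the other folds.

```python
import math
from statistics import mean


def auroc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve.

    Tied scores are ordered arbitrarily by the sort, which is a simplification.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at each recalled positive
    return total / hits


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def k_fold_indices(n, k=5):
    """Assign indices 0..n-1 to k interleaved folds (hypothetical split scheme)."""
    return [list(range(i, n, k)) for i in range(k)]


def mean_over_folds(metric, y_true, y_pred, k=5):
    """Score each held-out fold separately, then average the k fold scores."""
    folds = k_fold_indices(len(y_true), k)
    return mean(
        metric([y_true[i] for i in fold], [y_pred[i] for i in fold])
        for fold in folds
    )


if __name__ == "__main__":
    # Toy binding labels (1 = binder) and predicted scores, invented for the demo.
    labels = [0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8]
    print("AUROC:", auroc(labels, scores))
    print("AUPR (average precision):", average_precision(labels, scores))
    print("PCC:", pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

Averaging per-fold scores, rather than pooling all held-out predictions into one score, matches the "mean" wording used for the reported metrics, although the abstract does not state which convention was used.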
About the journal:
Briefings in Bioinformatics is an international journal serving as a platform for researchers and educators in the life sciences. It also appeals to mathematicians, statisticians, and computer scientists applying their expertise to biological challenges. The journal focuses on reviews tailored for users of databases and analytical tools in contemporary genetics, molecular and systems biology. It stands out by offering practical assistance and guidance to non-specialists in computerized methodologies. Covering a wide range from introductory concepts to specific protocols and analyses, the papers address bacterial, plant, fungal, animal, and human data.
The journal's detailed subject areas include genetic studies of phenotypes and genotypes, mapping, DNA sequencing, expression profiling, gene expression studies, microarrays, alignment methods, protein profiles and HMMs, lipids, metabolic and signaling pathways, structure determination and function prediction, phylogenetic studies, and education and training.