Estimating protein complex model accuracy using graph transformers and pairwise similarity graphs
Jian Liu, Pawan Neupane, Jianlin Cheng
Bioinformatics Advances, volume 5, issue 1, article vbaf180. Published 2025-07-29.
DOI: https://doi.org/10.1093/bioadv/vbaf180
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12342149/pdf/
Citations: 0
Abstract
Motivation: Estimation of protein complex structure accuracy is essential for effective structural model selection in structural biology applications such as protein function analysis and drug design. Despite the success of structure prediction methods such as AlphaFold2 and AlphaFold3, selecting top-quality structural models from large model pools remains challenging.
Results: We present GATE, a novel method that uses graph transformers on pairwise model similarity graphs to predict the quality (accuracy) of complex structural models. By integrating single-model and multimodel quality features, GATE captures intrinsic model characteristics and intermodel geometric similarities to make robust predictions. On the dataset of the 15th Critical Assessment of Protein Structure Prediction (CASP15), GATE achieved the highest Pearson's correlation (0.748) and the lowest ranking loss (0.1191) compared with existing methods. In the blind CASP16 experiment, GATE ranked fifth based on the sum of z-scores, with a Pearson's correlation of 0.7076 (first), a Spearman's correlation of 0.4514 (fourth), a ranking loss of 0.1221 (third), and an area under the curve (AUC) score of 0.6680 (third) on per-target TM-score-based metrics. GATE also performed consistently on large in-house datasets generated by extensive AlphaFold-based sampling with MULTICOM4, confirming its robustness and practical applicability in real-world model selection scenarios.
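To make the reported evaluation metrics concrete, the following is a minimal Python sketch of how Pearson's correlation, Spearman's correlation, and ranking loss could be computed for a single target. It is not taken from the GATE code base. The function names and the toy scores are hypothetical, and the ranking loss is assumed to follow the common definition: the true TM-score of the best model in the pool minus the true TM-score of the model ranked first by predicted quality.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's correlation: Pearson's correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def ranking_loss(predicted, true):
    """Assumed definition: best true score minus true score of the top pick."""
    top_pick = max(range(len(predicted)), key=lambda i: predicted[i])
    return max(true) - true[top_pick]


# Hypothetical toy pool of four models for one target (not data from the paper).
predicted_quality = [0.80, 0.60, 0.70, 0.30]
true_tm_score = [0.75, 0.82, 0.70, 0.40]

print("Pearson:", pearson(predicted_quality, true_tm_score))
print("Spearman:", spearman(predicted_quality, true_tm_score))
print("Ranking loss:", ranking_loss(predicted_quality, true_tm_score))
```

Under this reading, a per-target evaluation would compute each metric separately for every target's model pool and then average across targets. Higher correlations and a lower ranking loss indicate better quality estimates.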
Availability and implementation: GATE is available at https://github.com/BioinfoMachineLearning/GATE.