Topological Machine Learning for Protein-Nucleic Acid Binding Affinity Changes Upon Mutation
Xiang Liu, Junjie Wee, Guo-Wei Wei
ArXiv, published 2025-05-28. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12148091/pdf/
Abstract
Understanding how protein mutations affect protein-nucleic acid binding is critical for unraveling disease mechanisms and advancing therapies. Current experimental approaches are laborious, and computational methods remain limited in accuracy. To address this challenge, we propose a novel topological machine learning model (TopoML) that combines the persistent Laplacian (from topological data analysis) with multi-perspective features: physicochemical properties, topological structures, and protein Transformer-derived sequence embeddings. This integrative framework captures robust representations of protein-nucleic acid binding interactions. To validate the proposed method, we use two datasets: a protein-DNA dataset with 596 single-point amino acid mutations and a protein-RNA dataset with 710 single-point amino acid mutations. We show that the proposed TopoML model outperforms state-of-the-art methods in predicting mutation-induced binding affinity changes for both protein-DNA and protein-RNA complexes.
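To give a rough feel for the kind of spectral-topological descriptor the abstract refers to, the sketch below computes a much simpler quantity than a full persistent Laplacian: the spectrum of an ordinary graph Laplacian built on a point cloud at several distance cutoffs. It is not the authors' pipeline. The function name `laplacian_spectrum_features`, the toy coordinates, and the chosen radii are all hypothetical. The count of zero eigenvalues equals the number of connected components at that scale, and the smallest nonzero eigenvalue carries geometric information beyond that count.

```python
import numpy as np

def laplacian_spectrum_features(points, radii, tol=1e-8):
    """Hypothetical helper: for each cutoff radius, connect points whose
    distance is at most the radius and summarize the graph Laplacian spectrum.

    Returns a list of (num_zero_eigenvalues, smallest_nonzero_eigenvalue).
    """
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distance matrix.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)

    features = []
    for r in radii:
        # Adjacency at this filtration scale (no self-loops).
        adj = (dist <= r).astype(float)
        np.fill_diagonal(adj, 0.0)
        # Combinatorial graph Laplacian L = D - A.
        lap = np.diag(adj.sum(axis=1)) - adj
        eig = np.linalg.eigvalsh(lap)
        # Zero eigenvalues count connected components.
        n_zero = int(np.sum(eig < tol))
        nonzero = eig[eig >= tol]
        lam_min = float(nonzero.min()) if nonzero.size else 0.0
        features.append((n_zero, lam_min))
    return features

# Toy point cloud (made-up coordinates, not real atomic positions).
points = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [5, 5, 5]]
feats = laplacian_spectrum_features(points, radii=[0.5, 1.2, 10.0])
for radius, (n_zero, lam_min) in zip([0.5, 1.2, 10.0], feats):
    print(f"r={radius}: components={n_zero}, smallest nonzero eigenvalue={lam_min:.3f}")
```

In a model of the kind the abstract describes, descriptors like these would presumably be concatenated with physicochemical features and sequence embeddings before being passed to a learner; how TopoML does this is not stated in the abstract.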