Authors: Minghui Liu, Ying Qu
Journal: Engineering Reports, 7(5), published 2025-05-07
DOI: 10.1002/eng2.70164
URL: https://onlinelibrary.wiley.com/doi/10.1002/eng2.70164
Citations: 0
A Local–Global Graph KAN for Multi-Class Prediction of PPI
Traditional experimental methods for identifying protein–protein interactions (PPIs) are expensive and time-consuming. Machine learning offers an alternative by treating the prediction of multiple PPI types as a set of binary classification tasks, but this formulation suffers from data imbalance. The proposed GLGKAN-PPI method integrates features from the global graph and from local subgraphs to capture the complex structural information of PPI networks. Specifically, the method uses the pre-trained model MASSA to extract multimodal protein features. Global graph features are extracted with GKAN (Graph Kolmogorov-Arnold Network), and local subgraph features are extracted with MOE-GKAN (Mixture of Experts-Graph Kolmogorov-Arnold Network). To mitigate data imbalance, an asymmetric loss function is used to better handle minority classes and improve overall prediction accuracy. Experimental results show that GLGKAN-PPI outperforms a range of existing methods across multiple datasets and partitioning strategies.
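The abstract does not say which asymmetric loss GLGKAN-PPI uses. As a rough illustration only, the sketch below assumes the asymmetric loss (ASL) formulation widely used for multi-label classification, in which positives and negatives get separate focusing exponents and easy negatives are discarded through a probability margin. The function name, the default hyperparameters (`gamma_pos`, `gamma_neg`, `margin`), and the four-type example are assumptions made for this sketch and are not taken from the paper.

```python
import math


def asymmetric_loss(probs, targets, gamma_pos=0.0, gamma_neg=4.0,
                    margin=0.05, eps=1e-8):
    """Mean asymmetric loss over the interaction types of one protein pair.

    probs   : predicted probability per interaction type, each in [0, 1]
    targets : 1 if the pair has that interaction type, otherwise 0
    All hyperparameter defaults are illustrative assumptions.
    """
    total = 0.0
    for p, y in zip(probs, targets):
        if y == 1:
            # Positive term: focal-style weight with a small exponent, so
            # rare positive labels keep most of their gradient.
            total += -((1.0 - p) ** gamma_pos) * math.log(max(p, eps))
        else:
            # Negative term: shift the probability down by the margin so
            # very easy negatives contribute nothing, then down-weight the
            # remaining negatives with a larger exponent.
            p_m = max(p - margin, 0.0)
            total += -(p_m ** gamma_neg) * math.log(max(1.0 - p_m, eps))
    return total / len(probs)


# Hypothetical example: four interaction types, only the second is present.
example_probs = [0.10, 0.30, 0.02, 0.60]
example_targets = [0, 1, 0, 0]
example_loss = asymmetric_loss(example_probs, example_targets)
```

With `gamma_pos = gamma_neg = 0` and `margin = 0` the function reduces to ordinary binary cross-entropy, so the three hyperparameters are what make positives and negatives count unequally. Under the assumed defaults, a missed positive is penalized more heavily than an equally wrong negative, which is the behavior one would want when most interaction types are absent for most protein pairs.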