A random walk based approach for improving protein-protein interaction network and protein complex prediction

Chengwei Lei, Jianhua Ruan
2012 IEEE International Conference on Bioinformatics and Biomedicine Workshops, pp. 1-6. Published 2012-10-04. DOI: 10.1109/BIBM.2012.6392693 (https://doi.org/10.1109/BIBM.2012.6392693)
Citations: 4

Abstract

Recent advances in high-throughput technology have dramatically increased the quantity of available protein-protein interaction (PPI) data and stimulated the development of many methods for predicting protein complexes, which are important for understanding the functional organization of PPI networks in different biological processes. However, automated protein complex prediction from PPI data alone is significantly hindered by the high level of noise, the sparseness, and the highly skewed degree distribution of PPI networks. Here we present a novel network topology-based algorithm that removes spurious interactions and recovers missing ones by computational prediction, and that increases the accuracy of protein complex prediction by reducing the impact of hub nodes. The key idea of our algorithm is that two proteins sharing high-order topological similarities, as measured by a novel random walk-based procedure, are likely to interact with each other and may belong to the same protein complex. Applying our algorithm to a yeast PPI network, we found that the interactions in the reconstructed network have more significant biological relevance than those in the original network, as assessed by multiple types of information, including gene ontology, gene expression, essentiality, conservation between species, and known protein complexes. Comparison with several existing methods shows that the network reconstructed by our method has the highest quality. Finally, using two independent graph clustering algorithms, we found that the reconstructed network significantly improves the prediction accuracy of protein complexes.
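The abstract does not spell out the random walk-based procedure, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a plain random walk with restart, a cosine similarity between the resulting visiting-probability profiles (ignoring the two proteins' own entries so that a direct edge does not inflate its own score), and a simple "keep the top-scoring pairs" rule. The function names, the restart value of 0.5, and the seven-protein toy network are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def rwr_profile(adj, start, restart=0.5, n_iter=100):
    """Approximate visiting probabilities of a random walk with restart.

    adj maps each node to the set of its neighbours (undirected graph).
    """
    p = {v: 0.0 for v in adj}
    p[start] = 1.0
    for _ in range(n_iter):
        nxt = {v: 0.0 for v in adj}
        for u, nbrs in adj.items():
            if not nbrs:
                # Isolated node: the walker has nowhere to go, so it stays.
                nxt[u] += (1.0 - restart) * p[u]
                continue
            share = (1.0 - restart) * p[u] / len(nbrs)
            for v in nbrs:
                nxt[v] += share
        # With probability `restart` the walker jumps back to the start node.
        nxt[start] += restart
        p = nxt
    return p


def profile_similarity(pu, pv, u, v):
    """Cosine similarity of two profiles, ignoring the entries for u and v.

    Dropping the endpoints' own entries is an assumption of this sketch.
    """
    keys = [k for k in pu if k not in (u, v)]
    dot = sum(pu[k] * pv[k] for k in keys)
    norm = sqrt(sum(pu[k] ** 2 for k in keys)) * sqrt(sum(pv[k] ** 2 for k in keys))
    return dot / norm if norm > 0 else 0.0


def reconstruct(adj, n_edges, restart=0.5):
    """Return the n_edges node pairs with the most similar walk profiles."""
    profiles = {v: rwr_profile(adj, v, restart) for v in adj}
    scored = sorted(
        (
            (profile_similarity(profiles[u], profiles[v], u, v), u, v)
            for u, v in combinations(sorted(adj), 2)
        ),
        reverse=True,
    )
    return [(u, v) for _, u, v in scored[:n_edges]]


# Hypothetical toy network: complex {A, B, C, D} with the A-B interaction
# missing, complex {E, F, G}, and one spurious interaction D-E between them.
ppi = {
    "A": {"C", "D"},
    "B": {"C", "D"},
    "C": {"A", "B", "D"},
    "D": {"A", "B", "C", "E"},
    "E": {"D", "F", "G"},
    "F": {"E", "G"},
    "G": {"E", "F"},
}

kept = reconstruct(ppi, n_edges=9)
print(kept)
```

On this toy network the pair A-B, which has no direct edge but identical neighbours, scores above the bridging edge D-E, so the reconstructed edge list recovers the former and drops the latter. How well this carries over to a real PPI network depends on the restart probability, the similarity measure, and the cut-off, all of which are choices of this sketch rather than details taken from the paper.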