Internet Mathematics: Latest Articles, Page 9

Multiscale Matrix Sampling and Sublinear-Time PageRank Computation

Internet Mathematics Pub Date : 2012-02-13 DOI: 10.1080/15427951.2013.802752

C. Borgs, Mickey Brautbar, J. Chayes, S. Teng

Abstract: A fundamental problem arising in many applications in Web science and social network analysis is the problem of identifying all nodes in a network whose PageRank exceeds a given threshold Δ. In this paper, we study the probabilistic version of the problem whereby given an arbitrary approximation factor c > 1, we are asked to output a set S of nodes such that with high probability, S contains all nodes of PageRank at least Δ, and no node of PageRank smaller than Δ/c. We call this problem SignificantPageRanks. We develop a nearly optimal local algorithm for the problem with time complexity Õ(n/Δ) on networks with n nodes, where the tilde hides a polylogarithmic factor. We show that every algorithm for solving this problem must have running time of Ω(n/Δ), rendering our algorithm optimal up to logarithmic factors. Our algorithm has sublinear time complexity for applications including Web crawling and Web search that require efficient identification of nodes whose PageRanks are above a threshold Δ = n^δ, for some constant 0 < δ < 1. Our algorithm comes with two main technical contributions. The first is a multiscale sampling scheme for a basic matrix problem that could be of interest on its own. For us, it appears as an abstraction of a subproblem we need to tackle in order to solve the SignificantPageRanks problem, but we hope that this abstraction will be useful in designing fast algorithms for identifying nodes that are significant beyond PageRank measurements. In the abstract matrix problem, it is assumed that one can access an unknown right-stochastic matrix by querying its rows, where the cost of a query and the accuracy of the answers depend on a precision parameter ε. At a cost proportional to 1/ε, the query will return a list of O(1/ε) entries and their indices that provide an ε-precision approximation of the row. Our task is to find a set that contains all columns whose sum is at least Δ and omits every column whose sum is less than Δ/c. Our multiscale sampling scheme solves this problem with cost Õ(n/Δ), while traditional sampling algorithms would take time Θ((n/Δ)^2). Our second main technical contribution is a new local algorithm for approximating personalized PageRank, which is more robust than the earlier ones developed in [Jeh and Widom 03, Andersen et al. 06] and is highly efficient, particularly for networks with large in-degrees or out-degrees. Together with our multiscale sampling scheme, we are able to solve the SignificantPageRanks problem optimally.

Internet Mathematics, vol. 10, pp. 20-48.
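
To make the output contract of SignificantPageRanks concrete, the sketch below computes PageRank by plain power iteration and checks whether a candidate set S satisfies the two conditions in the abstract. It is a brute-force reference, not the sublinear algorithm of the paper. The function names, the restart probability `alpha`, the uniform handling of dangling nodes, and the scaling of scores to sum to n (suggested by the threshold Δ = n^δ) are all assumptions made for illustration.

```python
def pagerank(out_links, alpha=0.15, iters=100):
    """Power iteration on an adjacency list; scores sum to n (assumed scaling)."""
    n = len(out_links)
    pr = [1.0] * n
    for _ in range(iters):
        nxt = [alpha] * n  # restart mass, spread uniformly
        for u, targets in enumerate(out_links):
            share = (1 - alpha) * pr[u]
            if targets:
                for v in targets:
                    nxt[v] += share / len(targets)
            else:
                # Dangling node: spread its mass over all nodes (assumption).
                for v in range(n):
                    nxt[v] += share / n
        pr = nxt
    return pr


def is_valid_answer(S, pr, delta, c):
    """True if S holds every node with score >= delta and none below delta / c."""
    must_include = {v for v, p in enumerate(pr) if p >= delta}
    must_exclude = {v for v, p in enumerate(pr) if p < delta / c}
    return must_include <= S and not (S & must_exclude)
```

Nodes whose score falls between Δ/c and Δ may appear in S or not; that slack is what the approximation factor c buys.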

Citations: 31

On the Hyperbolicity of Small-World and Treelike Random Graphs

Internet Mathematics Pub Date : 2012-01-09 DOI: 10.1080/15427951.2013.828336

Wei Chen, Wenjie Fang, Guangda Hu, Michael W. Mahoney

Abstract: Hyperbolicity is a property of a graph that may be viewed as a "soft" version of a tree, and recent empirical and theoretical work has suggested that many graphs arising in Internet and related data applications have hyperbolic properties. Here we consider Gromov's notion of δ-hyperbolicity and establish several positive and negative results for small-world and treelike random graph models. First, we study the hyperbolicity of the class of Kleinberg small-world random graphs, where n is the number of vertices in the graph, d is the dimension of the underlying base grid B, and γ is the small-world parameter such that each node u in the graph connects to another node v in the graph with probability proportional to 1/d_B(u, v)^γ, with d_B(u, v) the grid distance from u to v in the base grid B. We show that when γ = d, the parameter value allowing efficient decentralized routing in Kleinberg's small-world network, the hyperbolic δ is […] with probability 1−o(1) for every ε > 0 independent of n. We see that hyperbolicity is not significantly improved in relation to graph diameter even when the long-range connections greatly improve decentralized navigation. We also show that for other values of γ, the hyperbolic δ is very close to the graph diameter, indicating poor hyperbolicity in these graphs as well. Next we study a class of treelike graphs called ringed trees that have constant hyperbolicity. We show that adding random links among the leaves in a manner similar to the small-world graph constructions may easily destroy the hyperbolicity of the graphs, except for a class of random edges added using an exponentially decaying probability function based on the ring distance among the leaves. Our study provides one of the first significant analytic results on the hyperbolicity of a rich class of random graphs, which sheds light on the relationship between hyperbolicity and navigability of random graphs, as well as on the sensitivity of hyperbolic δ to noise in random graphs.

Internet Mathematics, vol. 9, pp. 434-491.
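
As a rough illustration of what δ-hyperbolicity measures, the sketch below evaluates the four-point condition by brute force on a small connected, unweighted graph. The abstract does not say which of the standard definitions the paper uses (they agree up to constant factors), so the four-point form, the function names, and the dictionary-based adjacency format are assumptions. The O(n^4) loop is only practical for tiny graphs.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, src):
    """Shortest-path distances from src in an unweighted graph."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def four_point_delta(adj):
    """Largest (max pair sum - second largest pair sum) / 2 over all 4-tuples.

    Assumes adj is a dict {node: neighbours} describing a connected graph.
    """
    d = {u: bfs_distances(adj, u) for u in adj}
    delta = 0.0
    for x, y, z, w in combinations(adj, 4):
        sums = sorted([d[x][y] + d[z][w],
                       d[x][z] + d[y][w],
                       d[x][w] + d[y][z]])
        delta = max(delta, (sums[2] - sums[1]) / 2)
    return delta
```

Under this definition a tree has δ = 0, while a cycle on four vertices has δ = 1, which matches the intuition that hyperbolicity measures how far a graph is from being treelike.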

Citations: 82

Editorial Board EOV

Internet Mathematics Pub Date : 2011-11-28 DOI: 10.1080/15427951.2011.630923

Citations: 0

Extension and Robustness of Transitivity Clustering for Protein–Protein Interaction Network Analysis

Internet Mathematics Pub Date : 2011-11-28 DOI: 10.1080/15427951.2011.604559

T. Wittkop, S. Rahmann, Richard Röttger, Sebastian Böcker, J. Baumbach

Abstract: Partitioning biological data objects into groups such that the objects within the groups share common traits is a longstanding challenge in computational biology. Recently, we developed and established transitivity clustering, a partitioning approach based on weighted transitive graph projection that utilizes a single similarity threshold as density parameter. In previous publications, we concentrated on the graphical user interface and on concrete biomedical application protocols. Here, we contribute the following theoretical considerations: (1) We provide proofs that the average similarity between objects from the same cluster is above the user-given threshold and that the average similarity between objects from different clusters is below the threshold. (2) We extend transitivity clustering to an overlapping clustering tool by integrating two new approaches. (3) We demonstrate the power of transitivity clustering for protein-complex detection. We evaluate our approaches against others by utilizing gold-standard data that was previously used by Brohée et al. for reviewing existing bioinformatics clustering tools. The extended version of this article is available online at http://transclust.mpi-inf.mpg.de.

Internet Mathematics, vol. 7, pp. 255-273.

Citations: 11

Using Biological Networks in Protein Function Prediction and Gene Expression Analysis

Internet Mathematics Pub Date : 2011-11-28 DOI: 10.1080/15427951.2011.604561

L. Wong

Citations: 3

KeyPathwayMiner: Detecting Case-Specific Biological Pathways Using Expression Data

Internet Mathematics Pub Date : 2011-11-28 DOI: 10.1080/15427951.2011.604548

N. Alcaraz, Hande Küçük, Jochen Weile, A. Wipat, J. Baumbach

Abstract: Recent advances in systems biology have provided us with massive amounts of pathway data that describe the interplay of genes and their products. The resulting biological networks can be modeled as graphs. By means of "omics" technologies, such as microarrays, the activity of genes and proteins can be measured. Here, data from microarray experiments is integrated with the network data to gain deeper insights into gene expression. We introduce KeyPathwayMiner, a method that enables the extraction and visualization of interesting subpathways given the results of a series of gene expression studies. We aim to detect highly connected subnetworks in which most genes or proteins show similar patterns of expression. Specifically, given network and gene expression data, KeyPathwayMiner identifies those maximal subgraphs where all but k nodes of the subnetwork are expressed similarly in all but l cases in the gene expression data. Since identifying these subgraphs is computationally intensive, we developed a heuristic algorithm based on Ant Colony Optimization. We implemented KeyPathwayMiner as a plug-in for Cytoscape. Our computational model is related to a strategy presented by Ulitsky et al. in 2008. Consequently, we used the same data sets for evaluation. KeyPathwayMiner is available online at http://keypathwayminer.mpi-inf.mpg.de.

Internet Mathematics, vol. 7, pp. 299-313.

Citations: 55

Googling the Brain: Discovering Hierarchical and Asymmetric Network Structures, with Applications in Neuroscience

Internet Mathematics Pub Date : 2011-11-28 DOI: 10.1080/15427951.2011.604284

J. J. Crofts, D. Higham

Abstract: Hierarchical organization is a common feature of many directed networks arising in nature and technology. For example, a well-defined message-passing framework based on managerial status typically exists in a business organization. However, in many real-world networks, such patterns of hierarchy are unlikely to be quite so transparent. Due to the nature in which empirical data are collated, the nodes will often be ordered so as to obscure any underlying structure. In addition, the possibility of even a small number of links violating any overall "chain of command" makes the determination of such structures extremely challenging. Here we address the issue of how to reorder a directed network to reveal this type of hierarchy. In doing so, we also look at the task of quantifying the level of hierarchy, given a particular node ordering. We look at a variety of approaches. Using ideas from the graph Laplacian literature, we show that a relevant discrete optimization problem leads to a natural hierarchical node ranking. We also show that this ranking arises via a maximum likelihood problem associated with a new range-dependent hierarchical random-graph model. This random-graph insight allows us to compute a likelihood ratio that quantifies the overall tendency for a given network to be hierarchical. We also develop a generalization of this node-ordering algorithm based on the combinatorics of directed walks. In passing, we note that Google's PageRank algorithm tackles a closely related problem, and may also be motivated from a combinatoric, walk-counting viewpoint. We illustrate the performance of the resulting algorithms on synthetic network data, and on a real-world network from neuroscience where results may be validated biologically.

Internet Mathematics, vol. 7, pp. 233-254.

Citations: 27

On the Approximability of Reachability-Preserving Network Orientations

Internet Mathematics Pub Date : 2011-11-28 DOI: 10.1080/15427951.2011.604554

Michael Elberfeld, V. Bafna, Iftah Gamzu, Alexander Medvedovsky, D. Segev, Dana Silverbush, Uri Zwick, R. Sharan

Citations: 6

Introduction to the Special Issue on Biological Networks

Internet Mathematics Pub Date : 2011-11-28 DOI: 10.1080/15427951.2011.621769

Natasa Przulj

Abstract: In this special issue on biological networks, we aim to interest the readership of Internet Mathematics in network theory applied to bioinformatics. Network biology is a new and emerging research area that is fast-growing, spurred by the collection of biological data representing connections or interactions of molecules in the cell. As such, it has the potential to have at least as profound an impact on our understanding of the cell as sequence data has had. However, the datasets are large and noisy, and many graph-theoretic problems are formally intractable (impossible to solve exactly in any time less than the age of the universe), and so heuristic approximations must be developed in an attempt to find approximate solutions. Furthermore, the tools developed to solve these problems must be made accessible to biological practitioners. In this direction, this issue contains papers on the many databases available, theoretical and algorithmic advances in analyzing these data, as well as papers on some specific biomedical applications, and two papers introducing software tools. This issue presents six papers from some of the leading research groups in the area. Three papers present significant theoretical advances in techniques. Two of them (Elberfeld et al.; Crofts and Higham) look at directed graphs. First, Elberfeld et al. attack the "maximum graph orientation problem", in which, given a list of source-sink pairs of nodes, we attempt to add direction to an undirected graph in such a way as to maximize the number of pairs for which directed paths exist from the source to the sink. This has applications in the problem of learning biological pathways, but Elberfeld et al. show that the problem is NP-hard

Internet Mathematics, vol. 7, pp. 207-208.

Citations: 2

NAViGaTOR: Large Scalable and Interactive Navigation and Analysis of Large Graphs

Internet Mathematics Pub Date : 2011-11-28 DOI: 10.1080/15427951.2011.604289

A. Djebbari, Muhammad Ali, D. Otasek, M. Kotlyar, Kristen Fortney, Serene W. H. Wong, A. Hrvojic, I. Jurisica

Citations: 14