{"title":"Revisiting Local PageRank Estimation on Undirected Graphs: Simple and Optimal","authors":"Hanzhi Wang","doi":"arxiv-2409.08978","DOIUrl":"https://doi.org/arxiv-2409.08978","url":null,"abstract":"We propose a simple and optimal algorithm, BackMC, for local PageRank estimation in undirected graphs: given an arbitrary target node $t$ in an undirected graph $G$ comprising $n$ nodes and $m$ edges, BackMC accurately estimates the PageRank score of node $t$ while assuring a small relative error and a high success probability. The worst-case computational complexity of BackMC is upper bounded by $O\\left(\\frac{1}{d_{\\mathrm{min}}}\\cdot \\min\\left(d_t, m^{1/2}\\right)\\right)$, where $d_{\\mathrm{min}}$ denotes the minimum degree of $G$, and $d_t$ denotes the degree of $t$, respectively. Compared to the previously best upper bound of $O\\left(\\log{n}\\cdot \\min\\left(d_t, m^{1/2}\\right)\\right)$ (VLDB '23), which is derived from a significantly more complex algorithm and analysis, our BackMC improves the computational complexity for this problem by a factor of $\\Theta\\left(\\frac{\\log{n}}{d_{\\mathrm{min}}}\\right)$ with a much simpler algorithm. Furthermore, we establish a matching lower bound of $\\Omega\\left(\\frac{1}{d_{\\mathrm{min}}}\\cdot \\min\\left(d_t, m^{1/2}\\right)\\right)$ for any algorithm that attempts to solve the problem of local PageRank estimation, demonstrating the theoretical optimality of our BackMC. We conduct extensive experiments on various large-scale real-world and synthetic graphs, where BackMC consistently shows superior performance.","PeriodicalId":501525,"journal":{"name":"arXiv - CS - Data Structures and Algorithms","volume":"51 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142263335","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Simple 4-Approximation Algorithm for Maximum Agreement Forests on Multiple Unrooted Binary Trees","authors":"Jordan Dempsey, Leo van Iersel, Mark Jones, Norbert Zeh","doi":"arxiv-2409.08440","DOIUrl":"https://doi.org/arxiv-2409.08440","url":null,"abstract":"We present a simple 4-approximation algorithm for computing a maximum agreement forest of multiple unrooted binary trees. This algorithm applies LP rounding to an extension of a recent ILP formulation of the maximum agreement forest problem on two trees by Van Wersch et al. We achieve the same approximation ratio as the algorithm of Chen et al. but our algorithm is extremely simple. We also prove that no algorithm based on the ILP formulation by Van Wersch et al. can achieve an approximation ratio of $4 - \\varepsilon$, for any $\\varepsilon > 0$, even on two trees. To this end, we prove that the integrality gap of the ILP approaches 4 as the size of the two input trees grows.","PeriodicalId":501525,"journal":{"name":"arXiv - CS - Data Structures and Algorithms","volume":"18 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142263334","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fine-Grained Complexity of Multiple Domination and Dominating Patterns in Sparse Graphs","authors":"Marvin Künnemann, Mirza Redzic","doi":"arxiv-2409.08037","DOIUrl":"https://doi.org/arxiv-2409.08037","url":null,"abstract":"The study of domination in graphs has led to a variety of domination problems studied in the literature. Most of these follow the following general framework: Given a graph $G$ and an integer $k$, decide if there is a set $S$ of $k$ vertices such that (1) some inner property $\\phi(S)$ (e.g., connectedness) is satisfied, and (2) each vertex $v$ satisfies some domination property $\\rho(S, v)$ (e.g., there is an $s\\in S$ that is adjacent to $v$). Since many real-world graphs are sparse, we seek to determine the optimal running time of such problems in both the number $n$ of vertices and the number $m$ of edges in $G$. While the classic dominating set problem admits a rather limited improvement in sparse graphs (Fischer, Künnemann, Redzic SODA'24), we show that natural variants studied in the literature admit much larger speed-ups, with a diverse set of possible running times. Specifically, we obtain conditionally optimal algorithms for: 1) $r$-Multiple $k$-Dominating Set (each vertex must be adjacent to at least $r$ vertices in $S$): If $r\\le k-2$, we obtain a running time of $(m/n)^{r} n^{k-r+o(1)}$ that is conditionally optimal assuming the 3-uniform hyperclique hypothesis. In sparse graphs, this fully interpolates between $n^{k-1\\pm o(1)}$ and $n^{2\\pm o(1)}$, depending on $r$. Curiously, when $r=k-1$, we obtain a randomized algorithm beating $(m/n)^{k-1} n^{1+o(1)}$ and we show that this algorithm is close to optimal under the $k$-clique hypothesis. 2) $H$-Dominating Set ($S$ must induce a pattern $H$). We conditionally settle the complexity of three such problems: (a) Dominating Clique ($H$ is a $k$-clique), (b) Maximal Independent Set of size $k$ ($H$ is an independent set on $k$ vertices), (c) Dominating Induced Matching ($H$ is a perfect matching on $k$ vertices).","PeriodicalId":501525,"journal":{"name":"arXiv - CS - Data Structures and Algorithms","volume":"9 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142202618","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Computing the LZ-End parsing: Easy to implement and practically efficient","authors":"Patrick Dinklage","doi":"arxiv-2409.07840","DOIUrl":"https://doi.org/arxiv-2409.07840","url":null,"abstract":"The LZ-End parsing [Kreft & Navarro, 2011] of an input string yields compression competitive with the popular Lempel-Ziv 77 scheme, but also allows for efficient random access. Kempa and Kosolobov showed that the parsing can be computed in time and space linear in the input length [Kempa & Kosolobov, 2017], however, the corresponding algorithm is hardly practical. We put the spotlight on their suboptimal algorithm that computes the parsing in time $\\mathcal{O}(n \\lg\\lg n)$. It requires a comparatively small toolset and is therefore easy to implement, but at the same time very efficient in practice. We give a detailed and simplified description with a full listing that incorporates undocumented tricks from the original implementation, but also uses lazy evaluation to reduce the workload in practice and requires less working memory by removing a level of indirection. We legitimize our algorithm in a brief benchmark, obtaining the parsing faster than the state of the art.","PeriodicalId":501525,"journal":{"name":"arXiv - CS - Data Structures and Algorithms","volume":"2 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142202620","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards Instance-Optimal Euclidean Spanners","authors":"Hung Le, Shay Solomon, Cuong Than, Csaba D. Tóth, Tianyi Zhang","doi":"arxiv-2409.08227","DOIUrl":"https://doi.org/arxiv-2409.08227","url":null,"abstract":"Euclidean spanners are important geometric objects that have been extensively studied since the 1980s. The two most basic \"compactness\" measures of a Euclidean spanner $E$ are the size (number of edges) $|E|$ and the weight (sum of edge weights) $\\|E\\|$. In this paper, we initiate the study of instance optimal Euclidean spanners. Our results are two-fold. We demonstrate that the greedy spanner is far from being instance optimal, even when allowing its stretch to grow. More concretely, we design two hard instances of point sets in the plane, where the greedy $(1+x \\epsilon)$-spanner (for basically any parameter $x \\geq 1$) has $\\Omega_x(\\epsilon^{-1/2}) \\cdot |E_\\mathrm{spa}|$ edges and weight $\\Omega_x(\\epsilon^{-1}) \\cdot \\|E_\\mathrm{light}\\|$, where $E_\\mathrm{spa}$ and $E_\\mathrm{light}$ denote the per-instance sparsest and lightest $(1+\\epsilon)$-spanners, respectively, and the $\\Omega_x$ notation suppresses a polynomial dependence on $1/x$. As our main contribution, we design a new construction of Euclidean spanners, which is inherently different from known constructions, achieving the following bounds: a stretch of $1+\\epsilon\\cdot 2^{O(\\log^*(d/\\epsilon))}$ with $O(1) \\cdot |E_\\mathrm{spa}|$ edges and weight $O(1) \\cdot \\|E_\\mathrm{light}\\|$. In other words, we show that a slight increase to the stretch suffices for obtaining instance optimality up to an absolute constant for both sparsity and lightness. Remarkably, there is only a log-star dependence on the dimension in the stretch, and there is no dependence on it whatsoever in the number of edges and weight.","PeriodicalId":501525,"journal":{"name":"arXiv - CS - Data Structures and Algorithms","volume":"106 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142202623","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Static Pricing for Single Sample Multi-unit Prophet Inequalities","authors":"Pranav Nuti, Peter Westbrook","doi":"arxiv-2409.07719","DOIUrl":"https://doi.org/arxiv-2409.07719","url":null,"abstract":"In this paper, we study $k$-unit single sample prophet inequalities. A seller has $k$ identical, indivisible items to sell. A sequence of buyers arrive one-by-one, with each buyer's private value for the item, $X_i$, revealed to the seller when they arrive. While the seller is unaware of the distribution from which $X_i$ is drawn, they have access to a single sample, $Y_i$ drawn from the same distribution as $X_i$. What strategies can the seller adopt so as to maximize social welfare? Previous work has demonstrated that when $k = 1$, if the seller sets a price equal to the maximum of the samples, they can achieve a competitive ratio of $\\frac{1}{2}$ of the social welfare, and recently Pashkovich and Sayutina established an analogous result for $k = 2$. In this paper, we prove that for $k \\geq 3$, setting a (static) price equal to the $k^{\\text{th}}$ largest sample also obtains a competitive ratio of $\\frac{1}{2}$, resolving a conjecture Pashkovich and Sayutina pose. We then consider the situation where $k$ is large. We demonstrate that setting a price equal to the $(k-\\sqrt{2k\\log k})^{\\text{th}}$ largest sample obtains a competitive ratio of $1 - \\sqrt{\\frac{2\\log k}{k}} - o\\left(\\sqrt{\\frac{\\log k}{k}}\\right)$, and that this is the optimal possible ratio achievable with a static pricing scheme that sets one of the samples as a price.","PeriodicalId":501525,"journal":{"name":"arXiv - CS - Data Structures and Algorithms","volume":"9 41 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142227801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Maximum And- vs. Even-SAT","authors":"Tamio-Vesa Nakajima, Stanislav Živný","doi":"arxiv-2409.07837","DOIUrl":"https://doi.org/arxiv-2409.07837","url":null,"abstract":"A (multi)set of literals, called a clause, is strongly satisfied by an assignment if no literal evaluates to false. Finding an assignment that maximises the number of strongly satisfied clauses is NP-hard. We present a simple algorithm that finds, given a set of clauses that admits an assignment that strongly satisfies a $\\rho$-fraction of the clauses, an assignment in which at least a $\\rho$-fraction of the clauses is weakly satisfied, in the sense that an even number of literals evaluates to false. In particular, this implies an efficient algorithm for finding an undirected cut of value $\\rho$ in a graph given that a directed cut of value $\\rho$ in the graph is promised to exist.","PeriodicalId":501525,"journal":{"name":"arXiv - CS - Data Structures and Algorithms","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142202622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Optimal Algorithm for Sorting Pattern-Avoiding Sequences","authors":"Michal Opler","doi":"arxiv-2409.07868","DOIUrl":"https://doi.org/arxiv-2409.07868","url":null,"abstract":"We present a deterministic comparison-based algorithm that sorts sequences avoiding a fixed permutation $\\pi$ in linear time, even if $\\pi$ is a priori unknown. Moreover, the dependence of the multiplicative constant on the pattern $\\pi$ matches the information-theoretic lower bound. A crucial ingredient is an algorithm for performing efficient multi-way merge based on the Marcus-Tardos theorem. As a direct corollary, we obtain a linear-time algorithm for sorting permutations of bounded twin-width.","PeriodicalId":501525,"journal":{"name":"arXiv - CS - Data Structures and Algorithms","volume":"402 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142202619","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Simple Linear Space Data Structure for ANN with Application in Differential Privacy","authors":"Martin Aumüller, Fabrizio Boninsegna, Francesco Silvestri","doi":"arxiv-2409.07187","DOIUrl":"https://doi.org/arxiv-2409.07187","url":null,"abstract":"Locality Sensitive Filters are known for offering a quasi-linear space data structure with rigorous guarantees for the Approximate Near Neighbor search problem. Building on Locality Sensitive Filters, we derive a simple data structure for the Approximate Near Neighbor Counting problem under differential privacy. Moreover, we provide a simple analysis leveraging a connection with concomitant statistics and extreme value theory. Our approach achieves the same performance as the recent findings of Andoni et al. (NeurIPS 2023) but with a more straightforward method. As a side result, the paper provides a more compact description and analysis of Locality Sensitive Filters for Approximate Near Neighbor Search under inner product similarity, improving a previous result in Aumüller et al. (TODS 2022).","PeriodicalId":501525,"journal":{"name":"arXiv - CS - Data Structures and Algorithms","volume":"9 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142202626","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dual-Step Optimization for Binary Sequences with High Merit Factors","authors":"Blaž Pšeničnik, Rene Mlinarič, Janez Brest, Borko Bošković","doi":"arxiv-2409.07222","DOIUrl":"https://doi.org/arxiv-2409.07222","url":null,"abstract":"The problem of finding aperiodic low auto-correlation binary sequences (LABS) presents a significant computational challenge, particularly as the sequence length increases. Such sequences have important applications in communication engineering, physics, chemistry, and cryptography. This paper introduces a novel dual-step algorithm for long binary sequences with high merit factors. The first step employs a parallel algorithm utilizing skew-symmetry and restriction classes to generate sequence candidates with merit factors above a predefined threshold. The second step uses a priority queue algorithm to refine these candidates further, searching the entire search space unrestrictedly. By combining GPU-based parallel computing and dual-step optimization, our approach has successfully identified new best-known binary sequences for all lengths ranging from 450 to 527, with the exception of length 518, where the previous best-known value was matched with a different sequence. This hybrid method significantly outperforms traditional exhaustive and stochastic search methods, offering an efficient solution for finding long sequences with good merit factors.","PeriodicalId":501525,"journal":{"name":"arXiv - CS - Data Structures and Algorithms","volume":"20 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142202624","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}