Massimo Equi, Veli Mäkinen, Alexandru I. Tomescu, Roberto Grossi
{"title":"On the Complexity of String Matching for Graphs","authors":"Massimo Equi, Veli Mäkinen, Alexandru I. Tomescu, Roberto Grossi","doi":"https://dl.acm.org/doi/10.1145/3588334","DOIUrl":"https://doi.org/https://dl.acm.org/doi/10.1145/3588334","url":null,"abstract":"<p>Exact string matching in labeled graphs is the problem of searching paths of a graph <i>G=(V, E)</i> such that the concatenation of their node labels is equal to a given pattern string <i>P</i>[1..<i>m</i>]. This basic problem can be found at the heart of more complex operations on variation graphs in computational biology, of query operations in graph databases, and of analysis operations in heterogeneous networks.</p><p>We prove a conditional lower bound stating that, for any constant ε > 0, an <i>O</i>(|<i>E</i>|<sup>1 - ε</sup> <i>m</i>) time, or an <i>O</i>(|<i>E</i>| <i>m</i><sup>1 - ε</sup>) time algorithm for exact string matching in graphs, with node labels and pattern drawn from a binary alphabet, cannot be achieved unless the Strong Exponential Time Hypothesis (<sans-serif>SETH</sans-serif>) is false. This holds even if restricted to undirected graphs with maximum node degree 2—that is, to <i>zig-zag matching in bidirectional strings</i>, or to <i>deterministic</i> directed acyclic graphs whose nodes have maximum sum of indegree and outdegree 3. These restricted cases make the lower bound stricter than what can be directly derived from related bounds on regular expression matching (Backurs and Indyk, FOCS’16). In fact, our bounds are tight in the sense that lowering the degree or the alphabet size yields linear time solvable problems.</p><p>An interesting corollary is that exact and approximate matching are equally hard (i.e., quadratic time) in graphs under <sans-serif>SETH</sans-serif>. 
In comparison, the same problems restricted to strings have linear time vs quadratic time solutions, respectively (approximate pattern matching having also a matching <sans-serif>SETH</sans-serif> lower bound (Backurs and Indyk, STOC’15)).</p><p></p>","PeriodicalId":50922,"journal":{"name":"ACM Transactions on Algorithms","volume":"8 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2023-04-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138494892","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hopcroft’s Problem, Log-Star Shaving, 2D Fractional Cascading, and Decision Trees","authors":"Timothy M. Chan, Da Wei Zheng","doi":"10.1145/3591357","DOIUrl":"https://doi.org/10.1145/3591357","url":null,"abstract":"We revisit Hopcroft’s problem and related fundamental problems about geometric range searching. Given n points and n lines in the plane, we show how to count the number of point-line incidence pairs or the number of point-above-line pairs in O(n^{4/3}) time, which matches the conjectured lower bound and improves the best previous time bound of n^{4/3} 2^{O(log* n)} obtained almost 30 years ago by Matoušek. We describe two interesting and different ways to achieve the result: the first is randomized and uses a new 2D version of fractional cascading for arrangements of lines; the second is deterministic and uses decision trees in a manner inspired by the sorting technique of Fredman (1976). The second approach extends to any constant dimension. Many consequences follow from these new ideas: for example, we obtain an O(n^{4/3})-time algorithm for line segment intersection counting in the plane, O(n^{4/3})-time randomized algorithms for distance selection in the plane and bichromatic closest pair and Euclidean minimum spanning tree in three or four dimensions, and a randomized data structure for halfplane range counting in the plane with O(n^{4/3}) preprocessing time and space and O(n^{1/3}) query time.","PeriodicalId":50922,"journal":{"name":"ACM Transactions on Algorithms","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134994109","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hopcroft’s Problem, Log-Star Shaving, 2D Fractional Cascading, and Decision Trees","authors":"Timothy M. Chan, Da Wei Zheng","doi":"https://dl.acm.org/doi/10.1145/3591357","DOIUrl":"https://doi.org/https://dl.acm.org/doi/10.1145/3591357","url":null,"abstract":"<p>We revisit Hopcroft’s problem and related fundamental problems about geometric range searching. Given <i>n</i> points and <i>n</i> lines in the plane, we show how to count the number of point-line incidence pairs or the number of point-above-line pairs in <i>O</i>(<i>n</i><sup>4/3</sup>) time, which matches the conjectured lower bound and improves the best previous time bound of <i>n</i><sup>4/3</sup>2<sup><i>O</i>(log<sup>*</sup> <i>n</i>)</sup> obtained almost 30 years ago by Matoušek. </p><p>We describe two interesting and different ways to achieve the result: the first is randomized and uses a new 2D version of fractional cascading for arrangements of lines; the second is deterministic and uses decision trees in a manner inspired by the sorting technique of Fredman (1976). The second approach extends to any constant dimension. 
</p><p>Many consequences follow from these new ideas: for example, we obtain an <i>O</i>(<i>n</i><sup>4/3</sup>)-time algorithm for line segment intersection counting in the plane, <i>O</i>(<i>n</i><sup>4/3</sup>)-time randomized algorithms for distance selection in the plane and bichromatic closest pair and Euclidean minimum spanning tree in three or four dimensions, and a randomized data structure for halfplane range counting in the plane with <i>O</i>(<i>n</i><sup>4/3</sup>) preprocessing time and space and <i>O</i>(<i>n</i><sup>1/3</sup>) query time.</p>","PeriodicalId":50922,"journal":{"name":"ACM Transactions on Algorithms","volume":"8 2","pages":""},"PeriodicalIF":1.3,"publicationDate":"2023-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138494891","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Greedy Spanners in Euclidean Spaces Admit Sublinear Separators","authors":"Hung Le, Cuong Than","doi":"https://dl.acm.org/doi/10.1145/3590771","DOIUrl":"https://doi.org/https://dl.acm.org/doi/10.1145/3590771","url":null,"abstract":"<p>The greedy spanner in a low dimensional Euclidean space is a fundamental geometric construction that has been extensively studied over three decades as it possesses the two most basic properties of a good spanner: constant maximum degree and constant lightness. Recently, Eppstein and Khodabandeh [28] showed that the greedy spanner in ℝ<sup>2</sup> admits a sublinear separator in a strong sense: any subgraph of <i>k</i> vertices of the greedy spanner in ℝ<sup>2</sup> has a separator of size <i>O</i>(√<i>k</i>). Their technique is inherently planar and is not extensible to higher dimensions. They left showing the existence of a small separator for the greedy spanner in ℝ<sup><i>d</i></sup> for any constant <i>d</i> ≥ 3 as an open problem. </p><p>In this paper, we resolve the problem of Eppstein and Khodabandeh [28] by showing that any subgraph of <i>k</i> vertices of the greedy spanner in ℝ<sup><i>d</i></sup> has a separator of size <i>O</i>(<i>k</i><sup>1 − 1/<i>d</i></sup>). We introduce a new technique that gives a simple criterion for any geometric graph to have a sublinear separator that we dub <i><i>τ</i>-lanky</i>: a geometric graph is <i>τ</i>-lanky if any ball of radius <i>r</i> cuts at most <i>τ</i> edges of length at least <i>r</i> in the graph. We show that any <i>τ</i>-lanky geometric graph of <i>n</i> vertices in ℝ<sup><i>d</i></sup> has a separator of size <i>O</i>(<i>τn</i><sup>1 − 1/<i>d</i></sup>). We then derive our main result by showing that the greedy spanner is <i>O</i>(1)-lanky. We indeed obtain a more general result that applies to unit ball graphs and point sets of low fractal dimensions in ℝ<sup><i>d</i></sup>. </p><p>Our technique naturally extends to doubling metrics. 
We use the <i>τ</i>-lanky criterion to show that there exists a (1 + ϵ)-spanner for doubling metrics of dimension <i>d</i> with a constant maximum degree and a separator of size <i>O</i>(<i>n</i><sup>1 − 1/<i>d</i></sup>); this result resolves an open problem posed by Abam and Har-Peled [1] a decade ago. We then introduce another simple criterion for a graph in doubling metrics of dimension <i>d</i> to have a sublinear separator. We use the new criterion to show that the greedy spanner of an <i>n</i>-point metric space of doubling dimension <i>d</i> has a separator of size <i>O</i>(<i>n</i><sup>1 − 1/<i>d</i></sup> + log <i>Δ</i>) where <i>Δ</i> is the spread of the metric; the factor log (<i>Δ</i>) is tightly connected to the fact that, unlike its Euclidean counterpart, the greedy spanner in doubling metrics has <i>unbounded maximum degree</i>. Finally, we discuss algorithmic implications of our results.</p>","PeriodicalId":50922,"journal":{"name":"ACM Transactions on Algorithms","volume":"16 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2023-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138516998","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
S. Bouchard, Yoann Dieudonné, Arnaud Labourel, A. Pelc
{"title":"Almost-Optimal Deterministic Treasure Hunt in Unweighted Graphs","authors":"S. Bouchard, Yoann Dieudonné, Arnaud Labourel, A. Pelc","doi":"10.1145/3588437","DOIUrl":"https://doi.org/10.1145/3588437","url":null,"abstract":"A mobile agent navigating along edges of a simple connected unweighted graph, either finite or countably infinite, has to find an inert target (treasure) hidden in one of the nodes. This task is known as treasure hunt. The agent has no a priori knowledge of the graph, of the location of the treasure, or of the initial distance to it. The cost of a treasure hunt algorithm is the worst-case number of edge traversals performed by the agent until finding the treasure. Awerbuch et al. [3] considered graph exploration and treasure hunt for finite graphs in a restricted model where the agent has a fuel tank that can be replenished only at the starting node s. The size of the tank is B = 2 (1+α) r, for some positive real constant α, where r, called the radius of the graph, is the maximum distance from s to any other node. The tank of size B allows the agent to make at most ⌊ B ⌋ edge traversals between two consecutive visits at node s. Let e(d) be the number of edges whose at least one endpoint is at distance less than d from s. Awerbuch et al. [3] conjectured that it is impossible to find a treasure hidden in a node at distance at most d at cost nearly linear in e(d). We first design a deterministic treasure hunt algorithm working in the model without any restrictions on the moves of the agent at cost 𝒪(e(d) log d) and then show how to modify this algorithm to work in the model from Awerbuch et al. [3] with the same complexity. Thus, we refute the preceding 20-year-old conjecture. 
We observe that no treasure hunt algorithm can beat cost Θ (e(d)) for all graphs, and thus our algorithms are also almost optimal.","PeriodicalId":50922,"journal":{"name":"ACM Transactions on Algorithms","volume":"19 1","pages":"1 - 32"},"PeriodicalIF":1.3,"publicationDate":"2023-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46965011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Massimo Equi, R. Grossi, V. Mäkinen, Alexandru I. Tomescu
{"title":"On the Complexity of String Matching for Graphs","authors":"Massimo Equi, R. Grossi, V. Mäkinen, Alexandru I. Tomescu","doi":"10.1145/3588334","DOIUrl":"https://doi.org/10.1145/3588334","url":null,"abstract":"Exact string matching in labeled graphs is the problem of searching paths of a graph G=(V, E) such that the concatenation of their node labels is equal to a given pattern string P[1..m]. This basic problem can be found at the heart of more complex operations on variation graphs in computational biology, of query operations in graph databases, and of analysis operations in heterogeneous networks. We prove a conditional lower bound stating that, for any constant ε > 0, an O(|E|^{1 - ε} m) time, or an O(|E| m^{1 - ε}) time algorithm for exact string matching in graphs, with node labels and pattern drawn from a binary alphabet, cannot be achieved unless the Strong Exponential Time Hypothesis (SETH) is false. This holds even if restricted to undirected graphs with maximum node degree 2—that is, to zig-zag matching in bidirectional strings, or to deterministic directed acyclic graphs whose nodes have maximum sum of indegree and outdegree 3. These restricted cases make the lower bound stricter than what can be directly derived from related bounds on regular expression matching (Backurs and Indyk, FOCS’16). In fact, our bounds are tight in the sense that lowering the degree or the alphabet size yields linear time solvable problems. An interesting corollary is that exact and approximate matching are equally hard (i.e., quadratic time) in graphs under SETH. 
In comparison, the same problems restricted to strings have linear time vs quadratic time solutions, respectively (approximate pattern matching having also a matching SETH lower bound (Backurs and Indyk, STOC’15)).","PeriodicalId":50922,"journal":{"name":"ACM Transactions on Algorithms","volume":"19 1","pages":"1 - 25"},"PeriodicalIF":1.3,"publicationDate":"2023-03-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47712700","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A PTAS for Capacitated Vehicle Routing on Trees","authors":"Claire Mathieu, Hang Zhou","doi":"https://dl.acm.org/doi/10.1145/3575799","DOIUrl":"https://doi.org/https://dl.acm.org/doi/10.1145/3575799","url":null,"abstract":"<p>We give a polynomial time approximation scheme (PTAS) for the unit demand capacitated vehicle routing problem (CVRP) on trees, for the entire range of the tour capacity. The result extends to the splittable CVRP.</p>","PeriodicalId":50922,"journal":{"name":"ACM Transactions on Algorithms","volume":"8 5","pages":""},"PeriodicalIF":1.3,"publicationDate":"2023-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138494888","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Carla Groenland, Gwenaël Joret, Wojciech Nadara, Bartosz Walczak
{"title":"Approximating Pathwidth for Graphs of Small Treewidth","authors":"Carla Groenland, Gwenaël Joret, Wojciech Nadara, Bartosz Walczak","doi":"https://dl.acm.org/doi/10.1145/3576044","DOIUrl":"https://doi.org/https://dl.acm.org/doi/10.1145/3576044","url":null,"abstract":"<p>We describe a polynomial-time algorithm which, given a graph <i>G</i> with treewidth <i>t</i>, approximates the pathwidth of <i>G</i> to within a ratio of <i>O</i>(<i>t</i>√(log <i>t</i>)). This is the first algorithm to achieve an <i>f(t)</i>-approximation for some function <i>f</i>.</p><p>Our approach builds on the following key insight: every graph with large pathwidth has large treewidth or contains a subdivision of a large complete binary tree. Specifically, we show that every graph with pathwidth at least <i>th</i>+2 has treewidth at least <i>t</i> or contains a subdivision of a complete binary tree of height <i>h</i>+1. The bound <i>th</i>+2 is best possible up to a multiplicative constant. This result was motivated by, and implies (with <i>c</i>=2), the following conjecture of Kawarabayashi and Rossman (SODA’18): there exists a universal constant <i>c</i> such that every graph with pathwidth Ω(<i>k</i><sup>c</sup>) has treewidth at least <i>k</i> or contains a subdivision of a complete binary tree of height <i>k</i>.</p><p>Our main technical algorithm takes a graph <i>G</i> and some (not necessarily optimal) tree decomposition of <i>G</i> of width <i>t</i>′ in the input, and it computes in polynomial time an integer <i>h</i>, a certificate that <i>G</i> has pathwidth at least <i>h</i>, and a path decomposition of <i>G</i> of width at most (<i>t</i>′+1)<i>h</i>+1. The certificate is closely related to (and implies) the existence of a subdivision of a complete binary tree of height <i>h</i>. 
The approximation algorithm for pathwidth is then obtained by combining this algorithm with the approximation algorithm of Feige, Hajiaghayi, and Lee (STOC’05) for treewidth.</p>","PeriodicalId":50922,"journal":{"name":"ACM Transactions on Algorithms","volume":"8 4","pages":""},"PeriodicalIF":1.3,"publicationDate":"2023-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138494889","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Universal Algorithms for Clustering Problems","authors":"Arun Ganesh, Bruce M. Maggs, Debmalya Panigrahi","doi":"https://dl.acm.org/doi/10.1145/3572840","DOIUrl":"https://doi.org/https://dl.acm.org/doi/10.1145/3572840","url":null,"abstract":"<p>This article presents <i>universal</i> algorithms for clustering problems, including the widely studied <i>k</i>-median, <i>k</i>-means, and <i>k</i>-center objectives. The input is a metric space containing all <i>potential</i> client locations. The algorithm must select <i>k</i> cluster centers such that they are a good solution for <i>any</i> subset of clients that actually realize. Specifically, we aim for low <i>regret</i>, defined as the maximum over all subsets of the difference between the cost of the algorithm’s solution and that of an optimal solution. A universal algorithm’s solution <span>Sol</span> for a clustering problem is said to be an (α, β)-approximation if for all subsets of clients <i>C</i>′, it satisfies <span>sol</span>(<i>C</i>′) ≤ α · <span>opt</span>(<i>C</i>′) + β · <span>mr</span>, where <span>opt</span>(<i>C</i>′) is the cost of the optimal solution for clients <i>C</i>′ and <span>mr</span> is the minimum regret achievable by any solution.</p><p>Our main results are universal algorithms for the standard clustering objectives of <i>k</i>-median, <i>k</i>-means, and <i>k</i>-center that achieve (<i>O</i>(1), <i>O</i>(1))-approximations. These results are obtained via a novel framework for universal algorithms using linear programming (LP) relaxations. These results generalize to other ℓ<i><sub>p</sub></i>-objectives and the setting where some subset of the clients are <i>fixed</i>. We also give hardness results showing that (α, β)-approximation is NP-hard if α or β is at most a certain constant, even for the widely studied special case of Euclidean metric spaces. 
This shows that in some sense, (<i>O</i>(1), <i>O</i>(1))-approximation is the strongest type of guarantee obtainable for universal clustering.</p>","PeriodicalId":50922,"journal":{"name":"ACM Transactions on Algorithms","volume":"8 7","pages":""},"PeriodicalIF":1.3,"publicationDate":"2023-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138494886","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust Algorithms for TSP and Steiner Tree","authors":"Arun Ganesh, Bruce M. Maggs, Debmalya Panigrahi","doi":"https://dl.acm.org/doi/10.1145/3570957","DOIUrl":"https://doi.org/https://dl.acm.org/doi/10.1145/3570957","url":null,"abstract":"<p>Robust optimization is a widely studied area in operations research, where the algorithm takes as input a range of values and outputs a single solution that performs well for the entire range. Specifically, a robust algorithm aims to minimize <i>regret</i>, defined as the maximum difference between the solution’s cost and that of an optimal solution in hindsight once the input has been realized. For graph problems in <b>P</b>, such as shortest path and minimum spanning tree, robust polynomial-time algorithms that obtain a constant approximation on regret are known. In this paper, we study robust algorithms for minimizing regret in <b>NP</b>-hard graph optimization problems, and give constant approximations on regret for the classical traveling salesman and Steiner tree problems.</p>","PeriodicalId":50922,"journal":{"name":"ACM Transactions on Algorithms","volume":"8 6","pages":""},"PeriodicalIF":1.3,"publicationDate":"2023-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138494887","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}