Simpler Optimal Sorting from a Directed Acyclic Graph

arXiv - CS - Data Structures and Algorithms Pub Date : 2024-07-31 DOI:arxiv-2407.21591

Ivor van der Hoog, Eva Rotenberg, Daniel Rutschmann


Citations: 0

Abstract

In 1976, Fredman proposed the following algorithmic problem: we are given a ground set $X$, a partial order $P$ over $X$, and a comparison oracle $O_L$ that specifies a linear order $L$ over $X$ extending $P$. A query to $O_L$ takes distinct $x, x' \in X$ as input and outputs whether $x <_L x'$ or vice versa. If we denote by $e(P)$ the number of linear extensions of $P$, then $\log e(P)$ is a worst-case lower bound on the number of queries needed to output the sorted order of $X$. Fredman did not specify in what form the partial order is given. Haeupler, Hladík, Iacono, Rozhoň, Tarjan, and Tětek ('24) propose to assume as input a directed acyclic graph $G$ with $m$ edges and $n = |X|$ vertices. Denote by $P_G$ the partial order induced by $G$. Algorithmic performance is measured in running time and in the number of queries used; their algorithm uses $\Theta(m + n + \log e(P_G))$ time and $\Theta(\log e(P_G))$ queries to output $X$ in sorted order, which is worst-case optimal in both running time and queries. Their algorithm combines topological sorting with heapsort and uses sophisticated data structures (including a recent type of heap with a working-set bound). Their analysis relies on sophisticated counting arguments using entropy, recursively defined sets defined over the run of their algorithm, and vertices in the graph that they identify as bottlenecks for sorting. In this paper, we do away with sophistication. We show that when the input is a directed acyclic graph, the problem admits a simple solution using $\Theta(m + n + \log e(P_G))$ time and $\Theta(\log e(P_G))$ queries. In particular, our proofs are much simpler, as we avoid advanced charging arguments and data structures and instead rely on two brief observations.
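To make the quantity $e(P_G)$ concrete, the following is a minimal sketch that counts linear extensions of a small DAG by brute force and reports the resulting lower bound on queries. It is not the algorithm from the paper or from Haeupler et al.; the function names `count_linear_extensions` and `query_lower_bound` are hypothetical, the logarithm is assumed to be base 2 (the usual convention for comparison lower bounds), and the enumeration takes $n!$ time, so it is only suitable for tiny examples.

```python
from itertools import permutations
from math import ceil, log2


def count_linear_extensions(n, edges):
    """Brute-force e(P_G) for a DAG on vertices 0..n-1.

    An edge (u, v) means u must precede v in every linear extension.
    Illustration only: this enumerates all n! permutations.
    """
    count = 0
    for perm in permutations(range(n)):
        # Position of each vertex in the candidate linear order.
        pos = {v: i for i, v in enumerate(perm)}
        if all(pos[u] < pos[v] for u, v in edges):
            count += 1
    return count


def query_lower_bound(n, edges):
    """Worst-case lower bound ceil(log2 e(P_G)) on the number of queries."""
    return ceil(log2(count_linear_extensions(n, edges)))


if __name__ == "__main__":
    chain = [(0, 1), (1, 2), (2, 3)]            # total order already known
    diamond = [(0, 1), (0, 2), (1, 3), (2, 3)]  # only 1 vs 2 is undecided
    no_edges = []                               # nothing known in advance

    for name, edges in [("chain", chain), ("diamond", diamond), ("empty", no_edges)]:
        e = count_linear_extensions(4, edges)
        print(name, "e(P_G) =", e, "lower bound =", query_lower_bound(4, edges))
```

On four vertices, a chain has a single linear extension and needs no queries, the diamond has two extensions and needs at least one query, and the edgeless graph has $4! = 24$ extensions, which recovers the familiar $\lceil \log_2 n! \rceil$ bound for ordinary comparison sorting.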

