{"title":"Bounds on the QAC0 Complexity of Approximating Parity","authors":"Gregory Rosenthal","doi":"10.4230/LIPIcs.ITCS.2021.32","DOIUrl":"https://doi.org/10.4230/LIPIcs.ITCS.2021.32","url":null,"abstract":"QAC circuits are quantum circuits with one-qubit gates and Toffoli gates of arbitrary arity. QAC$^0$ circuits are QAC circuits of constant depth, and are quantum analogues of AC$^0$ circuits. We prove the following: $\\bullet$ For all $d \\ge 7$ and $\\varepsilon > 0$ there is a depth-$d$ QAC circuit of size $\\exp(\\mathrm{poly}(n^{1/d}) \\log(n/\\varepsilon))$ that approximates the $n$-qubit parity function to within error $\\varepsilon$ on worst-case quantum inputs. Previously it was unknown whether QAC circuits of sublogarithmic depth could approximate parity regardless of size. $\\bullet$ We introduce a class of \"mostly classical\" QAC circuits, including a major component of our circuit from the above upper bound, and prove a tight lower bound on the size of low-depth, mostly classical QAC circuits that approximate this component. $\\bullet$ Arbitrary depth-$d$ QAC circuits require at least $\\Omega(n/d)$ multi-qubit gates to achieve a $1/2 + \\exp(-o(n/d))$ approximation of parity. When $d = \\Theta(\\log n)$ this nearly matches an easy $O(n)$ size upper bound for computing parity exactly. $\\bullet$ QAC circuits with at most two layers of multi-qubit gates cannot achieve a $1/2 + \\exp(-o(n))$ approximation of parity, even non-cleanly. Previously it was known only that such circuits could not cleanly compute parity exactly for sufficiently large $n$. 
The proofs use a new normal form for quantum circuits which may be of independent interest, and are based on reductions to the problem of constructing certain generalizations of the cat state which we name \"nekomata\" after an analogous cat yōkai.","PeriodicalId":123734,"journal":{"name":"Information Technology Convergence and Services","volume":"100 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125462654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
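The "easy $O(n)$ size upper bound for computing parity exactly" mentioned in the abstract is a ladder of $n$ CNOT gates XORing each input qubit into an ancilla; on computational-basis states this reduces to classical XOR, as in this minimal sketch (function name ours, not from the paper):

```python
def parity_ladder(bits):
    """Ladder of n CNOTs: XOR each input (basis-state) qubit into an ancilla.
    Uses n two-qubit gates and computes parity exactly on basis states."""
    anc = 0
    for b in bits:
        anc ^= b  # CNOT(control=input qubit, target=ancilla) on basis states
    return anc
```

The paper's question is how well much *shallower* (constant-depth) circuits can approximate this function; the ladder above has depth $n$.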
{"title":"Simple Heuristics Yield Provable Algorithms for Masked Low-Rank Approximation","authors":"Cameron Musco, C. Musco, David P. Woodruff","doi":"10.4230/LIPIcs.ITCS.2021.6","DOIUrl":"https://doi.org/10.4230/LIPIcs.ITCS.2021.6","url":null,"abstract":"In \"masked low-rank approximation\", one is given $A \\in \\mathbb{R}^{n \\times n}$ and a binary mask matrix $W \\in \\{0,1\\}^{n \\times n}$. The goal is to find a rank-$k$ matrix $L$ for which: $$\\mathrm{cost}(L) = \\sum_{i=1}^{n} \\sum_{j=1}^{n} W_{i,j} \\cdot (A_{i,j} - L_{i,j})^2 \\leq OPT + \\epsilon \\|A\\|_F^2,$$ where $OPT = \\min_{\\mathrm{rank}\\text{-}k\\ \\hat{L}} \\mathrm{cost}(\\hat{L})$ and $\\epsilon$ is a given error parameter. Depending on the choice of $W$, this problem captures factor analysis, low-rank plus diagonal decomposition, robust PCA, low-rank matrix completion, low-rank plus block matrix approximation, and many other problems. Many of these problems are NP-hard, and while some algorithms with provable guarantees are known, they either 1) run in time $n^{\\Omega(k^2/\\epsilon)}$ or 2) make strong assumptions, e.g., that $A$ is incoherent or that $W$ is random. In this work, we show that a common polynomial-time heuristic, which simply sets $A$ to $0$ where $W$ is $0$ and then finds a standard low-rank approximation, yields bicriteria approximation guarantees for this problem. In particular, for rank $k' > k$ depending on the \"public coin partition number\" of $W$, the heuristic outputs a rank-$k'$ matrix $L$ with $\\mathrm{cost}(L) \\leq OPT + \\epsilon \\|A\\|_F^2$. This partition number is in turn bounded by the \"randomized communication complexity\" of $W$, when interpreted as a two-player communication matrix. For many important examples of masked low-rank approximation, including all those listed above, this result yields bicriteria approximation guarantees with $k' = k \\cdot \\mathrm{poly}(\\log n/\\epsilon)$. Further, we show that different models of communication yield algorithms for natural variants of masked low-rank approximation. 
For example, multi-player number-in-hand communication complexity connects to masked tensor decomposition and non-deterministic communication complexity to masked Boolean low-rank factorization.","PeriodicalId":123734,"journal":{"name":"Information Technology Convergence and Services","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128201460","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
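The heuristic the abstract analyzes is simple enough to state in a few lines; this sketch (names ours, not the authors') zero-fills the masked entries and then takes a truncated SVD:

```python
import numpy as np

def masked_lowrank_heuristic(A, W, k):
    """Zero-fill heuristic: zero out entries of A where the mask W is 0,
    then take a standard rank-k approximation of the result via SVD."""
    A_masked = A * W                       # set A to 0 wherever W is 0
    U, s, Vt = np.linalg.svd(A_masked, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k]     # best rank-k approx of A_masked

def masked_cost(A, W, L):
    """Masked squared error: sum over (i,j) of W_ij * (A_ij - L_ij)^2."""
    return float(np.sum(W * (A - L) ** 2))
```

The paper's point is that this output, taken at a somewhat larger rank $k'$, provably competes with the best rank-$k$ solution under the masked objective.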
{"title":"A Model for Ant Trail Formation and its Convergence Properties","authors":"M. Charikar, Shivam Garg, D. Gordon, Kirankumar Shiragur","doi":"10.4230/LIPIcs.ITCS.2021.85","DOIUrl":"https://doi.org/10.4230/LIPIcs.ITCS.2021.85","url":null,"abstract":"We introduce a model for ant trail formation, building upon previous work on biologically feasible local algorithms that plausibly describe how ants maintain trail networks. The model is a variant of a reinforced random walk on a directed graph, where ants lay pheromone on edges as they traverse them and the next edge to traverse is chosen based on the pheromone level; this pheromone decays with time. There is a bidirectional flow of ants: the forward flow proceeds along forward edges from source (e.g. the nest) to sink (e.g. a food source), and the backward flow in the opposite direction. Some fraction of ants are lost as they pass through each node (modeling the loss of ants due to exploration). We initiate a theoretical study of this model. We first consider the linear decision rule, where the flow divides itself among the next set of edges in proportion to their pheromone level. Here, we show that the process converges to the path with minimum leakage when the forward and backward flows do not change over time. When the forward and backward flows increase over time (caused by positive reinforcement from the discovery of a food source, for example), we show that the process converges to the shortest path. These results are for graphs consisting of two parallel paths (a case that has been investigated before in experiments). Through simulations, we show that these results hold for more general graphs drawn from various random graph models. Further, we consider a general family of decision rules, and show that there is no advantage of using a non-linear rule from this family, if the goal is to find the shortest or the minimum leakage path. 
We also show that bidirectional flow is necessary for convergence to such paths. Our results provide a plausible explanation for field observations, and open up new avenues for further theoretical and experimental investigation.","PeriodicalId":123734,"journal":{"name":"Information Technology Convergence and Services","volume":"149 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132316048","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
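The linear decision rule on two parallel paths can be captured in a toy simulation; this sketch (parameters and survival probabilities illustrative, not from the paper) splits a constant flow in proportion to pheromone, reinforces each path by its surviving flow, and lets pheromone decay:

```python
def simulate_trail(survival=(0.9, 0.5), decay=0.1, flow=1.0, steps=500):
    """Toy model of the linear decision rule on two parallel nest-to-food
    paths. Flow splits in proportion to pheromone; each path's surviving
    flow reinforces its pheromone, and pheromone decays every step."""
    p = [1.0, 1.0]  # initial pheromone on each path
    for _ in range(steps):
        total = p[0] + p[1]
        shares = [pi / total for pi in p]          # linear decision rule
        p = [(1 - decay) * p[i] + flow * shares[i] * survival[i]
             for i in range(2)]
    return p[0] / (p[0] + p[1])  # pheromone share of the low-leakage path
```

Consistent with the abstract's first result, with constant flow the share of the low-leakage path tends to 1.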
{"title":"How to Sell Information Optimally: an Algorithmic Study","authors":"Yang Cai, Grigoris Velegkas","doi":"10.4230/LIPIcs.ITCS.2021.81","DOIUrl":"https://doi.org/10.4230/LIPIcs.ITCS.2021.81","url":null,"abstract":"We investigate the algorithmic problem of selling information to agents who face a decision-making problem under uncertainty. We adopt the model recently proposed by Bergemann et al. [BBS18], in which information is revealed through signaling schemes called experiments. In the single-agent setting, any mechanism can be represented as a menu of experiments. Our results show that the computational complexity of designing the revenue-optimal menu depends heavily on the way the model is specified. When all the parameters of the problem are given explicitly, we provide a polynomial time algorithm that computes the revenue-optimal menu. For cases where the model is specified with a succinct implicit description, we show that the tractability of the problem is tightly related to the efficient implementation of a Best Response Oracle: when it can be implemented efficiently, we provide an additive FPTAS whose running time is independent of the number of actions. On the other hand, we provide a family of problems, where it is computationally intractable to construct a best response oracle, and we show that it is NP-hard to get even a constant fraction of the optimal revenue. Moreover, we investigate a generalization of the original model by Bergemann et al. [BBS18] that allows multiple agents to compete for useful information. We leverage techniques developed in the study of auction design (see e.g. 
[CDW12a], [AFHHM12], [CDW12b], [CDW13a], [CDW13b]) to design a polynomial time algorithm that computes the revenue-optimal mechanism for selling information.","PeriodicalId":123734,"journal":{"name":"Information Technology Convergence and Services","volume":"329 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133549613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic Inference in Probabilistic Graphical Models","authors":"Weiming Feng, Kun He, Xiaoming Sun, Yitong Yin","doi":"10.4230/LIPIcs.ITCS.2021.25","DOIUrl":"https://doi.org/10.4230/LIPIcs.ITCS.2021.25","url":null,"abstract":"Probabilistic graphical models, such as Markov random fields (MRFs), are useful for describing high-dimensional distributions in terms of local dependence structures. Probabilistic inference is a fundamental problem related to graphical models, and sampling is a main approach for the problem. In this paper, we study probabilistic inference problems when the graphical model itself is changing dynamically with time. Such dynamic inference problems arise naturally in today's applications, e.g. multivariate time-series data analysis and practical learning procedures. We give a dynamic algorithm for sampling-based probabilistic inference in MRFs, where each dynamic update can change the underlying graph and all parameters of the MRF simultaneously, as long as the total amount of changes is bounded. More precisely, suppose that the MRF has $n$ variables and polylogarithmically bounded maximum degree, and $N(n)$ independent samples are sufficient for the inference for a polynomial function $N(\\cdot)$. Our algorithm dynamically maintains an answer to the inference problem using $\\widetilde{O}(n N(n))$ space cost, and $\\widetilde{O}(N(n) + n)$ incremental time cost upon each update to the MRF, as long as the well-known Dobrushin-Shlosman condition is satisfied by the MRFs. Compared to the static case, which requires $\\Omega(n N(n))$ time cost for redrawing all $N(n)$ samples whenever the MRF changes, our dynamic algorithm gives a $\\widetilde{\\Omega}(\\min\\{n, N(n)\\})$-factor speedup. Our approach relies on a novel dynamic sampling technique, which transforms local Markov chains (a.k.a. 
single-site dynamics) to dynamic sampling algorithms, and an \"algorithmic Lipschitz\" condition that we establish for sampling from graphical models, namely, when the MRF changes by a small difference, samples can be modified to reflect the new distribution, with cost proportional to the difference in the MRF.","PeriodicalId":123734,"journal":{"name":"Information Technology Convergence and Services","volume":"370 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122764892","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
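The "single-site dynamics" the abstract builds on can be made concrete for the Ising model, a standard MRF; this is a generic textbook Glauber update (not the paper's dynamic algorithm): pick a vertex and resample its spin conditioned on its neighbors.

```python
import math
import random

def glauber_step(state, neighbors, beta, rng):
    """One single-site update (Glauber dynamics) for an Ising MRF:
    pick a uniformly random vertex v and resample its spin from the
    conditional distribution given the current spins of its neighbors."""
    v = rng.randrange(len(state))
    field = sum(state[u] for u in neighbors[v])      # local field at v
    p_plus = 1.0 / (1.0 + math.exp(-2.0 * beta * field))
    state[v] = 1 if rng.random() < p_plus else -1
```

Under conditions such as Dobrushin-Shlosman, repeating this step mixes rapidly; the paper's contribution is maintaining such samples as the MRF itself changes.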
{"title":"Tight Hardness Results for Training Depth-2 ReLU Networks","authors":"Surbhi Goel, Adam R. Klivans, Pasin Manurangsi, Daniel Reichman","doi":"10.4230/LIPIcs.ITCS.2021.22","DOIUrl":"https://doi.org/10.4230/LIPIcs.ITCS.2021.22","url":null,"abstract":"We prove several hardness results for training depth-2 neural networks with the ReLU activation function; these networks are simply weighted sums (that may include negative coefficients) of ReLUs. Our goal is to output a depth-2 neural network that minimizes the square loss with respect to a given training set. We prove that this problem is NP-hard already for a network with a single ReLU. We also prove NP-hardness for outputting a weighted sum of $k$ ReLUs minimizing the squared error (for $k>1$) even in the realizable setting (i.e., when the labels are consistent with an unknown depth-2 ReLU network). We are also able to obtain lower bounds on the running time in terms of the desired additive error $\\epsilon$. To obtain our lower bounds, we use the Gap Exponential Time Hypothesis (Gap-ETH) as well as a new hypothesis regarding the hardness of approximating the well-known Densest $k$-Subgraph problem in subexponential time (these hypotheses are used separately in proving different lower bounds). For example, we prove that under reasonable hardness assumptions, any proper learning algorithm for finding the best fitting ReLU must run in time exponential in $1/\\epsilon^2$. Together with a previous work regarding improperly learning a ReLU (Goel et al., COLT'17), this implies the first separation between proper and improper algorithms for learning a ReLU. We also study the problem of properly learning a depth-2 network of ReLUs with bounded weights, giving new (worst-case) upper bounds on the running time needed to learn such networks in both the realizable and agnostic settings. 
Our upper bounds on the running time essentially match our lower bounds in terms of the dependency on $\\epsilon$.","PeriodicalId":123734,"journal":{"name":"Information Technology Convergence and Services","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132567707","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
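The objective whose minimization the abstract shows to be hard is easy to state in code; a sketch with illustrative names (a depth-2 network is a weighted sum of ReLUs, and training means minimizing squared error over a sample):

```python
def relu_net(weights, coeffs, x):
    """Depth-2 ReLU network: a weighted sum (coefficients may be
    negative) of ReLUs, f(x) = sum_i a_i * max(0, <w_i, x>)."""
    return sum(a * max(0.0, sum(w_j * x_j for w_j, x_j in zip(w, x)))
               for a, w in zip(coeffs, weights))

def square_loss(weights, coeffs, data):
    """Training objective: squared error over the sample (x, y) pairs.
    Minimizing this is NP-hard already for a single ReLU."""
    return sum((relu_net(weights, coeffs, x) - y) ** 2 for x, y in data)
```

In the realizable setting the labels come from some such network, so the optimum of `square_loss` is zero; the hardness results say finding it is still intractable.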
{"title":"Online Paging with a Vanishing Regret","authors":"Y. Emek, S. Kutten, Yangguang Shi","doi":"10.4230/LIPIcs.ITCS.2021.67","DOIUrl":"https://doi.org/10.4230/LIPIcs.ITCS.2021.67","url":null,"abstract":"This paper considers a variant of the online paging problem, where the online algorithm has access to multiple predictors, each producing a sequence of predictions for the page arrival times. The predictors may have occasional prediction errors and it is assumed that at least one of them makes a sublinear number of prediction errors in total. Our main result states that this assumption suffices for the design of a randomized online algorithm whose time-average regret with respect to the optimal offline algorithm tends to zero as the time tends to infinity. This holds (with different regret bounds) for both the full information access model, where in each round, the online algorithm gets the predictions of all predictors, and the bandit access model, where in each round, the online algorithm queries a single predictor. While online algorithms that exploit inaccurate predictions have been a topic of growing interest in the last few years, to the best of our knowledge, this is the first paper that studies this topic in the context of multiple predictors. 
Moreover, to the best of our knowledge, this is also the first paper that aims for (and achieves) online algorithms with a vanishing regret for a classic online problem under reasonable assumptions.","PeriodicalId":123734,"journal":{"name":"Information Technology Convergence and Services","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128647489","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Relative Lipschitzness in Extragradient Methods and a Direct Recipe for Acceleration","authors":"Michael B. Cohen, Aaron Sidford, Kevin Tian","doi":"10.4230/LIPIcs.ITCS.2021.62","DOIUrl":"https://doi.org/10.4230/LIPIcs.ITCS.2021.62","url":null,"abstract":"We show that standard extragradient methods (i.e. mirror prox and dual extrapolation) recover optimal accelerated rates for first-order minimization of smooth convex functions. To obtain this result we provide a fine-grained characterization of the convergence rates of extragradient methods for solving monotone variational inequalities in terms of a natural condition we call relative Lipschitzness. We further generalize this framework to handle local and randomized notions of relative Lipschitzness and thereby recover rates for box-constrained $\\ell_\\infty$ regression based on area convexity and complexity bounds achieved by accelerated (randomized) coordinate descent for smooth convex function minimization.","PeriodicalId":123734,"journal":{"name":"Information Technology Convergence and Services","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128443478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
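A Euclidean instance of the extragradient template studied above (mirror prox with the 2-norm setup; step size and test operator are illustrative, not from the paper): take an extrapolation step, then update using the operator evaluated at the extrapolated point. On the monotone operator $F(x,y) = (y, -x)$ of the bilinear saddle point $\min_x \max_y xy$, this converges to the solution $(0,0)$, whereas plain gradient descent-ascent spirals outward.

```python
def extragradient(F, z0, step, iters):
    """Euclidean extragradient / mirror prox:
      z_half = z - step * F(z)        (extrapolation step)
      z_new  = z - step * F(z_half)   (update with the operator at z_half)"""
    z = list(z0)
    for _ in range(iters):
        g = F(z)
        z_half = [zi - step * gi for zi, gi in zip(z, g)]
        g_half = F(z_half)
        z = [zi - step * gi for zi, gi in zip(z, g_half)]
    return z
```

The extra operator evaluation per iteration is what buys convergence on such rotational (bilinear) operators.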
{"title":"The Strongish Planted Clique Hypothesis and Its Consequences","authors":"Pasin Manurangsi, A. Rubinstein, T. Schramm","doi":"10.4230/LIPIcs.ITCS.2021.10","DOIUrl":"https://doi.org/10.4230/LIPIcs.ITCS.2021.10","url":null,"abstract":"We formulate a new hardness assumption, the Strongish Planted Clique Hypothesis (SPCH), which postulates that any algorithm for planted clique must run in time $n^{\\Omega(\\log n)}$ (so that the state-of-the-art running time of $n^{O(\\log n)}$ is optimal up to a constant in the exponent). We provide two sets of applications of the new hypothesis. First, we show that SPCH implies (nearly) tight inapproximability results for the following well-studied problems in terms of the parameter $k$: Densest $k$-Subgraph, Smallest $k$-Edge Subgraph, Densest $k$-Subhypergraph, Steiner $k$-Forest, and Directed Steiner Network with $k$ terminal pairs. For example, we show, under SPCH, that no polynomial time algorithm achieves $o(k)$-approximation for Densest $k$-Subgraph. This inapproximability ratio improves upon the previous best $k^{o(1)}$ factor from (Chalermsook et al., FOCS 2017). Furthermore, our lower bounds hold even against fixed-parameter tractable algorithms with parameter $k$. Our second application focuses on the complexity of graph pattern detection. 
For both induced and non-induced graph pattern detection, we prove hardness results under SPCH, which improve the running time lower bounds obtained by (Dalirrooyfard et al., STOC 2019) under the Exponential Time Hypothesis.","PeriodicalId":123734,"journal":{"name":"Information Technology Convergence and Services","volume":"102 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115693261","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Combinatorial Cut-Toggling Algorithm for Solving Laplacian Linear Systems","authors":"M. Henzinger, Billy Jin, Richard Peng, David P. Williamson","doi":"10.4230/LIPIcs.ITCS.2023.69","DOIUrl":"https://doi.org/10.4230/LIPIcs.ITCS.2023.69","url":null,"abstract":"Over the last two decades, a significant line of work in theoretical algorithms has made progress in solving linear systems whose coefficient matrix is the Laplacian matrix of a weighted graph. The solution of the linear system can be interpreted as the potentials of an electrical flow. Kelner, Orecchia, Sidford, and Zhu (STOC 2013) give a combinatorial, near-linear time algorithm that maintains the Kirchhoff Current Law, and gradually enforces the Kirchhoff Potential Law by updating flows around cycles (cycle toggling). In this paper, we consider a dual version of the algorithm that maintains the Kirchhoff Potential Law, and gradually enforces the Kirchhoff Current Law by cut toggling: each iteration updates all potentials on one side of a fundamental cut of a spanning tree by the same amount. We prove that this dual algorithm also runs in a near-linear number of iterations. We show, however, that if we abstract cut toggling as a natural data structure problem, this problem can be reduced to the online vector-matrix-vector problem, which has been conjectured to be difficult for dynamic algorithms by Henzinger, Krinninger, Nanongkai, and Saranurak (STOC 2015). The conjecture implies that a straightforward implementation of the cut-toggling algorithm requires essentially linear time per iteration. To circumvent the lower bound, we batch update steps, and perform them simultaneously instead of sequentially. An appropriate choice of batching leads to an $\\tilde{O}(m^{1.5})$ time cut-toggling algorithm for solving Laplacian systems. 
Furthermore, if we sparsify the graph and call our algorithm recursively on the Laplacian system implied by batching and sparsifying, the running time can be reduced to $O(m^{1+\\epsilon})$ for any $\\epsilon>0$. Thus, the dual cut-toggling algorithm can achieve (almost) the same running time as its primal cycle-toggling counterpart.","PeriodicalId":123734,"journal":{"name":"Information Technology Convergence and Services","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133297949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
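For context on the problem being solved (this is a dense-algebra baseline for small graphs, not the paper's combinatorial algorithm): solving the Laplacian system $Lx = b$ yields the potentials of the electrical flow routing demand $b$, and since $L$ is singular (constant vectors lie in its kernel) one solves via the pseudoinverse.

```python
import numpy as np

def laplacian(n, edges):
    """Weighted graph Laplacian: L = sum over edges (u,v,w) of
    w * (e_u - e_v)(e_u - e_v)^T."""
    L = np.zeros((n, n))
    for u, v, w in edges:
        L[u, u] += w; L[v, v] += w
        L[u, v] -= w; L[v, u] -= w
    return L

def potentials(n, edges, b):
    """Potentials of the electrical flow routing demand b (sum(b) = 0):
    solve Lx = b with the pseudoinverse, since L is singular."""
    return np.linalg.pinv(laplacian(n, edges)) @ b
```

On a unit-weight path 0-1-2 with one unit of current injected at node 0 and extracted at node 2, the potential drops by exactly one across each edge, matching Ohm's law.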