Sjoerd Dirksen, Shahar Mendelson, Alexander Stollenwerk
Fast Metric Embedding into the Hamming Cube
DOI: 10.1137/22m1520220 (https://doi.org/10.1137/22m1520220). Publication date: 2024-03-14.
SIAM Journal on Computing, Volume 53, Issue 2, Page 315-345, April 2024. Abstract. We consider the problem of embedding a subset of $\mathbb{R}^n$ into a low-dimensional Hamming cube in an almost isometric way. We construct a simple, data-oblivious, and computationally efficient map that achieves this task with high probability. We first apply a specific structured random matrix, which we call the double circulant matrix; using that matrix requires linear storage, and matrix-vector multiplication can be performed in near-linear time. We then binarize each vector by comparing each of its entries to a random threshold, selected uniformly at random from a well-chosen interval. We estimate the number of bits required for this encoding scheme in terms of two natural geometric complexity parameters of the set: its Euclidean covering numbers and its localized Gaussian complexity. The estimate we derive turns out to be the best that one can hope for, up to logarithmic terms. The key to the proof is a phenomenon of independent interest: we show that the double circulant matrix mimics the behavior of the Gaussian matrix in two important ways. First, it maps an arbitrary set in $\mathbb{R}^n$ into a set of well-spread vectors. Second, it yields a fast near-isometric embedding of any finite subset of $\ell_2^n$ into $\ell_1^m$. This embedding achieves the same dimension reduction as the Gaussian matrix in near-linear time, under a condition on the number of points to be embedded that is optimal up to logarithmic factors. This improves a well-known construction due to Ailon and Chazelle.
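The two-step map described in the abstract (a structured random matrix followed by thresholding) can be illustrated with a minimal sketch. The abstract does not define the double circulant matrix, so the construction below is an assumption made for illustration only: two circulant matrices with Gaussian generators, each preceded by random sign flips, followed by subsampling of m coordinates. The function names, the normalization by sqrt(n), and the threshold half-width `lam` are likewise hypothetical and are not taken from the paper.

```python
import numpy as np


def circulant_matvec(c, x):
    """Return C @ x, where C is the circulant matrix whose first column is c.

    Uses the FFT, so the cost is O(n log n) and only c (n numbers) is stored.
    """
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))


def draw_params(n, m, lam, rng):
    """Draw all randomness up front; nothing here depends on the data."""
    if m > n:
        raise ValueError("this sketch subsamples rows, so it needs m <= n")
    return {
        "signs1": rng.choice([-1.0, 1.0], size=n),     # assumed random sign flips
        "gen1": rng.standard_normal(n),                # generator of 1st circulant
        "signs2": rng.choice([-1.0, 1.0], size=n),     # assumed random sign flips
        "gen2": rng.standard_normal(n),                # generator of 2nd circulant
        "rows": rng.choice(n, size=m, replace=False),  # coordinates that are kept
        "tau": rng.uniform(-lam, lam, size=m),         # uniform random thresholds
    }


def embed(p, x):
    """Map x in R^n to m bits: two circulant multiplications, then thresholding."""
    n = x.shape[0]
    # Assumed normalization so the intermediate vector keeps a norm close to |x|.
    y = circulant_matvec(p["gen1"], p["signs1"] * x) / np.sqrt(n)
    y = circulant_matvec(p["gen2"], p["signs2"] * y)
    # Compare each kept entry to its own random threshold.
    return (y[p["rows"]] + p["tau"]) >= 0.0


def hamming_fraction(a, b):
    """Fraction of positions where two bit vectors differ."""
    return np.count_nonzero(a != b) / a.size


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p = draw_params(n=1024, m=256, lam=4.0, rng=rng)
    x = rng.standard_normal(1024)
    x /= np.linalg.norm(x)
    y = rng.standard_normal(1024)
    y /= np.linalg.norm(y)
    # Print the fraction of differing bits next to the Euclidean distance.
    print(hamming_fraction(embed(p, x), embed(p, y)), np.linalg.norm(x - y))
```

The parameters are drawn once and reused for every input, which is what makes the map data-oblivious. Each circulant matrix is stored through its generator vector and applied via the FFT, matching the linear-storage and near-linear-time properties stated in the abstract. How many bits m are needed, and how the threshold interval should be chosen, are the subject of the paper's analysis and are not reproduced here.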
About the journal:
The SIAM Journal on Computing aims to provide coverage of the most significant work going on in the mathematical and formal aspects of computer science and nonnumerical computing. Submissions must be clearly written and make a significant technical contribution. Topics include but are not limited to analysis and design of algorithms, algorithmic game theory, data structures, computational complexity, computational algebra, computational aspects of combinatorics and graph theory, computational biology, computational geometry, computational robotics, the mathematical aspects of programming languages, artificial intelligence, computational learning, databases, information retrieval, cryptography, networks, distributed computing, parallel algorithms, and computer architecture.