Dual VC Dimension Obstructs Sample Compression by Embeddings

arXiv - CS - Discrete Mathematics Pub Date : 2024-05-27 DOI:arxiv-2405.17120

Zachary Chase, Bogdan Chornomaz, Steve Hanneke, Shay Moran, Amir Yehudayoff



Abstract

This work studies embedding of arbitrary VC classes in well-behaved VC classes, focusing particularly on extremal classes. Our main result expresses an impossibility: such embeddings necessarily require a significant increase in dimension. In particular, we prove that for every $d$ there is a class with VC dimension $d$ that cannot be embedded in any extremal class of VC dimension smaller than exponential in $d$. In addition to its independent interest, this result has an important implication in learning theory, as it reveals a fundamental limitation of one of the most extensively studied approaches to tackling the long-standing sample compression conjecture. Concretely, the approach proposed by Floyd and Warmuth entails embedding any given VC class into an extremal class of a comparable dimension, and then applying an optimal sample compression scheme for extremal classes. However, our results imply that this strategy would in some cases result in a sample compression scheme at least exponentially larger than what is predicted by the sample compression conjecture. The above implications follow from a general result we prove: any extremal class with VC dimension $d$ has dual VC dimension at most $2d+1$. This bound is exponentially smaller than the classical bound $2^{d+1}-1$ of Assouad, which applies to general concept classes (and is known to be unimprovable for some classes). We in fact prove a stronger result, establishing that $2d+1$ upper bounds the dual Radon number of extremal classes. This theorem represents an abstraction of the classical Radon theorem for convex sets, extending its applicability to a wider combinatorial framework, without relying on the specifics of Euclidean convexity. The proof utilizes the topological method and is primarily based on variants of the Topological Radon Theorem.
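To make the two bounds in the abstract concrete, the following is a minimal brute-force sketch, not taken from the paper, that computes the VC dimension and the dual VC dimension of a small finite class. The helper names `vc_dimension` and `dual_class` and the example class are illustrative assumptions. The example takes the $2^7$ subsets of $\{0,\dots,6\}$ as points and, for each $i$, the concept $C_i = \{S : i \in S\}$. This class has VC dimension $d = 2$ and dual VC dimension $7 = 2^{d+1}-1$, so it attains Assouad's bound and exceeds $2d+1 = 5$.

```python
from itertools import combinations


def vc_dimension(concepts, domain):
    """Brute-force VC dimension of a finite, nonempty class of subsets of `domain`."""
    concepts = list({frozenset(c) for c in concepts})  # drop duplicate concepts
    domain = list(domain)
    best = 0
    for k in range(1, len(domain) + 1):
        # Shattering k points needs 2^k distinct traces, hence 2^k concepts.
        if 2 ** k > len(concepts):
            break
        found = False
        for subset in combinations(domain, k):
            s = frozenset(subset)
            traces = {c & s for c in concepts}
            if len(traces) == 2 ** k:
                found = True
                break
        # If no k-set is shattered, no larger set can be shattered either.
        if not found:
            break
        best = k
    return best


def dual_class(concepts, domain):
    """Dual class: points are concept indices; each x in `domain` becomes the
    set of indices of the concepts that contain x."""
    concepts = [frozenset(c) for c in concepts]
    dual_concepts = [
        frozenset(i for i, c in enumerate(concepts) if x in c) for x in domain
    ]
    return dual_concepts, range(len(concepts))


# Illustrative class (an assumption of this sketch, not an example from the paper).
n = 7
points = [frozenset(s) for r in range(n + 1) for s in combinations(range(n), r)]
concepts = [frozenset(p for p in points if i in p) for i in range(n)]

d = vc_dimension(concepts, points)
d_star = vc_dimension(*dual_class(concepts, points))
print(d, d_star, 2 * d + 1, 2 ** (d + 1) - 1)  # prints: 2 7 5 7
```

Since the dual VC dimension here is larger than $2d+1$, the bound stated in the abstract implies that this class is not an extremal class of VC dimension $2$. The search is exponential in the size of the domain, so it is only suitable for toy examples.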


