Dual VC Dimension Obstructs Sample Compression by Embeddings
Zachary Chase, Bogdan Chornomaz, Steve Hanneke, Shay Moran, Amir Yehudayoff
arXiv - CS - Discrete Mathematics, 2024-05-27. https://doi.org/arxiv-2405.17120
Abstract
This work studies embedding of arbitrary VC classes in well-behaved VC
classes, focusing particularly on extremal classes. Our main result expresses
an impossibility: such embeddings necessarily require a significant increase in
dimension. In particular, we prove that for every $d$ there is a class with VC
dimension $d$ that cannot be embedded in any extremal class of VC dimension
smaller than exponential in $d$.

In addition to its independent interest, this result has an important
implication in learning theory, as it reveals a fundamental limitation of one
of the most extensively studied approaches to tackling the long-standing sample
compression conjecture. Concretely, the approach proposed by Floyd and Warmuth
entails embedding any given VC class into an extremal class of a comparable
dimension, and then applying an optimal sample compression scheme for extremal
classes. However, our results imply that this strategy would in some cases
result in a sample compression scheme at least exponentially larger than what
is predicted by the sample compression conjecture.

The above implications follow from a general result we prove: any extremal
class with VC dimension $d$ has dual VC dimension at most $2d+1$. This bound is
exponentially smaller than the classical bound $2^{d+1}-1$ of Assouad, which
applies to general concept classes (and is known to be unimprovable for some
classes). We in fact prove a stronger result, establishing that $2d+1$ upper
bounds the dual Radon number of extremal classes. This theorem represents an
abstraction of the classical Radon theorem for convex sets, extending its
applicability to a wider combinatorial framework, without relying on the
specifics of Euclidean convexity. The proof utilizes the topological method and
is primarily based on variants of the Topological Radon Theorem.
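
To make the quantities in the abstract concrete, the following is a minimal brute-force sketch, not the authors' method. It computes the VC dimension and the dual VC dimension of a small finite class and checks them against the two bounds quoted above. It assumes the standard definitions: a set is shattered when every pattern on it is realized by some concept, the dual class swaps the roles of points and concepts, and a class is extremal when the number of sets it shatters equals the number of its concepts. All function names and the toy class are illustrative assumptions, and the enumeration is exponential, so it is only suitable for tiny domains.

```python
from itertools import combinations


def shatters(concepts, subset):
    """True if all 2^|subset| patterns on `subset` are realized by `concepts`."""
    patterns = {frozenset(c & subset) for c in concepts}
    return len(patterns) == 2 ** len(subset)


def shattered_sets(concepts, domain):
    """All subsets of `domain` shattered by `concepts` (brute force)."""
    domain = list(domain)
    return [
        frozenset(s)
        for k in range(len(domain) + 1)
        for s in combinations(domain, k)
        if shatters(concepts, frozenset(s))
    ]


def vc_dim(concepts, domain):
    """VC dimension of a non-empty finite class: size of the largest shattered set."""
    return max(len(s) for s in shattered_sets(concepts, domain))


def dual_class(concepts, domain):
    """Dual class: its points are the concepts; each x yields the set {c : x in c}."""
    dual_domain = list(concepts)
    dual_concepts = {frozenset(c for c in dual_domain if x in c) for x in domain}
    return dual_concepts, dual_domain


def is_extremal(concepts, domain):
    """Assumed definition: the class shatters exactly as many sets as it has concepts."""
    return len(shattered_sets(concepts, domain)) == len(set(concepts))


# Toy class (an assumption for illustration): all subsets of {0,1,2,3} of size at most 1.
domain = range(4)
concepts = {frozenset()} | {frozenset({x}) for x in domain}

d = vc_dim(concepts, domain)
dual_concepts, dual_domain = dual_class(concepts, domain)
d_star = vc_dim(dual_concepts, dual_domain)

print("VC dimension:", d)
print("dual VC dimension:", d_star)
print("extremal:", is_extremal(concepts, domain))
print("within 2d+1:", d_star <= 2 * d + 1)
print("within Assouad's 2^(d+1)-1:", d_star <= 2 ** (d + 1) - 1)
```

The toy class is downward closed, so the sets it shatters are exactly its own members, which is why it passes the extremality check. Swapping in a non-extremal class lets one compare its dual VC dimension against the weaker bound $2^{d+1}-1$ instead.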