Zachary Chase, Bogdan Chornomaz, Steve Hanneke, Shay Moran, Amir Yehudayoff
arXiv - CS - Discrete Mathematics, 2024-05-27, DOI: arxiv-2405.17120
Dual VC Dimension Obstructs Sample Compression by Embeddings
This work studies embedding of arbitrary VC classes in well-behaved VC
classes, focusing particularly on extremal classes. Our main result expresses
an impossibility: such embeddings necessarily require a significant increase in
dimension. In particular, we prove that for every $d$ there is a class with VC
dimension $d$ that cannot be embedded in any extremal class of VC dimension
smaller than exponential in $d$.

In addition to its independent interest, this result has an important
implication in learning theory, as it reveals a fundamental limitation of one
of the most extensively studied approaches to tackling the long-standing sample
compression conjecture. Concretely, the approach proposed by Floyd and Warmuth
entails embedding any given VC class into an extremal class of a comparable
dimension, and then applying an optimal sample compression scheme for extremal
classes. However, our results imply that this strategy would in some cases
result in a sample compression scheme at least exponentially larger than what
is predicted by the sample compression conjecture.

The above implications follow from a general result we prove: any extremal
class with VC dimension $d$ has dual VC dimension at most $2d+1$. This bound is
exponentially smaller than the classical bound $2^{d+1}-1$ of Assouad, which
applies to general concept classes (and is known to be unimprovable for some
classes). We in fact prove a stronger result, establishing that $2d+1$ upper
bounds the dual Radon number of extremal classes. This theorem represents an
abstraction of the classical Radon theorem for convex sets, extending its
applicability to a wider combinatorial framework, without relying on the
specifics of Euclidean convexity. The proof utilizes the topological method and
is primarily based on variants of the Topological Radon Theorem.
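
The quantities compared above (VC dimension, dual VC dimension, and extremality) can be computed by brute force on very small finite classes. The following Python sketch is an illustration only and is not taken from the paper: all function names are hypothetical, the search is exponential in the domain size, and extremality is tested through the standard characterization that a class is extremal exactly when the number of sets it shatters equals the number of its concepts. Concepts are assumed to be given as a non-empty list of `frozenset` objects over the domain.

```python
from itertools import combinations


def shatters(concepts, subset):
    """Return True if the concepts realize all 2^|subset| patterns on subset."""
    patterns = {c & subset for c in concepts}
    return len(patterns) == 2 ** len(subset)


def shattered_sets(domain, concepts):
    """List every subset of the domain shattered by the class (brute force)."""
    points = list(domain)
    return [
        frozenset(s)
        for k in range(len(points) + 1)
        for s in combinations(points, k)
        if shatters(concepts, frozenset(s))
    ]


def vc_dim(domain, concepts):
    """VC dimension: size of the largest shattered set (class must be non-empty)."""
    return max(len(s) for s in shattered_sets(domain, concepts))


def dual_class(domain, concepts):
    """Dual class: concepts become points (by index), and each point x of the
    original domain becomes the dual concept {i : x in concepts[i]}."""
    concepts = list(concepts)
    dual_domain = list(range(len(concepts)))
    dual_concepts = [
        frozenset(i for i, c in enumerate(concepts) if x in c) for x in domain
    ]
    return dual_domain, dual_concepts


def dual_vc_dim(domain, concepts):
    """Dual VC dimension: VC dimension of the dual class."""
    return vc_dim(*dual_class(domain, concepts))


def is_extremal(domain, concepts):
    """Extremal iff the number of shattered sets equals the number of concepts."""
    return len(set(concepts)) == len(shattered_sets(domain, concepts))


# Toy class A: all subsets of {0, 1, 2} of size at most 1.
domain_a = [0, 1, 2]
class_a = [frozenset(), frozenset({0}), frozenset({1}), frozenset({2})]

# Toy class B: points are the subsets S of {0, 1, 2}; concept i collects
# the points S that contain i. Its three concepts are shattered in the dual.
domain_b = [frozenset(s) for k in range(4) for s in combinations(range(3), k)]
class_b = [frozenset(S for S in domain_b if i in S) for i in range(3)]
```

On these two toy classes, class A is extremal with VC dimension 1 and dual VC dimension 1, which respects the $2d+1$ bound. Class B is not extremal; it has VC dimension 1 and dual VC dimension 3, matching Assouad's bound $2^{d+1}-1$ at $d=1$. Since $2d+1$ and $2^{d+1}-1$ coincide at $d=1$, the exponential gap described in the abstract only appears for larger $d$, where brute force quickly becomes impractical.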