Metric Entropy Duality and the Sample Complexity of Outcome Indistinguishability

Lunjia Hu, Charlotte Peale, Omer Reingold
{"title":"Metric Entropy Duality and the Sample Complexity of Outcome Indistinguishability","authors":"Lunjia Hu, Charlotte Peale, Omer Reingold","doi":"10.48550/arXiv.2203.04536","DOIUrl":null,"url":null,"abstract":"We give the first sample complexity characterizations for outcome indistinguishability, a theoretical framework of machine learning recently introduced by Dwork, Kim, Reingold, Rothblum, and Yona (STOC 2021). In outcome indistinguishability, the goal of the learner is to output a predictor that cannot be distinguished from the target predictor by a class $D$ of distinguishers examining the outcomes generated according to the predictors' predictions. In the distribution-specific and realizable setting where the learner is given the data distribution together with a predictor class $P$ containing the target predictor, we show that the sample complexity of outcome indistinguishability is characterized by the metric entropy of $P$ w.r.t. the dual Minkowski norm defined by $D$, and equivalently by the metric entropy of $D$ w.r.t. the dual Minkowski norm defined by $P$. This equivalence makes an intriguing connection to the long-standing metric entropy duality conjecture in convex geometry. Our sample complexity characterization implies a variant of metric entropy duality, which we show is nearly tight. In the distribution-free setting, we focus on the case considered by Dwork et al. where $P$ contains all possible predictors, hence the sample complexity only depends on $D$. In this setting, we show that the sample complexity of outcome indistinguishability is characterized by the fat-shattering dimension of $D$. We also show a strong sample complexity separation between realizable and agnostic outcome indistinguishability in both the distribution-free and the distribution-specific settings. This is in contrast to distribution-free (resp. distribution-specific) PAC learning where the sample complexity in both the realizable and the agnostic settings can be characterized by the VC dimension (resp. metric entropy).","PeriodicalId":267197,"journal":{"name":"International Conference on Algorithmic Learning Theory","volume":"2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-03-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Algorithmic Learning Theory","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2203.04536","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5

Abstract

We give the first sample complexity characterizations for outcome indistinguishability, a theoretical framework for machine learning recently introduced by Dwork, Kim, Reingold, Rothblum, and Yona (STOC 2021). In outcome indistinguishability, the goal of the learner is to output a predictor that cannot be distinguished from the target predictor by a class $D$ of distinguishers examining the outcomes generated according to the predictors' predictions. In the distribution-specific and realizable setting, where the learner is given the data distribution together with a predictor class $P$ containing the target predictor, we show that the sample complexity of outcome indistinguishability is characterized by the metric entropy of $P$ w.r.t. the dual Minkowski norm defined by $D$, and equivalently by the metric entropy of $D$ w.r.t. the dual Minkowski norm defined by $P$. This equivalence makes an intriguing connection to the long-standing metric entropy duality conjecture in convex geometry. Our sample complexity characterization implies a variant of metric entropy duality, which we show is nearly tight. In the distribution-free setting, we focus on the case considered by Dwork et al. where $P$ contains all possible predictors, so the sample complexity depends only on $D$. In this setting, we show that the sample complexity of outcome indistinguishability is characterized by the fat-shattering dimension of $D$. We also show a strong sample complexity separation between realizable and agnostic outcome indistinguishability in both the distribution-free and distribution-specific settings. This contrasts with distribution-free (resp. distribution-specific) PAC learning, where the sample complexity in both the realizable and agnostic settings can be characterized by the VC dimension (resp. metric entropy).
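For reference, the following display sketches the standard definitions behind the abstract's key quantities. This is a hedged reconstruction in generic notation (domain $X$, data distribution $\mu$, predictors $p : X \to [0,1]$, single-sample distinguishers); the paper's exact formalization may differ. A distinguisher $d \in D$ receives a sample $(x, o)$ with $x \sim \mu$ and outcome $o$ drawn from $\mathrm{Ber}(p(x))$, and its advantage in telling a candidate $p$ apart from the target $p^*$ is

$$ \mathrm{adv}_d(p, p^*) \;=\; \Bigl| \Pr_{x \sim \mu,\; o \sim \mathrm{Ber}(p(x))}\bigl[d(x,o)=1\bigr] \;-\; \Pr_{x \sim \mu,\; o \sim \mathrm{Ber}(p^*(x))}\bigl[d(x,o)=1\bigr] \Bigr|, $$

with $p$ being $\varepsilon$-indistinguishable from $p^*$ if $\mathrm{adv}_d(p, p^*) \le \varepsilon$ for every $d \in D$. Writing $g_d(x) = \Pr[d(x,1)=1] - \Pr[d(x,0)=1]$ and expanding the two acceptance probabilities, the advantage simplifies to a linear functional of the difference of predictors, which explains the dual-norm language in the abstract:

$$ \mathrm{adv}_d(p, p^*) \;=\; \Bigl| \mathbb{E}_{x \sim \mu}\bigl[\, g_d(x)\,\bigl(p(x) - p^*(x)\bigr) \bigr] \Bigr|, \qquad \|p - p'\|_{D} \;=\; \sup_{d \in D}\; \Bigl| \mathbb{E}_{x \sim \mu}\bigl[\, g_d(x)\,\bigl(p(x) - p'(x)\bigr) \bigr] \Bigr|. $$

The metric entropy of $P$ at scale $\varepsilon$ under this norm is then the log covering number $\log N(P, \|\cdot\|_{D}, \varepsilon)$, i.e., the logarithm of the size of the smallest set $C$ such that every $p \in P$ has some $c \in C$ with $\|p - c\|_{D} \le \varepsilon$. Similarly, the fat-shattering dimension at margin $\gamma$ of a real-valued class $F$ (here, a real-valued representation of $D$ such as the functions $g_d$) is the largest $m$ for which there exist points $x_1, \dots, x_m$ and thresholds $r_1, \dots, r_m$ such that for every $S \subseteq [m]$ some $f \in F$ satisfies $f(x_i) \ge r_i + \gamma$ for $i \in S$ and $f(x_i) \le r_i - \gamma$ for $i \notin S$.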