Improvements to Sanov and PAC Sublevel-set Bounds for Discrete Random Variables

M. A. Tope, Joel M. Morris
Published in: 2021 55th Annual Conference on Information Sciences and Systems (CISS), 2021-03-24
DOI: 10.1109/CISS50987.2021.9400225 (https://doi.org/10.1109/CISS50987.2021.9400225)
Citations: 3

Abstract

We derive an improved convergence rate for probably approximately correct (PAC) sublevel-set bounds for multinomially distributed discrete random variables. Previous bounds (including Sanov's Theorem) show that the Kullback-Leibler (KL) divergence between the empirical probability mass function (pmf) and the true pmf converges at rate $O(\log(N)/N)$, where $N$ is the number of independent and identically distributed (i.i.d.) samples used to compute the empirical pmf. We interpret the KL divergence as bounding the probability that a multinomially distributed random variable (RV) deviates into a halfspace, and construct improved uniform PAC sublevel-set bounds that converge at rate $O(\log(\log(N))/N)$. These results bound the worst-case performance of a number of machine learning algorithms. Finally, the 'halfspace bound' methodology suggests that further improvements are possible for non-uniform bounds.
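To make the $O(\log(N)/N)$ baseline concrete, the following is a minimal sketch of the classical method-of-types (Sanov-style) PAC threshold, not the improved bound derived in the paper. It assumes the standard inequality $P(D(\hat{p}\,\|\,p) \ge \epsilon) \le (N+1)^k e^{-N\epsilon}$ for an alphabet of size $k$ with the KL divergence measured in nats. The function names, the example pmf, and the value of delta are illustrative choices that do not come from the paper.

```python
import math
import random
from collections import Counter


def kl_divergence(p_hat, p):
    """KL divergence D(p_hat || p) in nats, using the convention 0 * log(0/q) = 0."""
    return sum(a * math.log(a / b) for a, b in zip(p_hat, p) if a > 0)


def empirical_pmf(samples, k):
    """Empirical pmf of the samples over the alphabet {0, ..., k-1}."""
    counts = Counter(samples)
    n = len(samples)
    return [counts.get(i, 0) / n for i in range(k)]


def classical_threshold(n, k, delta):
    """Classical method-of-types threshold (an assumption, not the paper's bound).

    Solving (n + 1)^k * exp(-n * eps) = delta for eps gives a level that the
    empirical KL divergence stays below with probability at least 1 - delta.
    It decays like log(n) / n.
    """
    return (k * math.log(n + 1) + math.log(1.0 / delta)) / n


def violation_rate(p, n, delta, trials, rng):
    """Fraction of trials whose empirical KL divergence exceeds the threshold."""
    k = len(p)
    eps = classical_threshold(n, k, delta)
    violations = 0
    for _ in range(trials):
        samples = rng.choices(range(k), weights=p, k=n)
        if kl_divergence(empirical_pmf(samples, k), p) > eps:
            violations += 1
    return violations / trials


if __name__ == "__main__":
    rng = random.Random(0)   # fixed seed so repeated runs match
    p = [0.5, 0.3, 0.2]      # hypothetical true pmf
    delta = 0.05             # hypothetical confidence parameter
    for n in (100, 1000, 10000):
        eps = classical_threshold(n, len(p), delta)
        rate = violation_rate(p, n, delta, trials=100, rng=rng)
        print(f"N={n:6d}  threshold={eps:.5f}  violation rate={rate:.3f}")
```

In runs of this sketch the observed violation rate should sit well below delta, which reflects how loose the $\log(N)/N$ threshold is in practice; that slack is the gap the abstract's $O(\log(\log(N))/N)$ rate aims to narrow.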