{"title":"Minimax Rate-optimal Estimation of KL Divergence between Discrete Distributions.","authors":"Yanjun Han, Jiantao Jiao, Tsachy Weissman","doi":"","DOIUrl":null,"url":null,"abstract":"<p><p>We refine the general methodology in [1] for the construction and analysis of essentially minimax estimators for a wide class of functionals of finite dimensional parameters, and elaborate on the case of discrete distributions with support size <i>S</i> comparable with the number of observations <i>n</i>. Specifically, we determine the \"smooth\" and \"non-smooth\" regimes based on the confidence set and the smoothness of the functional. In the \"non-smooth\" regime, we apply an unbiased estimator for a \"suitable\" polynomial approximation of the functional. In the \"smooth\" regime, we construct a bias corrected version of the Maximum Likelihood Estimator (MLE) based on Taylor expansion. We apply the general methodology to the problem of estimating the KL divergence between two discrete distributions from empirical data. We construct a minimax rate-optimal estimator which is adaptive in the sense that it does not require the knowledge of the support size nor the upper bound on the likelihood ratio. Moreover, the performance of the optimal estimator with <i>n</i> samples is essentially that of the MLE with <i>n</i> ln <i>n</i> samples, i.e., the <i>effective sample size enlargement</i> phenomenon holds.</p>","PeriodicalId":92224,"journal":{"name":"International Symposium on Information Theory and its Applications. International Symposium on Information Theory and its Applications","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2016-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5812299/pdf/nihms910323.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Symposium on Information Theory and its Applications. International Symposium on Information Theory and its Applications","FirstCategoryId":"1085","ListUrlMain":"","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
We refine the general methodology in [1] for the construction and analysis of essentially minimax estimators for a wide class of functionals of finite-dimensional parameters, and elaborate on the case of discrete distributions with support size S comparable to the number of observations n. Specifically, we determine the "smooth" and "non-smooth" regimes based on the confidence set and the smoothness of the functional. In the "non-smooth" regime, we apply an unbiased estimator for a "suitable" polynomial approximation of the functional. In the "smooth" regime, we construct a bias-corrected version of the Maximum Likelihood Estimator (MLE) based on a Taylor expansion. We apply the general methodology to the problem of estimating the KL divergence between two discrete distributions from empirical data. We construct a minimax rate-optimal estimator which is adaptive in the sense that it requires knowledge of neither the support size nor an upper bound on the likelihood ratio. Moreover, the performance of the optimal estimator with n samples is essentially that of the MLE with n ln n samples, i.e., the effective sample size enlargement phenomenon holds.
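For concreteness, the functional in question is the KL divergence D(P||Q) = sum_i p_i ln(p_i / q_i) over a common support of size S. The Python sketch below shows the plug-in (MLE) baseline that the paper improves upon, together with an illustrative first-order bias correction of its entropy part in the spirit of the "smooth"-regime Taylor correction. The function names, the eps smoothing of symbols unseen under Q, and the Miller-Madow-style correction term are assumptions made for illustration; the paper's actual estimator also corrects the cross term and switches to an unbiased estimator of a best polynomial approximation in the "non-smooth" regime, which is not reproduced here.

```python
import numpy as np

def empirical_pmf(samples, support_size):
    """Maximum-likelihood (plug-in) estimate of a pmf from iid samples
    taking values in {0, ..., support_size - 1}."""
    counts = np.bincount(samples, minlength=support_size)
    return counts / counts.sum()

def kl_plugin(samples_p, samples_q, support_size, eps=1e-10):
    """Plug-in (MLE) estimate of D(P || Q) = sum_i p_i * ln(p_i / q_i).

    eps floors the Q estimate so symbols unseen under Q do not cause a
    division by zero; this smoothing is an illustrative choice, not the
    paper's construction."""
    p_hat = empirical_pmf(samples_p, support_size)
    q_hat = np.maximum(empirical_pmf(samples_q, support_size), eps)
    mask = p_hat > 0  # terms with p_i = 0 contribute nothing to the sum
    return float(np.sum(p_hat[mask] * np.log(p_hat[mask] / q_hat[mask])))

def kl_bias_corrected(samples_p, samples_q, support_size, eps=1e-10):
    """Plug-in estimate with a Miller-Madow-style first-order correction
    of its entropy part: the plug-in entropy of P is biased downward by
    roughly (K - 1) / (2n), so the first term of the KL plug-in is biased
    upward by the same amount, and we subtract it. Only the entropy part
    is corrected here; the paper's estimator treats both terms."""
    n = len(samples_p)
    p_hat = empirical_pmf(samples_p, support_size)
    k_seen = int(np.count_nonzero(p_hat))  # symbols observed under P
    return kl_plugin(samples_p, samples_q, support_size, eps) - (k_seen - 1) / (2 * n)

# Toy usage: the plug-in is known to degrade when S is comparable to n,
# which is exactly the regime the paper targets.
rng = np.random.default_rng(0)
S, n = 500, 1000
p = rng.dirichlet(np.ones(S))
q = rng.dirichlet(np.ones(S))
xs, ys = rng.choice(S, size=n, p=p), rng.choice(S, size=n, p=q)
print(kl_plugin(xs, ys, S), kl_bias_corrected(xs, ys, S))
```

Note that the eps floor makes the plug-in blow up whenever a symbol is observed under P but never under Q; handling that degeneracy without assuming a known upper bound on the likelihood ratio p_i / q_i is part of what the adaptive construction in the paper achieves.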