{"title":"Nonparametric Kullback-Liebler Divergence Estimation Using M-Spacing","authors":"Linyun He, Eunhye Song","doi":"10.1109/WSC52266.2021.9715376","DOIUrl":null,"url":null,"abstract":"Entropy of a random variable with unknown distribution function can be estimated nonparametrically by spacing methods when independent and identically distributed (i.i.d.) observations of the random variable are available. We extend the classical entropy estimator based on sample spacing to define an m-spacing estimator for the Kullback-Liebler (KL) divergence between two i.i.d. observations with unknown distribution functions, which can be applied to measure discrepancy between real-world system output and simulation output as well as between two simulators' outputs. We show that the proposed estimator converges almost surely to the true KL divergence as the numbers of outputs collected from both systems increase under mild conditions and discuss the required choices for $m$ and the simulation output sample size as functions of the real-world sample size. Additionally, we show Central Limit Theorems for the proposed estimator with appropriate scaling.","PeriodicalId":369368,"journal":{"name":"2021 Winter Simulation Conference (WSC)","volume":"87 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 Winter Simulation Conference (WSC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/WSC52266.2021.9715376","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1
Abstract
Entropy of a random variable with an unknown distribution function can be estimated nonparametrically by spacing methods when independent and identically distributed (i.i.d.) observations of the random variable are available. We extend the classical sample-spacing entropy estimator to define an m-spacing estimator of the Kullback-Leibler (KL) divergence between two unknown distributions, each observed through i.i.d. samples. The estimator can be applied to measure the discrepancy between real-world system output and simulation output, as well as between the outputs of two simulators. We show that, under mild conditions, the proposed estimator converges almost surely to the true KL divergence as the numbers of outputs collected from both systems increase, and we discuss the required choices of $m$ and the simulation output sample size as functions of the real-world sample size. Additionally, we establish central limit theorems for the proposed estimator under appropriate scaling.
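To make the spacing-based construction concrete, below is a minimal Python sketch of one way such an estimator can be assembled, assuming continuous, tie-free one-dimensional samples. It combines the classical Vasicek m-spacing entropy estimator with an m-spacing density estimate of the second sample, via the identity $D(P\|Q) = -H(P) - \mathbb{E}_P[\log q]$. This is an illustration only, not necessarily the exact estimator analyzed in the paper; the function names, the index-clamping boundary convention, and the omission of any bias correction are all assumptions of this sketch.

```python
import numpy as np

def m_spacing_entropy(x, m):
    """Vasicek m-spacing entropy estimate for a 1-D i.i.d. sample:
    (1/n) * sum_i log( n * (x_(i+m) - x_(i-m)) / (2m) ),
    with order-statistic indices clamped at the sample boundaries."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    i = np.arange(n)
    upper = x[np.minimum(i + m, n - 1)]  # x_(i+m), clamped at x_(n)
    lower = x[np.maximum(i - m, 0)]      # x_(i-m), clamped at x_(1)
    return np.mean(np.log(n * (upper - lower) / (2 * m)))

def m_spacing_kl(x, y, m):
    """Illustrative spacing-based estimate of D(P || Q) from x ~ P, y ~ Q.

    Uses D(P||Q) = -H(P) - E_P[log q]: H(P) from the Vasicek estimator
    above, and q(x_i) from an m-spacing density estimate on the sorted
    y sample, q_hat(x_i) = 2m / (n_y * (y_(j+m) - y_(j-m))), where j is
    the rank of x_i among the y's."""
    x = np.asarray(x, dtype=float)
    y = np.sort(np.asarray(y, dtype=float))
    n_y = len(y)
    j = np.searchsorted(y, x)            # rank of each x_i within y
    upper = y[np.minimum(j + m, n_y - 1)]
    lower = y[np.maximum(j - m, 0)]
    log_q = np.log(2 * m) - np.log(n_y * (upper - lower))
    return -m_spacing_entropy(x, m) - np.mean(log_q)

# Example: N(0,1) vs N(1,1); the true KL divergence is (1-0)^2 / 2 = 0.5.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=5000)
y = rng.normal(1.0, 1.0, size=5000)
print(m_spacing_kl(x, y, m=20))  # roughly 0.5 for large samples
```

The example matches the intuition in the abstract: the closer the spacings of the two samples agree in the regions where P puts mass, the closer the estimate is to zero. The Vasicek estimator carries a known finite-sample bias, so in practice $m$ must grow with the sample size (as the abstract discusses) for the estimate to converge.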