{"title":"估计熵和互信息的记忆复杂度","authors":"Tomer Berg;Or Ordentlich;Ofer Shayevitz","doi":"10.1109/TIT.2025.3547871","DOIUrl":null,"url":null,"abstract":"We observe an infinite sequence of independent identically distributed random variables <inline-formula> <tex-math>$X_{1},X_{2},\\ldots $ </tex-math></inline-formula> drawn from an unknown distribution <italic>p</i> over <inline-formula> <tex-math>$[n]$ </tex-math></inline-formula>, and our goal is to estimate the entropy <inline-formula> <tex-math>$H(p)=-\\mathop {\\mathrm {\\mathbb {E}}}\\nolimits [\\log p(X)]$ </tex-math></inline-formula> within an <inline-formula> <tex-math>$\\varepsilon $ </tex-math></inline-formula>-additive error. To that end, at each time point we are allowed to update a finite-state machine with <italic>S</i> states, using a possibly randomized but time-invariant rule, where each state of the machine is assigned an entropy estimate. Our goal is to characterize the minimax memory complexity <inline-formula> <tex-math>$S^{*}$ </tex-math></inline-formula> of this problem, which is the minimal number of states for which the estimation task is feasible with probability at least <inline-formula> <tex-math>$1-\\delta $ </tex-math></inline-formula> asymptotically, uniformly in <italic>p</i>. Specifically, we show that there exist universal constants <inline-formula> <tex-math>$C_{1}$ </tex-math></inline-formula> and <inline-formula> <tex-math>$C_{2}$ </tex-math></inline-formula> such that <inline-formula> <tex-math>$ S^{*} \\leq C_{1}\\cdot \\frac {n (\\log n)^{4}}{\\varepsilon ^{2}\\delta }$ </tex-math></inline-formula> for <inline-formula> <tex-math>$\\varepsilon $ </tex-math></inline-formula> not too small, and <inline-formula> <tex-math>$S^{*} \\geq C_{2} \\cdot \\max \\left \\{{{n, \\frac {\\log n}{\\varepsilon }}}\\right \\}$ </tex-math></inline-formula> for <inline-formula> <tex-math>$\\varepsilon $ </tex-math></inline-formula> not too large. The upper bound is proved using approximate counting to estimate the logarithm of <italic>p</i>, and a finite memory bias estimation machine to estimate the expectation operation. The lower bound is proved via a reduction of entropy estimation to uniformity testing. We also apply these results to derive bounds on the memory complexity of mutual information estimation.","PeriodicalId":13494,"journal":{"name":"IEEE Transactions on Information Theory","volume":"71 5","pages":"3334-3349"},"PeriodicalIF":2.2000,"publicationDate":"2025-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Memory Complexity of Estimating Entropy and Mutual Information\",\"authors\":\"Tomer Berg;Or Ordentlich;Ofer Shayevitz\",\"doi\":\"10.1109/TIT.2025.3547871\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We observe an infinite sequence of independent identically distributed random variables <inline-formula> <tex-math>$X_{1},X_{2},\\\\ldots $ </tex-math></inline-formula> drawn from an unknown distribution <italic>p</i> over <inline-formula> <tex-math>$[n]$ </tex-math></inline-formula>, and our goal is to estimate the entropy <inline-formula> <tex-math>$H(p)=-\\\\mathop {\\\\mathrm {\\\\mathbb {E}}}\\\\nolimits [\\\\log p(X)]$ </tex-math></inline-formula> within an <inline-formula> <tex-math>$\\\\varepsilon $ </tex-math></inline-formula>-additive error. 
To that end, at each time point we are allowed to update a finite-state machine with <italic>S</i> states, using a possibly randomized but time-invariant rule, where each state of the machine is assigned an entropy estimate. Our goal is to characterize the minimax memory complexity <inline-formula> <tex-math>$S^{*}$ </tex-math></inline-formula> of this problem, which is the minimal number of states for which the estimation task is feasible with probability at least <inline-formula> <tex-math>$1-\\\\delta $ </tex-math></inline-formula> asymptotically, uniformly in <italic>p</i>. Specifically, we show that there exist universal constants <inline-formula> <tex-math>$C_{1}$ </tex-math></inline-formula> and <inline-formula> <tex-math>$C_{2}$ </tex-math></inline-formula> such that <inline-formula> <tex-math>$ S^{*} \\\\leq C_{1}\\\\cdot \\\\frac {n (\\\\log n)^{4}}{\\\\varepsilon ^{2}\\\\delta }$ </tex-math></inline-formula> for <inline-formula> <tex-math>$\\\\varepsilon $ </tex-math></inline-formula> not too small, and <inline-formula> <tex-math>$S^{*} \\\\geq C_{2} \\\\cdot \\\\max \\\\left \\\\{{{n, \\\\frac {\\\\log n}{\\\\varepsilon }}}\\\\right \\\\}$ </tex-math></inline-formula> for <inline-formula> <tex-math>$\\\\varepsilon $ </tex-math></inline-formula> not too large. The upper bound is proved using approximate counting to estimate the logarithm of <italic>p</i>, and a finite memory bias estimation machine to estimate the expectation operation. The lower bound is proved via a reduction of entropy estimation to uniformity testing. We also apply these results to derive bounds on the memory complexity of mutual information estimation.\",\"PeriodicalId\":13494,\"journal\":{\"name\":\"IEEE Transactions on Information Theory\",\"volume\":\"71 5\",\"pages\":\"3334-3349\"},\"PeriodicalIF\":2.2000,\"publicationDate\":\"2025-03-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Information Theory\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10909664/\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Information Theory","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10909664/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Memory Complexity of Estimating Entropy and Mutual Information
We observe an infinite sequence of independent identically distributed random variables $X_{1},X_{2},\ldots$ drawn from an unknown distribution $p$ over $[n]$, and our goal is to estimate the entropy $H(p) = -\mathbb{E}[\log p(X)]$ within an $\varepsilon$-additive error. To that end, at each time point we are allowed to update a finite-state machine with $S$ states, using a possibly randomized but time-invariant rule, where each state of the machine is assigned an entropy estimate. Our goal is to characterize the minimax memory complexity $S^{*}$ of this problem, which is the minimal number of states for which the estimation task is feasible with probability at least $1-\delta$ asymptotically, uniformly in $p$. Specifically, we show that there exist universal constants $C_{1}$ and $C_{2}$ such that $S^{*} \leq C_{1} \cdot \frac{n(\log n)^{4}}{\varepsilon^{2}\delta}$ for $\varepsilon$ not too small, and $S^{*} \geq C_{2} \cdot \max\left\{n, \frac{\log n}{\varepsilon}\right\}$ for $\varepsilon$ not too large. The upper bound is proved using approximate counting to estimate the logarithm of $p$, and a finite-memory bias estimation machine to estimate the expectation operation. The lower bound is proved via a reduction of entropy estimation to uniformity testing. We also apply these results to derive bounds on the memory complexity of mutual information estimation.
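The upper-bound construction described above combines approximate counting (to estimate $\log p(X)$) with a finite-memory bias estimation machine (to handle the expectation). As a rough illustration of the approximate-counting idea only, here is a Python sketch that estimates $H(p)$ by Morris-counting the occurrences of a freshly drawn reference symbol in a window of samples. The function names, the window size, the number of trials, and the unconstrained averaging loop are all illustrative assumptions; this toy simulation does not reproduce the paper's $S$-state construction or its guarantees.

```python
import math
import random

def morris_increment(c: int) -> int:
    """One Morris-counter update: with probability 2**(-c), increment the state."""
    return c + 1 if random.random() < 2.0 ** (-c) else c

def toy_entropy_estimate(sample, n_trials: int = 500, window: int = 2048) -> float:
    """Crude plug-in estimate of H(p) = E[-log p(X)] in nats.

    Each trial draws a reference symbol X ~ p, approximately counts its
    occurrences among `window` fresh samples with a Morris counter, and uses
    c * log(2) - log(window) as a rough proxy for log p(X). Averaging the
    negated proxies over trials mimics the outer expectation.
    """
    total = 0.0
    for _ in range(n_trials):
        x = sample()          # reference symbol whose -log p(x) we probe
        c = 0                 # Morris counter state: 2**c - 1 tracks the hit count
        for _ in range(window):
            if sample() == x:
                c = morris_increment(c)
        total += math.log(window) - c * math.log(2)   # ~ -log p(x), up to bias
    return total / n_trials

if __name__ == "__main__":
    n = 8
    uniform = lambda: random.randrange(n)          # p = uniform over [n]
    print("estimate:", toy_entropy_estimate(uniform))
    print("true H(p):", math.log(n))               # log(8) ≈ 2.079 nats
```

The sketch deliberately ignores the memory constraint: it stores floating-point sums and counters of unbounded precision, whereas the paper's machine must implement both the counting and the averaging within $S$ states.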
Journal Introduction:
The IEEE Transactions on Information Theory is a journal that publishes theoretical and experimental papers concerned with the transmission, processing, and utilization of information. The boundaries of acceptable subject matter are intentionally not sharply delimited. Rather, it is hoped that as the focus of research activity changes, a flexible policy will permit this Transactions to follow suit. Current appropriate topics are best reflected by recent Tables of Contents; they are summarized in the titles of editorial areas that appear on the inside front cover.