On Sub-Gaussian Concentration of Missing Mass
M. Skorski
Theory of Probability and its Applications, published 2023-08-01
DOI: 10.1137/s0040585x97t991453 (https://doi.org/10.1137/s0040585x97t991453)
Statistical inference on the missing mass aims to estimate the weight of elements not observed during sampling. Since the pioneering work of Good and Turing, the problem has been studied in many areas, including statistical linguistics, ecology, and machine learning. Proving the sub-Gaussian behavior of the missing mass has been notoriously hard, and a number of complicated arguments have been proposed: logarithmic Sobolev inequalities, thermodynamic approaches, and information-theoretic transportation methods. Prior works have argued that the difficulty is inherent and that classical tools are inadequate. We show that this common belief is false: all that is needed to establish the sub-Gaussian concentration is the classical inequality of Bernstein. The strong educational value of our work lies in its demonstration of this inequality in its full generality, an aspect not well recognized by researchers.
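To make the quantity in the abstract concrete, here is a minimal Python sketch of the missing mass and of the Good-Turing estimate (the fraction of draws whose symbol appears exactly once). It is an illustration only and is not taken from the paper: the function names, the Zipf-like distribution, and the sample size are all assumptions chosen for the demo.

```python
import random
from collections import Counter


def missing_mass(probs, sample):
    """True missing mass: total probability of symbols absent from the sample.

    `probs[i]` is the probability of symbol i (hypothetical representation).
    """
    seen = set(sample)
    return sum(p for symbol, p in enumerate(probs) if symbol not in seen)


def good_turing_estimate(sample):
    """Good-Turing estimate of the missing mass: (# symbols seen once) / n."""
    counts = Counter(sample)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(sample)


def draw_sample(probs, n, rng):
    """Draw n i.i.d. symbols from the distribution `probs`."""
    return rng.choices(range(len(probs)), weights=probs, k=n)


if __name__ == "__main__":
    # Assumed toy setup: Zipf-like distribution over 1000 symbols, 500 draws.
    weights = [1.0 / (i + 1) for i in range(1000)]
    total = sum(weights)
    probs = [w / total for w in weights]

    rng = random.Random(0)
    sample = draw_sample(probs, 500, rng)
    print("true missing mass:   ", missing_mass(probs, sample))
    print("Good-Turing estimate:", good_turing_estimate(sample))
```

Repeating the draw with different seeds gives an empirical view of how tightly the missing mass clusters around its mean, which is the concentration behavior the paper addresses analytically.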
About the journal:
Theory of Probability and Its Applications (TVP) accepts original articles and communications on the theory of probability, general problems of mathematical statistics, and applications of the theory of probability to natural science and technology. Articles of the latter type will be accepted only if the mathematical methods applied are essentially new.