The logarithmic Zipf law in a general urn problem
Authors: Aristides V. Doumas, V. Papanicolaou
DOI: 10.1051/ps/2020011 (https://doi.org/10.1051/ps/2020011)
Publication date: 2020-01-01
Publication type: Journal Article
Citations: 2
Abstract
The origin of power-law behavior (also known variously as Zipf’s law) has been a topic of debate in the scientific community for more than a century. Power laws appear widely in physics, biology, earth and planetary sciences, economics and finance, computer science, demography and the social sciences. In a highly cited article, Mark Newman [Contemp. Phys. 46 (2005) 323–351] reviewed some of the empirical evidence for the existence of power-law forms, but underscored that, even though many distributions do not follow a power law, many of the quantities that scientists measure are close to a Zipf law and hence are of importance. In this paper we study a variant of Zipf’s law in the setting of a general urn problem. A collector wishes to collect $m$ complete sets of $N$ distinct coupons. The draws from the population are independent and identically distributed, with replacement, and the probability that a type-$j$ coupon is drawn is denoted by $p_j$, $j = 1, 2, \dots, N$. Let $T_m(N)$ be the number of trials needed for this problem. We present the asymptotics for the expectation (five terms plus an error), the second rising moment (six terms plus an error), and the variance of $T_m(N)$ (leading term) as $N \to \infty$, when
\begin{equation*} p_{j}=\frac{a_{j}}{\sum_{j=2}^{N+1} a_{j}}, \,\,\,\text{where}\,\,\, a_{j}=\left(\ln j\right)^{-p}, \,\,p>0.\end{equation*}
Moreover, we prove that $T_m(N)$ (appropriately normalized) converges in distribution to a Gumbel random variable. These “log-Zipf” classes of coupon probabilities are not covered by the existing literature, and the present paper fills this gap. In the spirit of a recent paper of ours [ESAIM: PS 20 (2016) 367–399], we enlarge the classes for which the Dixie cup problem is solved with respect to its moments, variance, and distribution.
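The setup described in the abstract can be explored numerically with a small Monte Carlo sketch. This is only an illustration, not the authors' method: the paper derives asymptotics analytically, whereas the code below simply simulates draws. The function names (`log_zipf_probs`, `simulate_trials`, `estimate_mean`) and all parameter values are hypothetical, and the weights are indexed over $j = 2, \dots, N+1$, following the sum in the displayed formula, so that $\ln j > 0$.

```python
import math
import random
from itertools import accumulate


def log_zipf_probs(n, p):
    """Return p_j = a_j / sum(a_j) with a_j = (ln j)^(-p), j = 2..n+1 (assumed indexing)."""
    weights = [math.log(j) ** (-p) for j in range(2, n + 2)]
    total = sum(weights)
    return [w / total for w in weights]


def simulate_trials(probs, m, rng):
    """Draw coupons i.i.d. with replacement until every type has appeared at least m times.

    Returns the number of draws, i.e. one sample of T_m(N).
    """
    cum = list(accumulate(probs))      # cumulative weights, computed once
    types = range(len(probs))
    counts = [0] * len(probs)
    remaining = len(probs)             # number of types still below m copies
    trials = 0
    while remaining > 0:
        j = rng.choices(types, cum_weights=cum)[0]
        trials += 1
        counts[j] += 1
        if counts[j] == m:
            remaining -= 1
    return trials


def estimate_mean(n, p, m, runs, seed=0):
    """Crude Monte Carlo estimate of E[T_m(N)]; illustrative only."""
    rng = random.Random(seed)
    probs = log_zipf_probs(n, p)
    return sum(simulate_trials(probs, m, rng) for _ in range(runs)) / runs
```

For example, `estimate_mean(50, 1.0, 2, 200)` averages 200 simulated values of $T_2(50)$ for $p = 1$. Such estimates are noisy and say nothing about the higher-order terms in the paper's expansions; they are useful mainly as a sanity check on the order of magnitude.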