Fisher’s measure of variability in repeated samples
Poly H. da Silva, Arash Jamshidpey, P. McCullagh, S. Tavaré
Bernoulli, published 2023-05-01
DOI: 10.3150/22-bej1494 (https://doi.org/10.3150/22-bej1494)
Citations: 3
Abstract
Fisher (1943) claimed that the expected value of the sample variance of the number of species found in large samples, each of n specimens taken from the same population, is asymptotically θ log 2. This is at odds with the value θ log n obtained directly from the Ewens Sampling Formula (ESF), where θ specifies the rate at which new species are found. To resolve this apparent contradiction, we assume the species frequency spectrum in the population is determined by the ESF and that the samples are disjoint subsets drawn sequentially from this single population. We find an explicit formula for the required expected value for p samples of arbitrary size; in the limit of large equally sized samples, it indeed has the value θ log 2. We obtain limit theorems for the sample variance of p samples of size n under various limiting regimes as p, n or both tend to ∞. We discuss further the behavior of the number of species present in all samples, and revisit Fisher’s log-series distribution as the limiting distribution of the number of specimens observed in typical species in a future, large sample.
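The contrast between θ log n and θ log 2 can be explored numerically. The sketch below is illustrative only and is not taken from the paper: it assumes the population is generated by the Chinese restaurant process (a standard sequential construction whose partition follows the ESF) and treats consecutive blocks of n specimens as the p disjoint samples. The function names `esf_mean_var` and `species_counts` are hypothetical. The first function uses the known ESF result that the number of species in a sample of size n is a sum of independent Bernoulli variables with success probabilities θ/(θ+j), j = 0, ..., n-1, so its variance grows like θ log n; the second simulates the sequential samples so that the average sample variance across samples can be compared with θ log 2.

```python
import math
import random
from statistics import variance


def esf_mean_var(theta, n):
    """Exact mean and variance of the number of species K_n under the ESF.

    K_n is a sum of independent Bernoulli(theta / (theta + j)) variables,
    j = 0, ..., n - 1; both moments grow like theta * log(n).
    """
    mean = sum(theta / (theta + j) for j in range(n))
    var = sum(theta * j / (theta + j) ** 2 for j in range(n))
    return mean, var


def species_counts(theta, n, p, rng):
    """Simulate p disjoint samples of size n drawn sequentially from one
    population (assumption: Chinese restaurant process with parameter theta)
    and return the number of distinct species seen in each sample."""
    labels = []
    next_label = 0
    for i in range(n * p):
        # With probability theta / (theta + i) the specimen is a new species.
        if rng.random() < theta / (theta + i):
            labels.append(next_label)
            next_label += 1
        else:
            # Otherwise copy the species of a uniformly chosen earlier
            # specimen, which picks a species proportionally to its size.
            labels.append(labels[rng.randrange(i)])
    return [len(set(labels[k * n:(k + 1) * n])) for k in range(p)]


if __name__ == "__main__":
    theta, n, p, reps = 5.0, 2000, 2, 200
    rng = random.Random(1)
    # Monte Carlo estimate of the expected sample variance across p samples.
    avg_s2 = sum(variance(species_counts(theta, n, p, rng))
                 for _ in range(reps)) / reps
    print("simulated E[sample variance]:", avg_s2)
    print("theta * log 2               :", theta * math.log(2))
    print("ESF Var(K_n)                :", esf_mean_var(theta, n)[1])
    print("theta * log n               :", theta * math.log(n))
```

The Monte Carlo estimate is noisy for a small number of replicates, and the abstract's θ log 2 is a large-n limit, so only rough agreement should be expected at these settings.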
About the journal:
BERNOULLI is the journal of the Bernoulli Society for Mathematical Statistics and Probability, issued four times per year. The journal provides a comprehensive account of important developments in the fields of statistics and probability, offering an international forum for both theoretical and applied work.
BERNOULLI will publish:
Papers containing original and significant research contributions: with background, mathematical derivation and discussion of the results in suitable detail and, where appropriate, with discussion of interesting applications in relation to the methodology proposed.
Papers of the following two types will also be considered for publication, provided they are judged to enhance the dissemination of research:
Review papers which provide an integrated critical survey of some area of probability and statistics and discuss important recent developments.
Scholarly papers on some historically significant aspect of statistics and probability.