Poly H. da Silva, Arash Jamshidpey, P. McCullagh, S. Tavaré
DOI: https://doi.org/10.3150/22-bej1494
Published: 2023-05-01
Fisher’s measure of variability in repeated samples
Fisher (1943) claimed that the expected value of the sample variance of the number of species found in large samples, each of n specimens taken from the same population, is asymptotically θ log 2. This is at odds with the value θ log n obtained directly from the Ewens Sampling Formula (ESF), where θ specifies the rate at which new species are found. To resolve this apparent contradiction, we assume the species frequency spectrum in the population is determined by the ESF and that the samples are disjoint subsets drawn sequentially from this single population. We find an explicit formula for the required expected value for p samples of arbitrary size; in the limit of large equally-sized samples, it indeed has the value θ log 2. We obtain limit theorems for the sample variance of p samples of size n under various limiting regimes as p, n or both tend to ∞. We discuss further the behavior of the number of species present in all samples, and revisit Fisher’s log-series distribution as the limiting distribution of the number of specimens observed in typical species in a future, large sample.
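The contrast between θ log n and θ log 2 can be explored numerically. The following is a minimal simulation sketch, not the authors' method: it assumes the pooled collection of p·n specimens can be generated by a Chinese restaurant process (Hoppe's urn) with parameter θ, and that consecutive blocks of n specimens stand in for the disjoint, sequentially drawn samples. All function names are hypothetical and chosen for illustration only.

```python
import math
import random
import statistics


def expected_species(theta: float, n: int) -> float:
    """E[K_n] for one ESF sample of size n: sum of theta / (theta + i), i = 0..n-1.

    For large n this grows like theta * log(n).
    """
    return sum(theta / (theta + i) for i in range(n))


def crp_labels(theta: float, total: int, rng: random.Random) -> list[int]:
    """Species label of each specimen, generated by Hoppe's urn (assumed model)."""
    labels: list[int] = []
    n_species = 0
    for i in range(total):
        # Specimen i starts a new species with probability theta / (theta + i).
        if rng.random() < theta / (theta + i):
            labels.append(n_species)
            n_species += 1
        else:
            # Otherwise it copies the species of a uniformly chosen earlier specimen.
            labels.append(labels[rng.randrange(i)])
    return labels


def species_per_sample(labels: list[int], p: int, n: int) -> list[int]:
    """Number of distinct species in each of p consecutive blocks of size n."""
    return [len(set(labels[k * n:(k + 1) * n])) for k in range(p)]


def mean_sample_variance(theta: float, p: int, n: int, reps: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the expected sample variance of the p species counts."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        counts = species_per_sample(crp_labels(theta, p * n, rng), p, n)
        total += statistics.variance(counts)  # divides by p - 1
    return total / reps


if __name__ == "__main__":
    theta, p, n = 2.0, 2, 500
    print("Monte Carlo estimate:", mean_sample_variance(theta, p, n, reps=200))
    print("theta * log(2):      ", theta * math.log(2))
    print("theta * log(n):      ", theta * math.log(n))
```

Under these assumptions, the estimate for moderately large n can be compared against both candidate values. The agreement is only approximate at finite n and with a finite number of replicates.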
About the journal:
Accounts of Chemical Research presents short, concise and critical articles offering easy-to-read overviews of basic research and applications in all areas of chemistry and biochemistry. These short reviews focus on research from the author’s own laboratory and are designed to teach the reader about a research project. In addition, Accounts of Chemical Research publishes commentaries that give an informed opinion on a current research problem. Special Issues online are devoted to a single topic of unusual activity and significance.
Accounts of Chemical Research replaces the traditional article abstract with an article "Conspectus." These entries synopsize the research, affording the reader a closer look at the content and significance of an article. By providing a more detailed description of the article contents, the Conspectus enhances the article's discoverability by search engines and the exposure of the research.