An empirical study on sample size for the central limit theorem using Japanese firm data
Kosei Fukuda. Teaching Statistics, published 2024-07-01. https://doi.org/10.1111/test.12378
In statistics classes, the central limit theorem is commonly demonstrated with simulation-based illustrations, in which known population distributions such as the uniform or exponential are used to examine the behavior of the sample mean in simulated samples. In contrast, this study implements a number of real-data-based simulations in which the populations are empirical distributions of data selected from Japanese firms. The dataset contains 38 variables familiar to business students, such as sales and assets, and the largest population size among the variables is 2243. One thousand samples are drawn with replacement for specific variable–sample size combinations. Hypothesis tests reject normality of the sample mean for 31 variables at the 0.1% level even with a sample size of 500. Inspecting the data for these variables shows that this result should not be a surprise, which underscores the importance of looking at the data.
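The resampling procedure described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the author's code. The firm data are not reproduced here, so a synthetic, heavily right-skewed lognormal population of size 2243 stands in for a variable such as sales. The abstract does not name the normality test used, so the Jarque-Bera test is swapped in because it can be computed with the standard library alone. The function names `sample_means` and `jarque_bera` and the sample sizes 30 and 100 are hypothetical choices made for this sketch.

```python
import math
import random
import statistics


def sample_means(population, n, reps=1000, rng=None):
    """Draw `reps` samples of size `n` with replacement; return their means."""
    rng = rng or random.Random()
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(reps)]


def jarque_bera(xs):
    """Return (JB statistic, asymptotic p-value) for the values in `xs`."""
    m = len(xs)
    mean = statistics.fmean(xs)
    devs = [x - mean for x in xs]
    m2 = sum(d ** 2 for d in devs) / m
    m3 = sum(d ** 3 for d in devs) / m
    m4 = sum(d ** 4 for d in devs) / m
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0
    jb = m / 6.0 * (skew ** 2 + excess_kurt ** 2 / 4.0)
    # Under normality, JB is asymptotically chi-square with 2 degrees of
    # freedom, whose survival function is exp(-x / 2).
    return jb, math.exp(-jb / 2.0)


rng = random.Random(0)
# Assumption: synthetic stand-in population, NOT the Japanese firm data.
population = [rng.lognormvariate(0.0, 2.5) for _ in range(2243)]

for n in (30, 100, 500):
    means = sample_means(population, n, reps=1000, rng=rng)
    jb, p = jarque_bera(means)
    print(f"n={n:4d}  JB={jb:10.1f}  p={p:.3g}  reject at 0.1%: {p < 0.001}")
```

Plotting a histogram of `population` before running the tests is in the spirit of the abstract's closing point: a long right tail in the raw data already suggests that the sample mean may need a large sample size before it looks normal.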