Investigating the ecological fallacy through sampling distributions constructed from finite populations

Pub Date: 2024-08-08  DOI: 10.1515/mcma-2024-2013
David J. Torres, Damain Rouson

Abstract

Correlation coefficients and linear regression values computed from group averages can differ from those computed from individual scores. This observation, known as the ecological fallacy, often assumes that all individual scores in a population are available. In many situations, one must instead work with a sample drawn from the larger population. In such cases, the computed correlation coefficient and linear regression values depend on the sample chosen and on the underlying sampling distribution. For normally distributed variables and random samples drawn from infinitely large continuous distributions, the sampling distribution of correlation coefficients and linear regression values for group averages is identical to the sampling distribution for individuals. In practice, however, data are often acquired by sampling without replacement from a finite population. Our objective is to demonstrate through Monte Carlo simulations that the sampling distributions for correlation and linear regression are also similar for individuals and group averages when sampling without replacement from normally distributed variables. These simulations suggest that, when a random sample is selected from a population, correlation coefficients and linear regression values computed from individual scores are no more accurate in estimating the population values than those computed from group averages, as long as the sample size is the same.
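The kind of Monte Carlo comparison described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' procedure: the function name `simulate`, the population size, group size, sample size, target correlation `rho`, and the choice to form groups by random assignment are all assumptions made here for the example. It covers the correlation coefficient only; a regression slope could be tracked the same way.

```python
import numpy as np


def simulate(pop_size=20_000, group_size=20, sample_size=30,
             rho=0.5, trials=2_000, seed=0):
    """Compare sampling distributions of Pearson's r for individuals
    and for group averages, sampling without replacement from one
    finite population. All parameter values are illustrative assumptions.
    """
    if pop_size % group_size != 0:
        raise ValueError("pop_size must be a multiple of group_size")

    rng = np.random.default_rng(seed)

    # Finite population: bivariate normal scores with target correlation rho.
    cov = [[1.0, rho], [rho, 1.0]]
    population = rng.multivariate_normal([0.0, 0.0], cov, size=pop_size)

    # Assumption: groups are formed at random. The rows are independent
    # draws, so consecutive blocks of rows are random groups.
    group_means = population.reshape(-1, group_size, 2).mean(axis=1)
    n_groups = group_means.shape[0]
    if sample_size > n_groups:
        raise ValueError("sample_size cannot exceed the number of groups")

    r_individuals = np.empty(trials)
    r_groups = np.empty(trials)
    for t in range(trials):
        # Sample individuals without replacement and compute r.
        idx = rng.choice(pop_size, size=sample_size, replace=False)
        r_individuals[t] = np.corrcoef(population[idx, 0],
                                       population[idx, 1])[0, 1]

        # Sample the same number of group averages without replacement.
        gidx = rng.choice(n_groups, size=sample_size, replace=False)
        r_groups[t] = np.corrcoef(group_means[gidx, 0],
                                  group_means[gidx, 1])[0, 1]

    return r_individuals, r_groups


if __name__ == "__main__":
    r_ind, r_grp = simulate()
    print("individuals: mean r = %.3f, sd = %.3f" % (r_ind.mean(), r_ind.std()))
    print("group means: mean r = %.3f, sd = %.3f" % (r_grp.mean(), r_grp.std()))
```

Comparing the mean and spread of the two arrays (or their histograms) gives a rough picture of whether the two sampling distributions are similar at equal sample size. If groups were instead formed by some variable related to the scores, the group-level correlation could differ markedly from the individual-level one, so the random-grouping assumption matters for how the output should be read.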