Histogram lies about distribution shape and Pearson's coefficient of variation lies about relative variability

Impact factor (IF): 1.3
P. Silveira, J. O. Siqueira
Journal: The Quantitative Methods for Psychology
Published: 2022-03-01
DOI: 10.20982/tqmp.18.1.p091 (https://doi.org/10.20982/tqmp.18.1.p091)
Citations: 2

Abstract

a Department of Pathology (LIM01-HCFMUSP), Medical School, University of Sao Paulo, SP, Brazil
b Department of Legal Medicine, Medical Ethics, Work and Social Medicine, Medical School, University of Sao Paulo, SP, Brazil

Histograms and Pearson’s coefficient of variation are among the most popular summary statistics. Researchers use histograms to judge the shape of a quantitative data distribution by visual inspection, and they take the coefficient of variation as an estimator of the relative variability of the data. We explore the properties of histograms and of the coefficient of variation through examples in R and offer better alternatives: density plots and Eisenhauer’s relative dispersion coefficient. Hypothetical examples developed in R are used to create histograms and density plots and to compute the coefficient of variation and the relative dispersion coefficient. These examples show that both traditional approaches are flawed. Histograms do not necessarily reflect the distribution of probabilities. Pearson’s coefficient of variation is not invariant under linear transformations and is not a measure of relative variability, because it is a ratio between a measure of absolute variability (the standard deviation) and a measure of central position (the mean). Potential alternatives are explained and applied for contrast. With modern computers and the R language it is easy to produce density plots, which can approximate the theoretical probability distribution. In addition, Eisenhauer’s relative dispersion coefficient is suggested as a suitable estimator of relative variability, including sample size correction for its lower and upper bounds.
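The two claims in the abstract can be illustrated with a minimal sketch. It is written in Python with NumPy rather than R, and it is not the authors' code: the sample values, the bin edges, the bandwidth, and the helper names `kde_gaussian` and `coefficient_of_variation` are all assumptions made for illustration. The first part shows one symmetric sample producing two differently skewed histograms when only the bin origin changes, while a simple kernel density estimate stays symmetric. The second part shows that the coefficient of variation is unchanged by pure rescaling but changes as soon as a constant is added.

```python
import numpy as np

# --- 1. Histogram shape depends on where the bins start -------------------
# Hypothetical sample, perfectly symmetric around 3 (not data from the paper).
sample = np.array([1, 2, 2, 3, 3, 3, 4, 4, 5], dtype=float)

# Same bin width (2), two different bin origins.
counts_a, _ = np.histogram(sample, bins=[0.0, 2.0, 4.0, 6.0])
counts_b, _ = np.histogram(sample, bins=[0.5, 2.5, 4.5, 6.5])
print("bins from 0.0:", counts_a)  # heavier on the right
print("bins from 0.5:", counts_b)  # heavier on the left


def kde_gaussian(x, grid, bandwidth):
    """Hand-rolled Gaussian kernel density estimate evaluated on `grid`."""
    z = (grid[:, None] - x[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    return kernel.sum(axis=1) / (len(x) * bandwidth)


# The grid is symmetric around 3; the bandwidth 0.5 is an arbitrary choice.
grid = np.linspace(0.0, 6.0, 61)
density = kde_gaussian(sample, grid, bandwidth=0.5)
print("density symmetric around 3:", np.allclose(density, density[::-1]))


# --- 2. Coefficient of variation under linear transformations -------------
def coefficient_of_variation(x):
    """Pearson's CV: sample standard deviation divided by the sample mean."""
    x = np.asarray(x, dtype=float)
    return x.std(ddof=1) / x.mean()


# Hypothetical temperatures: the same measurements in different units.
celsius = np.array([20.0, 22.0, 25.0, 27.0, 31.0])
doubled = celsius * 2.0                   # scale only: CV unchanged
kelvin = celsius + 273.15                 # shift only: CV changes
fahrenheit = celsius * 9.0 / 5.0 + 32.0   # scale and shift: CV changes

cv_celsius = coefficient_of_variation(celsius)
cv_doubled = coefficient_of_variation(doubled)
cv_kelvin = coefficient_of_variation(kelvin)
cv_fahrenheit = coefficient_of_variation(fahrenheit)

for name, cv in [("Celsius", cv_celsius), ("doubled", cv_doubled),
                 ("Kelvin", cv_kelvin), ("Fahrenheit", cv_fahrenheit)]:
    print(f"CV ({name}): {cv:.4f}")
```

The bin counts flip from `[1, 5, 3]` to `[3, 5, 1]` although the data are identical. The three temperature scales give three different coefficients of variation for the same physical measurements. In practice one would likely use an established density estimator, such as `density()` in R or `scipy.stats.gaussian_kde` in Python, rather than the hand-rolled kernel used here.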