A method to simulate multivariate outliers with known Mahalanobis distances for normal and non-normal data

JCR quartile: Q2 (Psychology)
Oscar L. Olvera Astivia
DOI: 10.1016/j.metip.2024.100157
Journal: Methods in Psychology (Online), volume 11, Article 100157
Publication date: 2024-07-30
Publication type: Journal Article
Full text: https://www.sciencedirect.com/science/article/pii/S2590260124000237
Citation count: 0

Abstract

Monte Carlo simulations and theoretical analyses have repeatedly demonstrated the impact of outliers on statistical analysis. Most simulation studies generate outliers using one of two general approaches: by multiplying an arbitrary point by a constant or through a finite mixture. The latter can be extended to multivariate settings by defining the Mahalanobis distance between the centroids of two clusters of points. Nevertheless, when researchers aim to simulate individual data points with population-level Mahalanobis distances, the number of available procedures is very limited. This article generalizes one of the few existing methods to simulate an arbitrary number of outliers in an arbitrary number of dimensions, for both multivariate normal and non-normal data. A small simulation demonstration showcases how this methodology enables new simulation designs that were either unpopular or not possible due to the lack of a data-generating algorithm. A discussion of potential implications highlights the importance of considering multivariate outliers in simulation settings.
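
The abstract does not spell out the article's algorithm, so the following is only a minimal sketch of the underlying idea for the multivariate normal case, not the author's method. It assumes a known population mean `mu` and covariance `sigma` (both hypothetical values chosen for illustration) and places points at an exact population-level Mahalanobis distance `d` by scaling a random unit direction with the Cholesky factor of `sigma`. The function names `mahalanobis` and `point_at_distance` are illustrative, and the non-normal case is not covered.

```python
import numpy as np


def mahalanobis(x, mu, sigma):
    """Population Mahalanobis distance of x from mu under covariance sigma."""
    diff = np.asarray(x, dtype=float) - mu
    # Solve sigma @ y = diff instead of inverting sigma explicitly.
    return float(np.sqrt(diff @ np.linalg.solve(sigma, diff)))


def point_at_distance(mu, sigma, d, rng):
    """Return one point whose Mahalanobis distance from mu equals d."""
    chol = np.linalg.cholesky(sigma)     # sigma = chol @ chol.T
    z = rng.standard_normal(len(mu))     # random direction in p dimensions
    u = z / np.linalg.norm(z)            # rescale to unit Euclidean length
    # (chol @ u)' sigma^{-1} (chol @ u) = u'u = 1, so the distance is d.
    return mu + d * (chol @ u)


rng = np.random.default_rng(0)

# Hypothetical population parameters (assumptions for this example only).
mu = np.zeros(3)
sigma = np.array([[1.0, 0.5, 0.3],
                  [0.5, 1.0, 0.4],
                  [0.3, 0.4, 1.0]])

# 200 regular observations plus 5 outliers at Mahalanobis distance 5.
clean = rng.multivariate_normal(mu, sigma, size=200)
outliers = np.array([point_at_distance(mu, sigma, 5.0, rng) for _ in range(5)])
data = np.vstack([clean, outliers])

# Distances of the planted outliers from the population centroid.
distances = [mahalanobis(x, mu, sigma) for x in outliers]
```

Because each outlier is built as `mu + d * (chol @ u)` with `u` of unit length, its squared distance reduces to `d**2` up to floating-point error. This differs from the two common approaches the abstract mentions: multiplying a point by a constant gives no direct control over the resulting distance, and a finite mixture fixes the distance between cluster centroids rather than that of individual points.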

Source journal: Methods in Psychology (Online)
Subject areas: Experimental and Cognitive Psychology, Clinical Psychology, Developmental and Educational Psychology
CiteScore: 5.50
Self-citation rate: 0.00%
Articles published: 0
Review time: 16 weeks