Authors: Jesus E Delgado, Jed T Elison, Nathaniel E Helwig
Journal: Psychological Methods (impact factor 7.6; JCR Q1, Psychology, Multidisciplinary)
Publication date: 2025-07-21
Publication type: Journal Article
DOI: 10.1037/met0000775 (https://doi.org/10.1037/met0000775)
Citations: 0
Robust detection of signed outliers in multivariate data with applications to early identification of risk for autism.
Abstract
This article proposes an approach for detecting multivariate outliers that combines robust estimation methods with signed detection information. Our method uses the Mahalanobis distance to quantify each observation's extremeness from the expected value relative to the covariance matrix, and we leverage robust estimation tools, i.e., the minimum covariance determinant, to estimate the mean vector and covariance matrix used in the Mahalanobis distance calculation. Furthermore, we incorporate a signing element into the distance calculation to give researchers greater control over the specific regions of multivariate space that should be prioritized when searching for outliers, which allows for more targeted risk assessment and classification. Lastly, we unify the robust and signed elements into a framework that can be used within bilinear models such as principal components analysis and factor analysis. Using simulated and real data examples, we demonstrate that the proposed approach can result in improved risk assessment and outlier detection, particularly when the sample is contaminated with a moderate-to-large number of outliers that have noteworthy contamination strengths. Overall, our results show that making use of a robust method when assessing multivariate risk leads to more accurate estimates, particularly when combined with relevant signing information. (PsycInfo Database Record (c) 2025 APA, all rights reserved).
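As a rough illustration of the ingredients named in the abstract, the sketch below computes Mahalanobis distances, d(x) = sqrt((x - m)' S^{-1} (x - m)), using a mean m and covariance S taken from a brute-force minimum covariance determinant (MCD) search, then attaches a sign to each distance. Several parts are assumptions made only for this example and are not taken from the article: the toy data, the subset size `h`, the helper names, and in particular the signing rule (the sign of the deviation's projection onto a researcher-chosen `direction`), which is a hypothetical stand-in for the authors' signing element. The MCD here is the raw estimator found by exhaustive search, with no consistency correction or reweighting, so it is only practical for tiny samples.

```python
from itertools import combinations
from math import sqrt


def mean_cov_2d(points):
    """Sample mean and covariance (n - 1 denominator) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def det(cov):
    """Determinant of the symmetric 2x2 matrix [[sxx, sxy], [sxy, syy]]."""
    sxx, sxy, syy = cov
    return sxx * syy - sxy ** 2


def raw_mcd_2d(points, h):
    """Raw MCD by exhaustive search: mean and covariance of the h-subset
    whose sample covariance matrix has the smallest determinant."""
    best = min(combinations(points, h), key=lambda s: det(mean_cov_2d(s)[1]))
    return mean_cov_2d(best)


def mahalanobis_2d(p, mean, cov):
    """Mahalanobis distance of point p from mean, relative to cov."""
    sxx, sxy, syy = cov
    dx, dy = p[0] - mean[0], p[1] - mean[1]
    # Quadratic form with the closed-form inverse of a 2x2 matrix.
    q = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det(cov)
    return sqrt(q)


def signed_distance(p, mean, cov, direction):
    """HYPOTHETICAL signing rule (not the article's definition): the distance
    is positive when the deviation from the mean points along `direction`
    and negative otherwise."""
    proj = direction[0] * (p[0] - mean[0]) + direction[1] * (p[1] - mean[1])
    d = mahalanobis_2d(p, mean, cov)
    return d if proj >= 0 else -d


# Toy data (invented for illustration): six points near the line y = x,
# plus two contaminating points far from it.
clean = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.2), (3.0, 2.8), (4.0, 4.1), (5.0, 5.0)]
outliers = [(9.0, 0.0), (0.0, 9.0)]
data = clean + outliers

classical_mean, classical_cov = mean_cov_2d(data)
robust_mean, robust_cov = raw_mcd_2d(data, h=6)

# Assumed direction of interest: high on the first variable, low on the second.
direction = (1.0, -1.0)

for p in outliers:
    print(
        p,
        "classical:", round(mahalanobis_2d(p, classical_mean, classical_cov), 2),
        "robust signed:", round(signed_distance(p, robust_mean, robust_cov, direction), 2),
    )
```

With the classical estimates, the two contaminating points inflate the covariance matrix and mask themselves, whereas the MCD-based distances stay large; the sign then separates the point lying in the assumed region of interest from the one on the opposite side. For real analyses an established implementation such as scikit-learn's `MinCovDet` would be a more sensible starting point than this exhaustive search.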
About the journal:
Psychological Methods is devoted to the development and dissemination of methods for collecting, analyzing, understanding, and interpreting psychological data. Its purpose is the dissemination of innovations in research design, measurement, methodology, and quantitative and qualitative analysis to the psychological community; its further purpose is to promote effective communication about related substantive and methodological issues. The audience is expected to be diverse and to include those who develop new procedures, those who are responsible for undergraduate and graduate training in design, measurement, and statistics, as well as those who employ those procedures in research.