Penalized Likelihood Approach to Covariance Matrix Estimation From Data With Cell Outliers

IF 4.6; Zone 2, Engineering & Technology; JCR Q1, ENGINEERING, ELECTRICAL & ELECTRONIC
Petre Stoica;Prabhu Babu
IEEE Transactions on Signal Processing, vol. 72, pp. 5616-5627. Published 2024-11-28. DOI: 10.1109/TSP.2024.3507819
https://ieeexplore.ieee.org/document/10771602/
Citations: 0

Abstract

In a recent paper we proposed an approach for estimating the covariance matrix from a multivariate data set $\{\mathbf{y}(t)\}$ that may contain outliers. If $\mathbf{y}(t)$ is flagged as outlying by this approach, then the entire vector $\mathbf{y}(t)$ is considered to contain no useful information and is discarded. However, in some applications the data contain cell outliers; that is, only some of the elements of $\mathbf{y}(t)$ are outlying rather than all of them. One then wants to eliminate only the cell outliers from the data, rather than the entire vector $\mathbf{y}(t)$. In this paper, we propose a penalized maximum likelihood approach to outlier detection and covariance matrix estimation from data with cell outliers. Specifically, for a given estimate of the covariance matrix, we estimate the positions of the outliers in the data set by maximizing the penalized likelihood of the data, with the penalty derived from a property of the likelihood ratio and the false discovery rate (FDR) principle. We alternate this step with a majorization-minimization (MM) technique that estimates the covariance matrix for given outlier positions. MM is more flexible than the expectation-maximization (EM) algorithm commonly used for estimating the covariance matrix from data with missing cells, as MM can be used in cases in which EM is not usable. The closest competitor of our approach is the cellMCD (minimum covariance determinant) method, over which the proposed approach has a number of advantages that are described in the introduction and the numerical study section.
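To make the "covariance matrix for given outlier positions" step more concrete, the sketch below evaluates a Gaussian negative log-likelihood that uses only the cells not flagged as outlying, that is, it treats flagged cells as missing. This is not the authors' algorithm: the abstract specifies neither the penalty nor the MM updates, so neither is shown. The function name `observed_cell_nll`, the boolean `mask` convention, and the zero-mean Gaussian model are all assumptions made for illustration.

```python
import numpy as np


def observed_cell_nll(Y, mask, R):
    """Gaussian negative log-likelihood computed from unflagged cells only.

    Y    : (N, p) array, one row per data vector y(t); assumed zero-mean
           (an assumption of this sketch, not stated in the abstract).
    mask : (N, p) boolean array, True where a cell is flagged as outlying.
    R    : (p, p) candidate covariance matrix (symmetric positive definite).
    """
    total = 0.0
    for y, m in zip(Y, mask):
        obs = ~m                      # cells of y(t) that are kept
        k = int(obs.sum())
        if k == 0:
            continue                  # every cell flagged: no contribution
        R_oo = R[np.ix_(obs, obs)]    # covariance of the kept cells
        y_o = y[obs]
        _, logdet = np.linalg.slogdet(R_oo)
        quad = y_o @ np.linalg.solve(R_oo, y_o)
        total += 0.5 * (k * np.log(2.0 * np.pi) + logdet + quad)
    return total


if __name__ == "__main__":
    # Toy data: the second vector has one grossly outlying cell.
    Y = np.array([[1.0, 2.0], [3.0, 100.0]])
    flagged = np.array([[False, False], [False, True]])
    nothing_flagged = np.zeros_like(flagged)
    R = np.eye(2)
    print("NLL with the cell flagged:", observed_cell_nll(Y, flagged, R))
    print("NLL with no cell flagged: ", observed_cell_nll(Y, nothing_flagged, R))
```

For a fixed `mask`, an objective of this kind is what an EM or MM iteration for data with missing cells would decrease with respect to `R`. Choosing the mask itself is the part the abstract handles with the FDR-based penalty, which this sketch leaves out.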
Source journal
IEEE Transactions on Signal Processing (Engineering & Technology; Engineering: Electrical & Electronic)
CiteScore: 11.20
Self-citation rate: 9.30%
Articles published: 310
Review time: 3.0 months
About the journal: The IEEE Transactions on Signal Processing covers novel theory, algorithms, performance analyses and applications of techniques for the processing, understanding, learning, retrieval, mining, and extraction of information from signals. The term “signal” includes, among others, audio, video, speech, image, communication, geophysical, sonar, radar, medical and musical signals. Examples of topics of interest include, but are not limited to, information processing and the theory and application of filtering, coding, transmitting, estimating, detecting, analyzing, recognizing, synthesizing, recording, and reproducing signals.