Penalized Likelihood Approach to Covariance Matrix Estimation From Data With Cell Outliers
Authors: Petre Stoica; Prabhu Babu
Journal: IEEE Transactions on Signal Processing, vol. 72, pp. 5616-5627 (published 2024-11-28)
DOI: 10.1109/TSP.2024.3507819
URL: https://ieeexplore.ieee.org/document/10771602/
Impact factor: 4.6 (JCR Q1, Engineering, Electrical & Electronic)
In a recent paper we have proposed an approach for estimating the covariance matrix from a multivariate data set $\{\mathbf{y}(t)\}$ that may contain outliers. If $\mathbf{y}(t)$ is flagged as outlying by this approach, then the entire vector $\mathbf{y}(t)$ is considered to contain no useful information and it is discarded. However, in some applications the data contains cell outliers, that is to say, not all elements of $\mathbf{y}(t)$ are outlying but only some of them. One then wants to eliminate only the cell outliers from the data, rather than the entire vector $\mathbf{y}(t)$. In this paper, we propose a penalized maximum likelihood approach to outlier detection and covariance matrix estimation from data with cell outliers. Specifically, we estimate the positions of the outliers in the data set, for a given estimate of the covariance matrix, by maximizing the penalized likelihood of the data, with the penalty being derived from a property of the likelihood ratio and the false discovery rate (FDR) principle. We alternate this step with a majorization-minimization (MM) technique that estimates the covariance matrix for given outlier positions. The MM is more flexible than the expectation maximization (EM) algorithm commonly used for estimating the covariance matrix from data with missing cells, as the former can be utilized in cases in which the latter is not usable. The closest competitor of our approach is the cellMCD (minimum covariance determinant) method, compared with which the proposed approach has a number of advantages described in the introduction and the numerical study section.
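To make the covariance step "for given outlier positions" concrete, the following is a minimal sketch of the standard EM baseline that the abstract mentions for data with missing cells. It is not the authors' MM algorithm, and it does not implement their penalized-likelihood detection step. It assumes zero-mean Gaussian data and that flagged cells are simply treated as missing; the function name `em_covariance`, the boolean `observed` mask, and the small ridge used for initialization are hypothetical choices made for illustration.

```python
import numpy as np

def em_covariance(Y, observed, n_iter=50):
    """EM estimate of a zero-mean Gaussian covariance with missing cells.

    Y        : (n, p) data matrix, one vector y(t) per row.
    observed : (n, p) boolean mask, True where a cell is kept,
               False where it is flagged and treated as missing.
    Illustrative baseline only; not the MM method of the paper.
    """
    Y = np.asarray(Y, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    n, p = Y.shape
    # Initial guess: flagged cells zeroed, plus a small ridge (assumption)
    # so that the observed sub-blocks are invertible.
    Y0 = np.where(observed, Y, 0.0)
    R = Y0.T @ Y0 / n + 1e-6 * np.eye(p)
    for _ in range(n_iter):
        S = np.zeros((p, p))
        for t in range(n):
            o = np.flatnonzero(observed[t])   # indices of kept cells
            m = np.flatnonzero(~observed[t])  # indices of flagged cells
            y_hat = np.zeros(p)
            y_hat[o] = Y[t, o]
            C = np.zeros((p, p))
            if m.size > 0 and o.size > 0:
                R_om = R[np.ix_(o, m)]
                # B = R_mo R_oo^{-1}: regression of missing on observed cells.
                B = np.linalg.solve(R[np.ix_(o, o)], R_om).T
                y_hat[m] = B @ Y[t, o]
                # Conditional covariance of the missing cells.
                C[np.ix_(m, m)] = R[np.ix_(m, m)] - B @ R_om
            elif m.size > 0:
                # Nothing observed in this row: conditional law is the prior.
                C = R.copy()
            # E-step contribution: E[y y^T | observed cells, current R].
            S += np.outer(y_hat, y_hat) + C
        # M-step: average of the expected outer products.
        R = S / n
    return R
```

With an all-True mask the update reduces to the ordinary sample covariance `Y.T @ Y / n`. In an alternating scheme of the kind described above, the mask would be refreshed by the detection step between calls to such a covariance update.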
About the journal:
The IEEE Transactions on Signal Processing covers novel theory, algorithms, performance analyses and applications of techniques for the processing, understanding, learning, retrieval, mining, and extraction of information from signals. The term “signal” includes, among others, audio, video, speech, image, communication, geophysical, sonar, radar, medical and musical signals. Examples of topics of interest include, but are not limited to, information processing and the theory and application of filtering, coding, transmitting, estimating, detecting, analyzing, recognizing, synthesizing, recording, and reproducing signals.