{"title":"Stahel–Donoho均值中值对对抗性腐败和重尾数据的稳健性","authors":"Jules Depersin;Guillaume Lecué","doi":"10.1093/imaiai/iaac026","DOIUrl":null,"url":null,"abstract":"We consider median of means (MOM) versions of the Stahel–Donoho outlyingness (SDO) [23, 66] and of the Median Absolute Deviation (MAD) [30] functions to construct subgaussian estimators of a mean vector under adversarial contamination and heavy-tailed data. We develop a single analysis of the MOM version of the SDO which covers all cases ranging from the Gaussian case to the \n<tex>$L_2$</tex>\n case. It is based on isomorphic and almost isometric properties of the MOM versions of SDO and MAD. This analysis also covers cases where the mean does not even exist but a location parameter does; in those cases we still recover the same subgaussian rates and the same price for adversarial contamination even though there is not even a first moment. These properties are achieved by the classical SDO median and are therefore the first non-asymptotic statistical bounds on the Stahel–Donoho median complementing the \n<tex>$\\sqrt{n}$</tex>\n-consistency [58] and asymptotic normality [74] of the Stahel–Donoho estimators. We also show that the MOM version of MAD can be used to construct an estimator of the covariance matrix only under the existence of a second moment or of a scatter matrix if a second moment does not exist.","PeriodicalId":45437,"journal":{"name":"Information and Inference-A Journal of the Ima","volume":"12 2","pages":"814-850"},"PeriodicalIF":1.4000,"publicationDate":"2022-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"On the robustness to adversarial corruption and to heavy-tailed data of the Stahel–Donoho median of means\",\"authors\":\"Jules Depersin;Guillaume Lecué\",\"doi\":\"10.1093/imaiai/iaac026\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We consider median of means (MOM) versions of the Stahel–Donoho outlyingness (SDO) [23, 66] and of the Median Absolute Deviation (MAD) [30] functions to construct subgaussian estimators of a mean vector under adversarial contamination and heavy-tailed data. We develop a single analysis of the MOM version of the SDO which covers all cases ranging from the Gaussian case to the \\n<tex>$L_2$</tex>\\n case. It is based on isomorphic and almost isometric properties of the MOM versions of SDO and MAD. This analysis also covers cases where the mean does not even exist but a location parameter does; in those cases we still recover the same subgaussian rates and the same price for adversarial contamination even though there is not even a first moment. These properties are achieved by the classical SDO median and are therefore the first non-asymptotic statistical bounds on the Stahel–Donoho median complementing the \\n<tex>$\\\\sqrt{n}$</tex>\\n-consistency [58] and asymptotic normality [74] of the Stahel–Donoho estimators. We also show that the MOM version of MAD can be used to construct an estimator of the covariance matrix only under the existence of a second moment or of a scatter matrix if a second moment does not exist.\",\"PeriodicalId\":45437,\"journal\":{\"name\":\"Information and Inference-A Journal of the Ima\",\"volume\":\"12 2\",\"pages\":\"814-850\"},\"PeriodicalIF\":1.4000,\"publicationDate\":\"2022-08-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Information and Inference-A Journal of the Ima\",\"FirstCategoryId\":\"100\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10058610/\",\"RegionNum\":4,\"RegionCategory\":\"数学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"MATHEMATICS, APPLIED\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information and Inference-A Journal of the Ima","FirstCategoryId":"100","ListUrlMain":"https://ieeexplore.ieee.org/document/10058610/","RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"MATHEMATICS, APPLIED","Score":null,"Total":0}
On the robustness to adversarial corruption and to heavy-tailed data of the Stahel–Donoho median of means
We consider median of means (MOM) versions of the Stahel–Donoho outlyingness (SDO) [23, 66] and of the Median Absolute Deviation (MAD) [30] functions to construct subgaussian estimators of a mean vector under adversarial contamination and heavy-tailed data. We develop a single analysis of the MOM version of the SDO which covers all cases ranging from the Gaussian case to the
$L_2$
case. It is based on isomorphic and almost isometric properties of the MOM versions of SDO and MAD. This analysis also covers cases where the mean does not even exist but a location parameter does; in those cases we still recover the same subgaussian rates and the same price for adversarial contamination even though there is not even a first moment. These properties are achieved by the classical SDO median and are therefore the first non-asymptotic statistical bounds on the Stahel–Donoho median complementing the
$\sqrt{n}$
-consistency [58] and asymptotic normality [74] of the Stahel–Donoho estimators. We also show that the MOM version of MAD can be used to construct an estimator of the covariance matrix only under the existence of a second moment or of a scatter matrix if a second moment does not exist.