{"title":"普遍性约束下的序贯离群假设检验","authors":"Jun Diao;Lin Zhou","doi":"10.1109/TIT.2025.3583065","DOIUrl":null,"url":null,"abstract":"We revisit sequential outlier hypothesis testing and derive bounds on achievable exponents when both the nominal and anomalous distributions are <italic>unknown</i>. The task of outlier hypothesis testing is to identify the set of outliers that are generated from an anomalous distribution among all observed sequences where the rest majority are generated from a nominal distribution. In the sequential setting, one obtains a symbol from each sequence per unit time until a reliable decision could be made. For the case with exactly one outlier, our exponent bounds are tight, providing exact large deviations characterization of sequential tests and strengthening a previous result of Li et al. (2017). In particular, the average sample size of our sequential test is bounded universally under any pair of nominal and anomalous distributions and our sequential test achieves larger Bayesian exponent than the fixed-length test, which could not be guaranteed by the sequential test of Li et al. (2017). For the case with at most one outlier, we propose a threshold-based test that has bounded expected stopping time under mild conditions and we bound the exponential decay rate of error probabilities, a.k.a., error exponents, under each non-null hypothesis and the null hypothesis. Our sequential test resolves the tradeoff among the exponential decay rates of misclassification, false reject and false alarm probabilities for the fixed-length test of Zhou et al. (2022). Finally, with a further step towards practical applications, we generalize our results to the cases of multiple outliers and show that there is a penalty in the error exponents when the number of outliers is unknown.","PeriodicalId":13494,"journal":{"name":"IEEE Transactions on Information Theory","volume":"71 9","pages":"6602-6625"},"PeriodicalIF":2.9000,"publicationDate":"2025-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Sequential Outlier Hypothesis Testing Under Universality Constraints\",\"authors\":\"Jun Diao;Lin Zhou\",\"doi\":\"10.1109/TIT.2025.3583065\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We revisit sequential outlier hypothesis testing and derive bounds on achievable exponents when both the nominal and anomalous distributions are <italic>unknown</i>. The task of outlier hypothesis testing is to identify the set of outliers that are generated from an anomalous distribution among all observed sequences where the rest majority are generated from a nominal distribution. In the sequential setting, one obtains a symbol from each sequence per unit time until a reliable decision could be made. For the case with exactly one outlier, our exponent bounds are tight, providing exact large deviations characterization of sequential tests and strengthening a previous result of Li et al. (2017). In particular, the average sample size of our sequential test is bounded universally under any pair of nominal and anomalous distributions and our sequential test achieves larger Bayesian exponent than the fixed-length test, which could not be guaranteed by the sequential test of Li et al. (2017). For the case with at most one outlier, we propose a threshold-based test that has bounded expected stopping time under mild conditions and we bound the exponential decay rate of error probabilities, a.k.a., error exponents, under each non-null hypothesis and the null hypothesis. Our sequential test resolves the tradeoff among the exponential decay rates of misclassification, false reject and false alarm probabilities for the fixed-length test of Zhou et al. (2022). Finally, with a further step towards practical applications, we generalize our results to the cases of multiple outliers and show that there is a penalty in the error exponents when the number of outliers is unknown.\",\"PeriodicalId\":13494,\"journal\":{\"name\":\"IEEE Transactions on Information Theory\",\"volume\":\"71 9\",\"pages\":\"6602-6625\"},\"PeriodicalIF\":2.9000,\"publicationDate\":\"2025-06-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Information Theory\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/11051023/\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Information Theory","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/11051023/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Sequential Outlier Hypothesis Testing Under Universality Constraints
We revisit sequential outlier hypothesis testing and derive bounds on achievable exponents when both the nominal and anomalous distributions are unknown. The task of outlier hypothesis testing is to identify the set of outliers that are generated from an anomalous distribution among all observed sequences where the rest majority are generated from a nominal distribution. In the sequential setting, one obtains a symbol from each sequence per unit time until a reliable decision could be made. For the case with exactly one outlier, our exponent bounds are tight, providing exact large deviations characterization of sequential tests and strengthening a previous result of Li et al. (2017). In particular, the average sample size of our sequential test is bounded universally under any pair of nominal and anomalous distributions and our sequential test achieves larger Bayesian exponent than the fixed-length test, which could not be guaranteed by the sequential test of Li et al. (2017). For the case with at most one outlier, we propose a threshold-based test that has bounded expected stopping time under mild conditions and we bound the exponential decay rate of error probabilities, a.k.a., error exponents, under each non-null hypothesis and the null hypothesis. Our sequential test resolves the tradeoff among the exponential decay rates of misclassification, false reject and false alarm probabilities for the fixed-length test of Zhou et al. (2022). Finally, with a further step towards practical applications, we generalize our results to the cases of multiple outliers and show that there is a penalty in the error exponents when the number of outliers is unknown.
期刊介绍:
The IEEE Transactions on Information Theory is a journal that publishes theoretical and experimental papers concerned with the transmission, processing, and utilization of information. The boundaries of acceptable subject matter are intentionally not sharply delimited. Rather, it is hoped that as the focus of research activity changes, a flexible policy will permit this Transactions to follow suit. Current appropriate topics are best reflected by recent Tables of Contents; they are summarized in the titles of editorial areas that appear on the inside front cover.