Sequential Nonparametric K-Medoid Clustering of Data Streams
Sreeram C. Sreenivasan, S. Bhashyam
2022 National Conference on Communications (NCC), published 2022-05-24. DOI: https://doi.org/10.1109/NCC55593.2022.9806794
We study a sequential nonparametric clustering problem to group a finite set of S data streams into K clusters. The data streams are real-valued i.i.d. data sequences generated from unknown continuous distributions. The distributions themselves are organized into clusters according to their proximity to each other based on a certain distance metric. We propose a universal sequential nonparametric clustering test for the case when K is known. We show that the proposed test stops in finite time almost surely and is universally exponentially consistent. We also bound the asymptotic growth rate of the expected stopping time as the probability of error goes to zero. Our results generalize earlier work on sequential nonparametric anomaly detection to the more general sequential nonparametric clustering problem, thereby providing a new test for the case of anomaly detection where the anomalous data streams can follow distinct probability distributions. Simulations show that our proposed sequential clustering test outperforms the corresponding fixed sample size test.
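To make the setting concrete, the following is a minimal sketch of a sequential K-medoid clustering loop. It is not the authors' test. The abstract does not specify the distance metric, the stopping rule, or the threshold, so the Kolmogorov-Smirnov distance, the gap-based stopping condition, and all function names and parameters (`ks_distance`, `k_medoids`, `sequential_cluster`, `threshold`, `n_min`, `step`) are assumptions chosen for illustration.

```python
import itertools

import numpy as np


def ks_distance(x, y):
    """Kolmogorov-Smirnov distance between the empirical CDFs of x and y.

    Assumption: the abstract only says "a certain distance metric".
    """
    grid = np.sort(np.concatenate([x, y]))
    cdf_x = np.searchsorted(np.sort(x), grid, side="right") / len(x)
    cdf_y = np.searchsorted(np.sort(y), grid, side="right") / len(y)
    return float(np.max(np.abs(cdf_x - cdf_y)))


def pairwise_distances(streams):
    """Symmetric S x S matrix of distances between the observed streams."""
    s = len(streams)
    d = np.zeros((s, s))
    for i, j in itertools.combinations(range(s), 2):
        d[i, j] = d[j, i] = ks_distance(streams[i], streams[j])
    return d


def k_medoids(d, k):
    """Exhaustive K-medoid search over all medoid sets (small S only)."""
    best_cost, best_medoids = np.inf, None
    for medoids in itertools.combinations(range(d.shape[0]), k):
        cost = d[:, list(medoids)].min(axis=1).sum()
        if cost < best_cost:
            best_cost, best_medoids = cost, medoids
    labels = d[:, list(best_medoids)].argmin(axis=1)
    return labels, best_medoids


def sequential_cluster(data, k, threshold, n_min=20, step=10):
    """Reveal samples over time and stop once the clusters look separated.

    data has shape (S, n_max) and k must be at least 2. The stopping rule
    (smallest inter-cluster distance minus largest intra-cluster distance
    exceeds `threshold`) is a hypothetical stand-in, not the paper's rule.
    """
    s, n_max = data.shape
    off_diag = ~np.eye(s, dtype=bool)
    for n in range(n_min, n_max + 1, step):
        d = pairwise_distances(data[:, :n])
        labels, _ = k_medoids(d, k)
        same = labels[:, None] == labels[None, :]
        intra_mask = same & off_diag
        intra = d[intra_mask].max() if intra_mask.any() else 0.0
        inter = d[~same].min()
        if inter - intra > threshold:
            return labels, n  # stopped early with n samples per stream
    return labels, n_max  # sample budget exhausted without stopping
```

For example, with four streams, two drawn from N(0, 1) and two from N(3, 1), calling `sequential_cluster(data, k=2, threshold=0.4)` returns the cluster labels together with the number of samples per stream used at the stopping time. A fixed sample size test would instead cluster once at a predetermined n, which is the baseline the abstract compares against.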