Yi Wang, Yi Ding, Xiangjian He, Xin Fan, Chi Lin, Fengqi Li, Tianzhu Wang, Zhongxuan Luo, Jiebo Luo
{"title":"Novelty Detection and Online Learning for Chunk Data Streams.","authors":"Yi Wang, Yi Ding, Xiangjian He, Xin Fan, Chi Lin, Fengqi Li, Tianzhu Wang, Zhongxuan Luo, Jiebo Luo","doi":"10.1109/TPAMI.2020.2965531","DOIUrl":null,"url":null,"abstract":"<p><p>Datastream analysis aims at extracting discriminative information for classification from continuously incoming samples. It is extremely challenging to detect novel data while incrementally updating the model efficiently and stably, especially for high-dimensional and/or large-scale data streams. This paper proposes an efficient framework for novelty detection and incremental learning for unlabeled chunk data streams. First, an accurate factorization-free kernel discriminative analysis (FKDA-X) is put forward through solving a linear system in the kernel space. FKDA-X produces a Reproducing Kernel Hilbert Space (RKHS), in which unlabeled chunk data can be detected and classified by multiple known-classes in a single decision model with a deterministic classification boundary. Moreover, based on FKDA-X, two optimal methods FKDA-CX and FKDA-C are proposed. FKDA-CX uses the micro-cluster centers of original data as the input to achieve excellent performance in novelty detection. FKDA-C and incremental FKDA-C (IFKDA-C) using the class centers of original data as their input have extremely fast speed in online learning. Theoretical analysis and experimental validation on under-sampled and large-scale real-world datasets demonstrate that the proposed algorithms make it possible to learn unlabeled chunk data streams with significantly lower computational costs and comparable accuracies than the state-of-the-art approaches.</p>","PeriodicalId":13426,"journal":{"name":"IEEE Transactions on Pattern Analysis and Machine Intelligence","volume":"43 7","pages":"2400-2412"},"PeriodicalIF":20.8000,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/TPAMI.2020.2965531","citationCount":"6","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Pattern Analysis and Machine Intelligence","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1109/TPAMI.2020.2965531","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2021/6/8 0:00:00","PubModel":"Epub","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 6
Abstract
Datastream analysis aims at extracting discriminative information for classification from continuously incoming samples. It is extremely challenging to detect novel data while incrementally updating the model efficiently and stably, especially for high-dimensional and/or large-scale data streams. This paper proposes an efficient framework for novelty detection and incremental learning for unlabeled chunk data streams. First, an accurate factorization-free kernel discriminative analysis (FKDA-X) is put forward through solving a linear system in the kernel space. FKDA-X produces a Reproducing Kernel Hilbert Space (RKHS), in which unlabeled chunk data can be detected and classified by multiple known-classes in a single decision model with a deterministic classification boundary. Moreover, based on FKDA-X, two optimal methods FKDA-CX and FKDA-C are proposed. FKDA-CX uses the micro-cluster centers of original data as the input to achieve excellent performance in novelty detection. FKDA-C and incremental FKDA-C (IFKDA-C) using the class centers of original data as their input have extremely fast speed in online learning. Theoretical analysis and experimental validation on under-sampled and large-scale real-world datasets demonstrate that the proposed algorithms make it possible to learn unlabeled chunk data streams with significantly lower computational costs and comparable accuracies than the state-of-the-art approaches.
期刊介绍:
The IEEE Transactions on Pattern Analysis and Machine Intelligence publishes articles on all traditional areas of computer vision and image understanding, all traditional areas of pattern analysis and recognition, and selected areas of machine intelligence, with a particular emphasis on machine learning for pattern analysis. Areas such as techniques for visual search, document and handwriting analysis, medical image analysis, video and image sequence analysis, content-based retrieval of image and video, face and gesture recognition and relevant specialized hardware and/or software architectures are also covered.