An MDL-based change-detection algorithm with its applications to learning piecewise stationary memoryless sources
Hiroki Kanazawa, K. Yamanishi
2012 IEEE Information Theory Workshop, published 2012-09-01
DOI: 10.1109/ITW.2012.6404736 (https://doi.org/10.1109/ITW.2012.6404736)
Citation count: 0
Abstract
Kleinberg proposed an algorithm for detecting bursts in a data sequence, which has proved effective in data-mining tasks such as topic detection and change detection. In this paper we extend Kleinberg's algorithm in an information-theoretic fashion to obtain a new class of algorithms, and we apply it to learning piecewise stationary memoryless sources (PSMSs). The proposed algorithm rests on two key ideas: 1) the parameter space is discretized so that the discretization scale depends on the Fisher information, and 2) the optimal path over the discretized parameter space is computed efficiently by dynamic programming so that the sum of the data and parameter description lengths is minimized, in accordance with the MDL principle. We prove that an upper bound on the total code length of the proposed algorithm asymptotically matches Merhav's lower bound.
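The two ingredients named in the abstract can be illustrated with a minimal sketch for a Bernoulli source. This is not the authors' implementation, and every name in it (`fisher_grid`, `mdl_segment`, `switch_bits`) is hypothetical. It assumes a grid that is uniform in arcsin(sqrt(theta)), so that the spacing in theta scales as 1/sqrt(I(theta)) with I(theta) = 1/(theta(1 - theta)), and a flat parameter cost of log2(n) + log2(k) bits per change, which is an assumed stand-in for the paper's parameter description length.

```python
import math


def fisher_grid(k):
    """Return k Bernoulli parameters equally spaced in phi = arcsin(sqrt(theta)).

    Equal steps in phi give steps in theta proportional to
    sqrt(theta * (1 - theta)), i.e. to 1 / sqrt(Fisher information).
    """
    return [math.sin((j + 0.5) * math.pi / (2 * k)) ** 2 for j in range(k)]


def mdl_segment(xs, k=8, switch_bits=None):
    """Viterbi-style search for the minimum total code length path.

    xs is a non-empty sequence of 0/1 symbols. The total code length is
    log2(k) bits for the first parameter, plus the data bits, plus
    switch_bits for every parameter change (an assumed cost).
    Returns (total_bits, per-symbol parameters, change positions).
    """
    n = len(xs)
    grid = fisher_grid(k)
    if switch_bits is None:
        switch_bits = math.log2(n) + math.log2(k)  # assumed cost per change

    def data_bits(x, theta):
        return -math.log2(theta if x == 1 else 1.0 - theta)

    # cost[j]: shortest code length of the prefix seen so far, ending in state j
    cost = [math.log2(k) + data_bits(xs[0], th) for th in grid]
    back = []
    for x in xs[1:]:
        best_prev = min(range(k), key=cost.__getitem__)
        jump = cost[best_prev] + switch_bits
        new_cost, ptr = [], []
        for j, th in enumerate(grid):
            if cost[j] <= jump:  # staying in state j is at least as cheap
                new_cost.append(cost[j] + data_bits(x, th))
                ptr.append(j)
            else:  # pay for a change from the cheapest previous state
                new_cost.append(jump + data_bits(x, th))
                ptr.append(best_prev)
        cost = new_cost
        back.append(ptr)

    # Trace the optimal path backwards from the cheapest final state.
    j = min(range(k), key=cost.__getitem__)
    total = cost[j]
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    path.reverse()
    changes = [t for t in range(1, n) if path[t] != path[t - 1]]
    return total, [grid[j] for j in path], changes


if __name__ == "__main__":
    total_bits, thetas, changes = mdl_segment([0] * 50 + [1] * 50)
    print(total_bits, changes)
```

Because every change costs the same number of bits regardless of the target state, each step only needs the single cheapest previous state, so the search runs in O(nk) time rather than O(nk^2). A cost that depends on the pair of states, as in Kleinberg's original burst automaton, would need the full pairwise minimization.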