Authors: Samer Nassar, J. Sander
Venue: 19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)
Published: 2007-07-09
DOI: 10.1109/SSDBM.2007.32 (https://doi.org/10.1109/SSDBM.2007.32)
Effective Summarization of Multi-Dimensional Data Streams for Historical Stream Mining
We consider the following problem: given a very large data stream, limited space in which to encode it, and a technique for compressing it, retain the most important information from the distant past of the stream while also retaining high quality for the compressed information in the recent part of the stream, so that temporal analysis can be performed on the summarized information. Previously proposed simple schemes for accumulating micro-clustering summaries of stream windows are very ineffective at solving this challenging task. We overcome the limitations of these schemes in two ways. First, we identify spatial summaries that compress "similar" regions in the data space, and we reduce their space consumption using novel approximate spatio-temporal summaries. Second, we present policies for effectively utilizing the space budget and managing these novel approximate spatio-temporal summaries.
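To make the setting concrete, the following is a minimal sketch of the kind of simple accumulation scheme the abstract describes as ineffective; it is not the method proposed in the paper. It assumes that each stream window is summarized by a single additive cluster feature (count, linear sum, squared sum), as is common in micro-clustering, and that the space budget is a maximum number of stored window summaries. The names `ClusterFeature`, `summarize_stream`, and `budget` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ClusterFeature:
    """Additive summary of a set of points: count, linear sum, squared sum."""
    n: int
    ls: List[float]
    ss: List[float]

    @classmethod
    def from_points(cls, points: Sequence[Sequence[float]]) -> "ClusterFeature":
        # Assumption: every window contains at least one point.
        dim = len(points[0])
        ls = [sum(p[d] for p in points) for d in range(dim)]
        ss = [sum(p[d] ** 2 for p in points) for d in range(dim)]
        return cls(len(points), ls, ss)

    def merge(self, other: "ClusterFeature") -> "ClusterFeature":
        # Cluster features are additive, so merging is a component-wise sum.
        return ClusterFeature(
            self.n + other.n,
            [a + b for a, b in zip(self.ls, other.ls)],
            [a + b for a, b in zip(self.ss, other.ss)],
        )

    def centroid(self) -> List[float]:
        return [v / self.n for v in self.ls]


def summarize_stream(windows, budget: int) -> List[ClusterFeature]:
    """Keep at most `budget` (>= 1) window summaries, oldest first.

    Hypothetical baseline policy: when the budget is exceeded, merge the
    two oldest summaries, so the distant past gets progressively coarser.
    """
    summaries: List[ClusterFeature] = []
    for points in windows:
        summaries.append(ClusterFeature.from_points(points))
        if len(summaries) > budget:
            summaries[:2] = [summaries[0].merge(summaries[1])]
    return summaries


if __name__ == "__main__":
    windows = [[(0.0, 0.0), (2.0, 2.0)], [(4.0, 4.0)], [(1.0, 3.0)]]
    for cf in summarize_stream(windows, budget=2):
        print(cf.n, cf.centroid())
```

In this sketch the oldest windows collapse into a single coarse summary regardless of where their points lie in the data space, which is the sort of loss that the spatial and approximate spatio-temporal summaries in the abstract are meant to avoid.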