Effective Summarization of Multi-Dimensional Data Streams for Historical Stream Mining

Samer Nassar, J. Sander
19th International Conference on Scientific and Statistical Database Management (SSDBM 2007), published 2007-07-09
DOI: 10.1109/SSDBM.2007.32 (https://doi.org/10.1109/SSDBM.2007.32)
Citations: 9

Abstract

We consider the following problem: given a very large data stream, a limited amount of space in which to encode it, and a technique for compressing it, retain the most important information from the distant past of the stream while also retaining high quality in the compressed information from the recent part of the stream, so that temporal analysis can be performed on the summarized information. Previously proposed simple schemes that accumulate micro-clustering summaries of stream windows are very ineffective for this challenging task. We overcome the limitations of these schemes in two ways. First, we identify spatial summaries that compress "similar" regions in the data space and reduce their space consumption using novel approximate spatio-temporal summaries. Second, we present policies for effectively utilizing the space budget and managing these novel approximate spatio-temporal summaries.
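For readers unfamiliar with micro-clustering summaries, the sketch below may help. It is not the authors' method: it assumes a generic additive cluster-feature summary (point count, per-dimension linear sum, per-dimension sum of squares), and the names `MicroCluster` and `enforce_budget`, as well as the "merge the two closest centroids" policy, are hypothetical choices made only for illustration. It shows why such summaries are attractive under a space budget: two summaries can be combined without revisiting the raw points.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class MicroCluster:
    """Additive summary of a group of points (illustrative, not from the paper)."""
    n: int            # number of points summarized
    ls: List[float]   # per-dimension sum of coordinates
    ss: List[float]   # per-dimension sum of squared coordinates

    @classmethod
    def from_point(cls, p: Sequence[float]) -> "MicroCluster":
        return cls(1, [float(x) for x in p], [float(x) ** 2 for x in p])

    def centroid(self) -> List[float]:
        return [s / self.n for s in self.ls]

    def merge(self, other: "MicroCluster") -> "MicroCluster":
        # Additivity: the summary of a union is the component-wise sum.
        return MicroCluster(
            self.n + other.n,
            [a + b for a, b in zip(self.ls, other.ls)],
            [a + b for a, b in zip(self.ss, other.ss)],
        )


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def enforce_budget(summaries: List[MicroCluster], budget: int) -> List[MicroCluster]:
    """Hypothetical policy: merge the two closest centroids until the budget is met."""
    if budget < 1:
        raise ValueError("budget must be at least 1")
    result = list(summaries)
    while len(result) > budget:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(result)):
            for j in range(i + 1, len(result)):
                d = sq_dist(result[i].centroid(), result[j].centroid())
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = result[i].merge(result[j])
        result = [m for k, m in enumerate(result) if k not in (i, j)]
        result.append(merged)
    return result


if __name__ == "__main__":
    points = [(0.0, 0.0), (0.1, 0.0), (10.0, 10.0)]
    kept = enforce_budget([MicroCluster.from_point(p) for p in points], budget=2)
    for mc in kept:
        print(mc.n, mc.centroid())
```

The quadratic pair search keeps the sketch short; it says nothing about how the paper's spatio-temporal summaries or space-management policies actually work.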