Stabbing the Sky: Efficient Skyline Computation over Sliding Windows

Xuemin Lin, Yidong Yuan, Wei Wang, Hongjun Lu
Venue: 21st International Conference on Data Engineering (ICDE'05)
Publication date: 2005-04-05
DOI: 10.1109/ICDE.2005.137 (https://doi.org/10.1109/ICDE.2005.137)
Citations: 290

Abstract

We consider the problem of efficiently computing the skyline over the most recent N elements of a data stream. Specifically, we study n-of-N skyline queries, that is, computing the skyline of the most recent n elements for any n ≤ N. First, we develop an effective pruning technique that minimizes the number of elements to be kept. It can be shown that, on average, storing only O(log^d N) of the most recent N elements is sufficient to support the exact computation of all n-of-N skyline queries in a d-dimensional space if the data distribution on each dimension is independent. Second, we propose a novel encoding scheme for the stored elements, together with efficient update techniques, so that computing an n-of-N skyline query in a d-dimensional space takes O(log N + s) time, reduced to O(d log log N + s) if the data distribution is independent, where s is the number of skyline points. Third, we provide a novel trigger-based technique for processing continuous n-of-N skyline queries, with O(δ) time to update the current result per new data element and O(log s) time to update the trigger list per result change, where δ is the number of element changes from the current result to the new result. Finally, we extend our techniques to computing the skyline over an arbitrary window within the most recent N elements. Besides the theoretical performance guarantees, our extensive experiments demonstrate that the new techniques can support online skyline query computation over very rapid data streams.
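To make the n-of-N query concrete, the following is a minimal Python sketch, not the encoding scheme or trigger technique of the paper. It assumes that smaller values are better on every dimension and uses one simple, provably safe pruning rule: an element dominated by a younger element can never appear in any n-of-N skyline, because every window that contains it also contains its younger dominator. The names `dominates`, `SlidingSkyline`, `insert`, and `query` are hypothetical, and the query here is a quadratic scan over the kept elements rather than the O(log N + s) method stated in the abstract.

```python
from collections import deque
from typing import Deque, List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every dimension and strictly
    better on at least one (assumption: smaller is better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


class SlidingSkyline:
    """Naive n-of-N skyline over the most recent N stream elements."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity  # N, the maximum window size
        self.seen = 0             # number of elements that have arrived so far
        # Kept (non-redundant) elements as (arrival index, point), oldest first.
        self.kept: Deque[Tuple[int, Point]] = deque()

    def insert(self, point: Point) -> None:
        self.seen += 1
        # Expire elements that are no longer among the most recent N.
        while self.kept and self.kept[0][0] <= self.seen - self.capacity:
            self.kept.popleft()
        # Prune older elements dominated by the new (younger) element.
        self.kept = deque((i, p) for i, p in self.kept if not dominates(point, p))
        self.kept.append((self.seen, point))

    def query(self, n: int) -> List[Point]:
        """Skyline of the most recent n elements, for 1 <= n <= N."""
        if not 1 <= n <= self.capacity:
            raise ValueError("n must satisfy 1 <= n <= N")
        cutoff = self.seen - n  # arrival indices greater than cutoff are in the window
        window = [(i, p) for i, p in self.kept if i > cutoff]
        return [p for _, p in window
                if not any(dominates(q, p) for _, q in window)]


if __name__ == "__main__":
    sky = SlidingSkyline(capacity=4)
    for pt in [(3, 3), (1, 4), (2, 2), (4, 1), (5, 5)]:
        sky.insert(pt)
    print(sky.query(4))  # [(1, 4), (2, 2), (4, 1)]
    print(sky.query(2))  # [(4, 1)]
```

In the usage example, (3, 3) is discarded as soon as the younger (2, 2) arrives, so a single kept set answers every n from 1 to N without rescanning the raw stream.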