On finding skylines in external memory
Cheng Sheng, Yufei Tao
Proceedings of the ... ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pages 107-116
Published: 2011-06-13
DOI: https://doi.org/10.1145/1989284.1989298
Citations: 43
Abstract
We consider the skyline problem (a.k.a. the maxima problem), which has been extensively studied in the database community. The input is a set P of d-dimensional points. A point dominates another if the former has a lower coordinate than the latter on every dimension. The goal is to find the skyline, which is the set of points p ∈ P such that p is not dominated by any other data point. In the external-memory model, the 2-d version of the problem is known to be solvable in $O((N/B)\log_{M/B}(N/B))$ I/Os, where N is the cardinality of P, B the size of a disk block, and M the capacity of main memory. For fixed d ≥ 3, we present an algorithm with I/O-complexity $O((N/B)\log_{M/B}^{d-2}(N/B))$. Previously, the best solution was adapted from an in-memory algorithm, and requires $O((N/B)\log_{2}^{d-2}(N/M))$ I/Os.
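To make the definitions concrete, here is a minimal Python sketch of the dominance test and a quadratic in-memory skyline baseline. It is not the external-memory algorithm the abstract announces, which is not described here. The names `dominates`, `naive_skyline`, and `io_bounds` are hypothetical. `io_bounds` evaluates the two stated bounds with every hidden constant assumed to be 1, so its output only hints at how the two logarithmic factors compare.

```python
from math import log2
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """True if p has a strictly lower coordinate than q on every dimension."""
    return all(a < b for a, b in zip(p, q))


def naive_skyline(points: List[Point]) -> List[Point]:
    """Quadratic in-memory baseline (an illustration, not the paper's method).

    A point never dominates itself under the strict definition above,
    so there is no need to skip the pair (p, p).
    """
    return [p for p in points if not any(dominates(q, p) for q in points)]


def io_bounds(n: int, b: int, m: int, d: int) -> Tuple[float, float]:
    """Evaluate both stated bounds for d >= 3, assuming all constants are 1.

    Returns (new_bound, previous_bound):
      new_bound      = (N/B) * log_{M/B}(N/B) ** (d - 2)
      previous_bound = (N/B) * log_2(N/M) ** (d - 2)
    """
    blocks = n / b
    new_bound = blocks * (log2(blocks) / log2(m / b)) ** (d - 2)
    previous_bound = blocks * log2(n / m) ** (d - 2)
    return new_bound, previous_bound


if __name__ == "__main__":
    pts = [(1, 5), (2, 3), (4, 4), (3, 1), (5, 2)]
    print(naive_skyline(pts))
    # Assumed example sizes: N = 2^30 points, B = 2^10, M = 2^20, d = 3.
    print(io_bounds(2**30, 2**10, 2**20, 3))
```

In the sample input, (4, 4) is dominated by (2, 3) and (5, 2) is dominated by (3, 1), so three points remain. With the assumed sizes, the logarithmic factor is 2 in the new bound and 10 in the previous one. That is consistent with the improvement the abstract claims, but it says nothing about the actual constants.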