Efficient Bidirectional Order Dependency Discovery
Yifeng Jin, Lin Zhu, Zijing Tan
2020 IEEE 36th International Conference on Data Engineering (ICDE), pp. 61-72, April 2020
DOI: https://doi.org/10.1109/ICDE48307.2020.00013
Bidirectional order dependencies state relationships of order between lists of attributes, where each attribute may be sorted in ascending or descending direction. They naturally model the order-by clauses in SQL queries and have proven effective in query optimizations that involve sorting. Despite their importance, the order dependencies that hold on a dataset are typically unknown and are too costly, if not impossible, to design or discover manually. Techniques for automatic order dependency discovery have recently been studied. Scaling order dependency discovery is challenging, since the problem is by nature factorial in the number m of attributes and quadratic in the number n of tuples. In this paper, we adopt a strategy that decouples the impact of m from that of n and that still finds all minimal valid bidirectional order dependencies. We present carefully designed data structures, together with a host of algorithms and optimizations, for efficient order dependency discovery. With extensive experimental studies on both real-life and synthetic datasets, we verify that our approach outperforms state-of-the-art techniques by orders of magnitude.
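To make the two cost factors named in the abstract concrete, the following is a minimal Python sketch of a naive validity check. It is not the algorithm proposed in the paper. It assumes the definition commonly used in the order dependency literature: a list X orders a list Y if, for every pair of tuples s and t, s preceding or tying with t under X implies the same under Y. The names `precedes_or_equal`, `od_holds`, `count_attribute_lists`, and the sample `rows` are hypothetical and chosen for illustration only.

```python
from itertools import combinations
from math import perm

# A marked attribute is a (column, direction) pair; direction is "asc" or "desc".
# Hypothetical sample data, used only to exercise the functions below.
rows = [
    {"salary": 3000, "tax": 300, "rank": 3},
    {"salary": 5000, "tax": 600, "rank": 2},
    {"salary": 5000, "tax": 600, "rank": 2},
    {"salary": 8000, "tax": 1200, "rank": 1},
]

def precedes_or_equal(s, t, marked_attrs):
    """True if s comes before, or ties with, t in the lexicographic
    order induced by the list of marked attributes."""
    for col, direction in marked_attrs:
        if s[col] == t[col]:
            continue  # tie on this attribute, so look at the next one
        less = s[col] < t[col]
        return less if direction == "asc" else not less
    return True  # s and t tie on every attribute in the list

def od_holds(rows, lhs, rhs):
    """Naive check that lhs orders rhs: compares every pair of tuples
    in both directions, hence quadratic in the number n of tuples."""
    for s, t in combinations(rows, 2):
        for a, b in ((s, t), (t, s)):
            if precedes_or_equal(a, b, lhs) and not precedes_or_equal(a, b, rhs):
                return False
    return True

def count_attribute_lists(m):
    """Number of non-empty lists of distinct attributes over m attributes,
    ignoring sort directions: the sum of m!/(m-k)! for k = 1..m."""
    return sum(perm(m, k) for k in range(1, m + 1))

print(od_holds(rows, [("salary", "asc")], [("tax", "asc")]))    # True
print(od_holds(rows, [("salary", "asc")], [("rank", "desc")]))  # True
print(od_holds(rows, [("salary", "asc")], [("rank", "asc")]))   # False
print(count_attribute_lists(3))                                 # 15
```

In the sample data, sorting by ascending salary also sorts by ascending tax and by descending rank, which is the kind of mixed-direction relationship a bidirectional order dependency captures. Each candidate costs a pairwise scan here, and the number of candidate attribute lists grows factorially with m, which is why a brute-force enumeration of this form does not scale.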