Proc. VLDB Endow.: Latest Articles

Why Not Yet: Fixing a Top-k Ranking that Is Not Fair to Individuals
Proc. VLDB Endow. Pub Date : 2023-05-01 DOI: 10.14778/3598581.3598606
Zixuan Chen, P. Manolios, Mirek Riedewald
Abstract: This work considers why-not questions in the context of top-k queries and score-based ranking functions. Following the popular linear scalarization approach for multi-objective optimization, we study rankings based on the weighted sum of multiple scores. A given weight choice may be controversial or perceived as unfair to certain individuals or organizations, triggering the question why some entity of interest has not yet shown up in the top-k. We introduce various notions of such why-not-yet queries and formally define them as satisfiability or optimization problems, whose goal is to propose alternative ranking functions that address the placement of the entities of interest. While some why-not-yet problems have linear constraints, others require quantifiers, disjunction, and negation. We propose several optimizations, ranging from a monotonic-core construction that approximates the complex constraints with a conjunction of linear ones, to various techniques that let the user control the tradeoff between running time and approximation quality. Experiments with real and synthetic data demonstrate the practicality and scalability of our technique, showing its superiority compared to the state of the art (SOA).
Pages: 2377-2390
Citations: 0
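To make the weighted-sum ranking in this abstract concrete, here is a minimal Python sketch. It is not the paper's method: the paper formulates why-not-yet queries as satisfiability or optimization problems, whereas this sketch swaps in a plain grid search over two-attribute weights. The function names, the `steps` parameter, and the data are all hypothetical.

```python
def top_k(scores, weights, k):
    """Return the ids of the k entities with the highest weighted-sum score."""
    ranked = sorted(
        scores,
        key=lambda e: sum(w * s for w, s in zip(weights, scores[e])),
        reverse=True,
    )
    return ranked[:k]


def closest_weights(scores, target, k, steps=10):
    """Grid-search weights (w, 1 - w), starting from w = 1 and moving away,
    and return the first pair that places `target` in the top-k.

    This brute-force scan is an illustrative stand-in for a solver.
    """
    for i in range(steps, -1, -1):
        w = (i / steps, 1 - i / steps)
        if target in top_k(scores, w, k):
            return w
    return None  # no weight on the grid places the target in the top-k


# Hypothetical data: entity -> (score on attribute 1, score on attribute 2)
scores = {"a": (0.9, 0.1), "b": (0.8, 0.3), "c": (0.2, 0.95)}

print(top_k(scores, (1.0, 0.0), 2))    # ranking under the original weights
print(closest_weights(scores, "c", 2))  # alternative weights that admit "c"
```

Scanning from the original weight outward returns the grid point that changes the ranking function least, which loosely mirrors the idea of proposing an alternative ranking function close to the given one.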
LEON: A New Framework for ML-Aided Query Optimization
Proc. VLDB Endow. Pub Date : 2023-05-01 DOI: 10.14778/3598581.3598597
Xu Chen, Haitian Chen, Zibo Liang, Shuncheng Liu, Jinghong Wang, Kai Zeng, Han Su, Kai Zheng
Abstract: Query optimization has long been a fundamental yet challenging topic in the database field. With the prosperity of machine learning (ML), some recent works have shown the advantages of reinforcement learning (RL) based learned query optimizers. However, they suffer from fundamental limitations due to the data-driven nature of ML. Motivated by the ML characteristics and database maturity, we propose LEON, a framework for ML-aidEd query OptimizatioN. LEON improves the expert query optimizer to self-adjust to the particular deployment by leveraging ML and the fundamental knowledge in the expert query optimizer. To train the ML model, a pairwise ranking objective is proposed, which is substantially different from the previous regression objective. To help the optimizer escape local minima and avoid failure, a ranking and uncertainty-based exploration strategy is proposed, which discovers the valuable plans to aid the optimizer. Furthermore, an ML model-guided pruning is proposed to increase the planning efficiency without hurting performance too much. Extensive experiments offer evidence that the proposed framework can outperform the state-of-the-art methods in terms of end-to-end latency performance, training efficiency, and stability.
Pages: 2261-2273
Citations: 1
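The contrast between a pairwise ranking objective and a regression objective can be illustrated with a generic logistic pairwise loss. The sketch below is an assumption for illustration only: the abstract does not give LEON's loss formula, and the function name and the convention that a lower score means a faster plan are hypothetical.

```python
import math


def pairwise_ranking_loss(score_a, score_b, a_is_faster):
    """Logistic loss on the score difference of two candidate plans.

    Assumed convention: a lower model score means the plan is predicted
    to be faster. The loss is small when the model orders the pair
    correctly and grows when the order is wrong.
    """
    diff = score_b - score_a if a_is_faster else score_a - score_b
    return math.log1p(math.exp(-diff))


# The model only has to order the two plans correctly; it does not have
# to predict their absolute latencies, unlike a regression objective.
print(pairwise_ranking_loss(1.0, 3.0, a_is_faster=True))  # correct order
print(pairwise_ranking_loss(3.0, 1.0, a_is_faster=True))  # wrong order
```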
BICE: Exploring Compact Search Space by Using Bipartite Matching and Cell-Wide Verification
Proc. VLDB Endow. Pub Date : 2023-05-01 DOI: 10.14778/3598581.3598591
Yunyoung Choi, Kunsoo Park, Hyunjoon Kim
Abstract: Subgraph matching is the problem of searching for all embeddings of a query graph in a data graph, and subgraph query processing (also known as subgraph search) is to find all the data graphs that contain a query graph as subgraphs. Extensive research has been done to develop practical solutions for both problems. However, the existing solutions still show limited query processing time due to a lot of unnecessary computations in search. In this paper, we focus on exploring as compact a search space as possible by using three techniques: (1) pruning by bipartite matching, (2) pruning by failing sets with bipartite matching, and (3) cell-wide verification. We propose a new algorithm BICE, which combines these three techniques. We conduct extensive experiments on real-world datasets as well as synthetic datasets to evaluate the effectiveness of the techniques. Experiments show that our approach outperforms the fastest existing subgraph search algorithm by up to two orders of magnitude in terms of elapsed time to process a query. Our approach also outperforms state-of-the-art subgraph matching algorithms by up to two orders of magnitude.
Pages: 2186-2198
Citations: 0
Text Indexing for Long Patterns: Anchors are All you Need
Proc. VLDB Endow. Pub Date : 2023-05-01 DOI: 10.14778/3598581.3598586
Lorraine A. K. Ayad, G. Loukides, S. Pissis
Abstract: In many real-world database systems, a large fraction of the data is represented by strings: sequences of letters over some alphabet. This is because strings can easily encode data arising from different sources. It is often crucial to represent such string datasets in a compact form but also to simultaneously enable fast pattern matching queries. This is the classic text indexing problem. The four absolute measures anyone should pay attention to when designing or implementing a text index are: (i) index space; (ii) query time; (iii) construction space; and (iv) construction time. Unfortunately, however, most (if not all) widely-used indexes (e.g., suffix tree, suffix array, or their compressed counterparts) are not optimized for all four measures simultaneously, as it is difficult to have the best of all four worlds. Here, we take an important step in this direction by showing that text indexing with locally consistent anchors (lc-anchors) offers remarkably good performance in all four measures, when we have at hand a lower bound l on the length of the queried patterns, which is arguably a quite reasonable assumption in practical applications. Specifically, we improve on the construction of the index proposed by Loukides and Pissis, which is based on bidirectional string anchors (bd-anchors), a new type of lc-anchors, by: (i) designing an average-case linear-time algorithm to compute bd-anchors; and (ii) developing a semi-external-memory implementation to construct the index in small space using near-optimal work. We then present an extensive experimental evaluation, based on the four measures, using real benchmark datasets. The results show that, for long patterns, the index constructed using our improved algorithms compares favorably to all classic indexes: (compressed) suffix tree; (compressed) suffix array; and the FM-index.
Pages: 2117-2131
Citations: 1
Maximal D-truss Search in Dynamic Directed Graphs
Proc. VLDB Endow. Pub Date : 2023-05-01 DOI: 10.14778/3598581.3598592
Anxin Tian, Alexander Zhou, Yue Wang, Lei Chen
Abstract: Community search (CS) aims at personalized subgraph discovery, which is the key to understanding the organisation of many real-world networks. CS in undirected networks has attracted significant attention from researchers, including many solutions for various cohesive subgraph structures and for different levels of dynamism with edge insertions and deletions, while they are much less considered for directed graphs. In this paper, we propose incremental solutions of CS based on the D-truss in dynamic directed graphs, where the D-truss is a cohesive subgraph structure defined based on two types of triangles in directed graphs. We first analyze the theoretical boundedness of D-truss given edge insertions and deletions, then we present basic single-update algorithms. To improve the efficiency, we propose an order-based D-Index, associated batch-update algorithms and a fully-dynamic query algorithm. Our extensive experiments on real-world graphs show that our proposed solution achieves a significant speedup compared to the SOTA solution; the scalability over updates is also verified.
Pages: 2199-2211
Citations: 1
Pando: Enhanced Data Skipping with Logical Data Partitioning
Proc. VLDB Endow. Pub Date : 2023-05-01 DOI: 10.14778/3598581.3598601
Sivaprasad Sudhir, Wenbo Tao, N. Laptev, Cyrille Habis, Michael J. Cafarella, S. Madden
Abstract: With enormous volumes of data, quickly retrieving data that is relevant to a query is essential for achieving high performance. Modern cloud-based database systems often partition the data into blocks and employ various techniques to skip irrelevant blocks during query execution. Several algorithms, often based on historical properties of a workload of queries run over the data, have been proposed to tune the physical layout of data to reduce the number of blocks accessed. The effectiveness of these methods at skipping blocks depends on what metadata is stored and how well the physical data layout aligns with the queries. Existing work on automatic physical database design misses significant opportunities in skipping blocks because it ignores logical predicates in the workload that exhibit strongly correlated results. In this paper, we present Pando, which enables significantly better block skipping than past methods by informing physical layout decisions with correlation-aware logical partitioning. Across a range of benchmark and real-world workloads, Pando attains up to 2.8X reduction in the number of blocks scanned and up to 2.3X speedup in end-to-end query execution time over the state-of-the-art techniques.
Pages: 2316-2329
Citations: 0
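The general block-skipping idea this abstract builds on (per-block metadata plus a layout that aligns with the queries) can be sketched in a few lines of Python. This is generic min/max data skipping, not Pando's correlation-aware logical partitioning; the function names, block size, and column values are hypothetical.

```python
def build_blocks(rows, block_size):
    """Split rows into fixed-size blocks and record min/max metadata per block."""
    blocks = []
    for i in range(0, len(rows), block_size):
        chunk = rows[i:i + block_size]
        blocks.append({"min": min(chunk), "max": max(chunk), "rows": chunk})
    return blocks


def range_scan(blocks, lo, hi):
    """Return the rows in [lo, hi] and the number of blocks actually read."""
    hits, scanned = [], 0
    for block in blocks:
        if block["max"] < lo or block["min"] > hi:
            continue  # metadata proves that no row in this block can match
        scanned += 1
        hits.extend(r for r in block["rows"] if lo <= r <= hi)
    return hits, scanned


values = [7, 42, 3, 18, 25, 9, 31, 12]  # hypothetical column values
arrival_layout = build_blocks(values, 2)         # blocks in arrival order
sorted_layout = build_blocks(sorted(values), 2)  # blocks clustered by value

print(range_scan(arrival_layout, 10, 20))
print(range_scan(sorted_layout, 10, 20))
```

Both layouts return the same matching rows, but the clustered layout lets the metadata rule out more blocks, which is why the physical layout matters for skipping.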
WiscSort: External Sorting For Byte-Addressable Storage
Proc. VLDB Endow. Pub Date : 2023-05-01 DOI: 10.14778/3598581.3598585
Vinay Banakar, Kan Wu, Yuvraj Patel, K. Keeton, A. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau
Abstract: We present WiscSort, a new approach to high-performance concurrent sorting for existing and future byte-addressable storage (BAS) devices. WiscSort carefully reduces writes, exploits random reads by splitting keys and values during sorting, and performs interference-aware scheduling with thread pool sizing to avoid I/O bandwidth degradation. We introduce the BRAID model, which encompasses the unique characteristics of BAS devices. Many state-of-the-art sorting systems do not comply with the BRAID model and deliver sub-optimal performance, whereas WiscSort demonstrates the effectiveness of complying with BRAID. We show that WiscSort is 2-7x faster than competing approaches on a standard sort benchmark. We evaluate the effectiveness of key-value separation on different key-value sizes and compare our concurrency optimizations with various other concurrency models. Finally, we emulate generic BAS devices and show how our techniques perform well with various combinations of hardware properties.
Pages: 2103-2116
Citations: 1
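Key-value separation during sorting, as mentioned in this abstract, can be sketched as follows. This is an in-memory illustration under stated assumptions, not WiscSort's implementation: a Python list stands in for the storage device, list indexing stands in for a random read, and the function name and records are hypothetical.

```python
def sort_with_kv_separation(records):
    """Sort records by key while moving only small (key, index) pairs.

    records: list of (key, value) tuples standing in for data on a device.
    """
    # Pass 1: read the keys only and pair each with its record position.
    key_ptrs = [(key, idx) for idx, (key, _) in enumerate(records)]
    # Sort the compact (key, pointer) pairs instead of the full records.
    key_ptrs.sort()
    # Pass 2: fetch each value exactly once, in sorted-key order,
    # using the stored position (the stand-in for a random read).
    return [(key, records[idx][1]) for key, idx in key_ptrs]


data = [(30, "c" * 8), (10, "a" * 8), (20, "b" * 8)]  # hypothetical records
result = sort_with_kv_separation(data)
print([key for key, _ in result])
```

The values are never moved during the sort itself, so large values are read once and written once, which is the intuition behind reducing writes on devices where random reads are cheap.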
Opportunities for Quantum Acceleration of Databases: Optimization of Queries and Transaction Schedules
Proc. VLDB Endow. Pub Date : 2023-05-01 DOI: 10.14778/3598581.3598603
Umut Çalikyilmaz, Sven Groppe, Jinghua Groppe, Tobias Winker, S. Prestel, Farida Shagieva, Daanish Arya, F. Preis, L. Gruenwald
Abstract: The capabilities of quantum computers, such as the number of supported qubits and maximum circuit depth, have grown exponentially in recent years. Commercially relevant applications that take advantage of quantum computing are expected to be available soon. In this paper, we shed light on the possibilities of accelerating database tasks using quantum computing with examples of optimizing queries and transaction schedules and present some open challenges for future studies in the field.
Pages: 2344-2353
Citations: 5
SEIDEN: Revisiting Query Processing in Video Database Systems
Proc. VLDB Endow. Pub Date : 2023-05-01 DOI: 10.14778/3598581.3598599
J. Bang, Gaurav Tarlok Kakkar, Pramod Chunduri, Subrata Mitra, Joy Arulraj
Abstract: State-of-the-art video database management systems (VDBMSs) often use lightweight proxy models to accelerate object retrieval and aggregate queries. The key assumption underlying these systems is that the proxy model is an order of magnitude faster than the heavyweight oracle model. However, recent advances in computer vision have invalidated this assumption. Inference time of recently proposed oracle models is on par with or even lower than the proxy models used in state-of-the-art (SoTA) VDBMSs. This paper presents Seiden, a VDBMS that leverages this radical shift in the runtime gap between the oracle and proxy models. Instead of relying on a proxy model, Seiden directly applies the oracle model over a subset of frames to build a query-agnostic index, and samples additional frames to answer the query using an exploration-exploitation scheme during query processing. By leveraging the temporal continuity of the video and the output of the oracle model on the sampled frames, Seiden delivers faster query processing and better query accuracy than SoTA VDBMSs. Our empirical evaluation shows that Seiden is on average 6.6x faster than SoTA VDBMSs across diverse queries and datasets.
Pages: 2289-2301
Citations: 1
VeriBench: Analyzing the Performance of Database Systems with Verifiability
Proc. VLDB Endow. Pub Date : 2023-05-01 DOI: 10.14778/3598581.3598588
Cong Yue, Meihui Zhang, Changhao Zhu, Gang Chen, Dumitrel Loghin, B. Ooi
Abstract: Database systems are paying more attention to data security in recent years. Immutable systems such as blockchains, verifiable databases, and ledger databases are equipped with various verifiability mechanisms to protect data. Such systems often adopt different threat models and techniques and therefore have different performance implications compared to traditional database systems. So far, there is no uniform benchmarking tool for evaluating the performance of these systems, especially at the level of verification functions. In this paper, we first survey the design space of the verifiability-enabled database systems along five dimensions: threat model, authenticated data structure (ADS), query processing, verification, and auditing. Based on this survey, we design and implement VeriBench, a benchmark framework for verifiability-enabled database systems. VeriBench enables a fair comparison of systems designed with different underlying technologies that share the client-side verification scheme, and focuses on design space exploration to provide a deeper understanding of different system design choices. VeriBench incorporates micro- and macro-benchmarks to provide a comprehensive evaluation. Further, VeriBench is designed to enable easy extension for benchmarking new systems and workloads. We run VeriBench to conduct a comprehensive analysis of state-of-the-art systems comprising blockchains, ledger databases, and log transparency technologies. The results expose the weaknesses and strengths of each underlying design choice, and the insights should serve as guidance for future development.
Pages: 2145-2157
Citations: 0