Advances in database technology : proceedings. International Conference on Extending Database Technology: Latest Publications, Page 2

UniCache: Efficient Log Replication through Learning Workload Patterns

Advances in database technology : proceedings. International Conference on Extending Database Technology Pub Date : 2023-01-01 DOI: 10.48786/edbt.2023.39

Harald Ng, Kun Wu, Paris Carbone

Citations: 0

SonicJoin: Fast, Robust and Worst-case Optimal

Advances in database technology : proceedings. International Conference on Extending Database Technology Pub Date : 2023-01-01 DOI: 10.48786/edbt.2023.46

Ahmad Khazaie, H. Pirk

Abstract: The establishment of the AGM bound on the size of intermediate results of natural join queries has led to the development of several so-called worst-case join algorithms. These algorithms provably produce intermediate results that are (asymptotically) no larger than the final result of the join. The most notable ones are the Recursive Join, its successor, the Generic Join, and the Leapfrog-Trie-Join. While algorithmically efficient, however, all of these algorithms require the availability of index structures that allow tuple lookups using the prefix of a key. Key-prefix-lookups in relational database systems are commonly supported by tree-based index structures since hash-based indices only support full-key lookups. In this paper, we study a wide variety of main-memory-oriented index structures that support key-prefix-lookups with a specific focus on supporting the Generic Join. Based on that study, we develop a novel, best-of-breed index structure called Sonic that combines the fast build and point lookup properties of hashtables with the prefix-lookup capabilities of trees and tries. To evaluate the performance of a variety of indices for worst-case optimal joins in a modern code-generating DBMS, we leveraged flexible, compile-time metaprogramming features to build a framework that creates highly efficient code, interweaving (at a microarchitectural level) a generic join implementation with any appropriate index structure. We demonstrate experimentally that in that framework, Sonic outperforms the fastest existing approaches by up to 2.5 times when supporting the Generic Join algorithm.

Pages: 540-551
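To make the role of key-prefix lookups in the abstract above concrete, the following is a minimal Python sketch of a Generic-Join-style evaluation of the triangle query Q(a, b, c) = R(a, b), S(b, c), T(a, c). It is not the Sonic index and not the authors' framework: the dict-of-sets index, the function names `build_prefix_index` and `triangle_generic_join`, and the sample relations are all assumptions chosen for illustration.

```python
from collections import defaultdict


def build_prefix_index(tuples):
    """Hypothetical two-level index: first attribute -> set of second attributes.

    Looking up only the first attribute is a key-prefix lookup: it returns
    every completion of that prefix without needing the full key.
    """
    index = defaultdict(set)
    for first, second in tuples:
        index[first].add(second)
    return dict(index)


def triangle_generic_join(r, s, t):
    """Enumerate (a, b, c) with (a, b) in r, (b, c) in s and (a, c) in t.

    Attributes are bound one at a time; each candidate set is the
    intersection of what every relation mentioning that attribute allows.
    """
    r_idx = build_prefix_index(r)  # a -> {b}
    s_idx = build_prefix_index(s)  # b -> {c}
    t_idx = build_prefix_index(t)  # a -> {c}
    results = []
    for a in r_idx.keys() & t_idx.keys():      # a must appear in R and T
        for b in s_idx.keys() & r_idx[a]:      # prefix lookup R[a], b must appear in S
            for c in s_idx[b] & t_idx[a]:      # prefix lookups S[b] and T[a]
                results.append((a, b, c))
    return sorted(results)


if __name__ == "__main__":
    R = [(1, 2), (1, 3), (2, 3)]
    S = [(2, 3), (3, 1)]
    T = [(1, 3), (2, 1)]
    print(triangle_generic_join(R, S, T))
```

A plain hash table keyed on the full pair (a, b) could answer "is (1, 2) in R?" but not "which b values follow a = 1?", which is the lookup the inner loops rely on. The nested structure here stands in for any index that supports such prefix lookups.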

Citations: 0

Reasoning over Financial Scenarios with the Vadalog System

Advances in database technology : proceedings. International Conference on Extending Database Technology Pub Date : 2023-01-01 DOI: 10.48786/edbt.2023.66

Teodoro Baldazzi, Luigi Bellomarini, Emanuel Sallinger

Citations: 0

Tuning the Utility-Privacy Trade-Off in Trajectory Data

Advances in database technology : proceedings. International Conference on Extending Database Technology Pub Date : 2023-01-01 DOI: 10.48786/edbt.2023.78

Maja Schneider, P. Christen, E. Rahm, Jonathan Schneider, Lea Löffelmann

Abstract: Trajectory data, often collected on a large scale with mobile sensors in smartphones and vehicles, are a valuable source for realizing smart city applications, or for improving the user experience in mobile apps. But such data can also leak private information, such as a person’s whereabouts and their points of interest (POI). These in turn can reveal sensitive information, for example a person’s age, gender, religion, or home and work address. Location privacy preserving mechanisms (LPPM) can mitigate this issue by transforming data so that private details are protected. But privacy-preservation typically comes at the cost of a loss of utility. It can be challenging to find a suitable mechanism and the right settings to satisfy privacy as well as utility. In this work, we present Privacy Tuna, an interactive open-source framework to visualize trajectory data, and intuitively estimate data utility and privacy while applying various LPPMs. Our tool makes it easy for data owners to investigate the value of their data, choose a suitable privacy-preserving mechanism and tune its parameters to achieve a good utility-privacy trade-off.

Pages: 839-842

Citations: 0

In-Network Approximate and Efficient Spatiotemporal Range Queries on Moving Objects

Advances in database technology : proceedings. International Conference on Extending Database Technology Pub Date : 2023-01-01 DOI: 10.48786/edbt.2024.04

Guang Yang, Liang Liang

Abstract: Data aggregations enable privacy-aware data analytics for moving objects. A spatiotemporal range count query is a fundamental query that aggregates the count of objects in a given spatial region and a time interval. Existing works are designed for centralized systems, which lead to issues with extensive communication and the potential for data leaks. Current in-network systems suffer from the distinct count problem (counting the same objects multiple times) and the dead space problem (excessive intra-communication from ill-suited spatial subdivisions). We propose a novel framework based on a planar graph representation for efficient privacy-aware in-network aggregate queries. Unlike conventional spatial decomposition methods, our framework uses sensor placement techniques to select sensors to reduce dead space. A submodular maximization-based method is introduced when the query distribution is known and a host of sampling methods are used when the query distribution is unknown or dynamic. We avoid double counting by tracking movements along the graph edges using discrete differential forms. We support queries with arbitrary temporal intervals with a constant-sized regression model that accelerates the query performance and reduces the storage size. We evaluate our method on real-world mobility data, which yields us a relative error of at most 13.8% with 25.6% of sensors while achieving a speedup of 3.5×, 69.81% reduction in sensors accessed, and a storage reduction of 99.96% compared to finding the exact count.

Pages: 34-46
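One way to picture how tracking movements along edges can avoid double counting is to count only crossings of a region's boundary, so that an object moving around inside the region never changes the total. The Python sketch below shows that idea in its simplest form. It is an assumption-laden illustration, not the paper's method: it ignores the discrete differential forms, the sensor selection, and the regression model, and the event format, the function name `count_inside`, and the cell labels are all hypothetical.

```python
def count_inside(crossings, region, t, initial_inside=0):
    """Number of objects inside `region` at time `t`, from movement events only.

    crossings: iterable of (timestamp, from_cell, to_cell) tuples, one per
        movement of one object between adjacent cells (hypothetical format).
    region: set of cell identifiers that make up the query region.
    initial_inside: objects already inside the region before the first event.
    """
    count = initial_inside
    for timestamp, src, dst in crossings:
        if timestamp > t:
            continue  # event happens after the query time
        if src not in region and dst in region:
            count += 1  # object crossed the boundary inward
        elif src in region and dst not in region:
            count -= 1  # object crossed the boundary outward
        # moves entirely inside or entirely outside the region change nothing
    return count


if __name__ == "__main__":
    events = [(1, "A", "B"), (2, "B", "C"), (3, "C", "A"), (4, "A", "B")]
    query_region = {"B", "C"}
    for query_time in (2, 3, 4):
        print(query_time, count_inside(events, query_region, query_time))
```

Because no object identifiers are stored, only per-edge crossing counts would need to be kept, which fits the privacy-aware aggregation goal stated in the abstract.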

Citations: 0

Incremental Stream Query Merging

Advances in database technology : proceedings. International Conference on Extending Database Technology Pub Date : 2023-01-01 DOI: 10.48786/edbt.2023.51

Ankit Chaudhary, Steffen Zeuch, V. Markl, Jeyhun Karimov

Citations: 1

TempoGRAPHer: A Tool for Aggregating and Exploring Evolving Graphs

Advances in database technology : proceedings. International Conference on Extending Database Technology Pub Date : 2023-01-01 DOI: 10.48786/edbt.2023.79

Evangelia Tsoukanara, Georgia Koloniari, E. Pitoura

Citations: 1

RDF-Analytics: Interactive Analytics over RDF Knowledge Graphs

Advances in database technology : proceedings. International Conference on Extending Database Technology Pub Date : 2023-01-01 DOI: 10.48786/edbt.2023.70

Maria-Evangelia Papadaki, Yannis Tzitzikas

Citations: 0

Mining Structures from Massive Texts by Exploring the Power of Pre-trained Language Models

Advances in database technology : proceedings. International Conference on Extending Database Technology Pub Date : 2023-01-01 DOI: 10.48786/edbt.2023.81

Yu Zhang, Yunyi Zhang, Jiawei Han

Abstract: Technologies for handling massive structured or semi-structured data have been researched extensively in database communities. However, the real-world data are largely in the form of unstructured text, posing a great challenge to their management and analysis as well as their integration with semi-structured databases. Recent developments of deep learning methods and large pre-trained language models (PLMs) have revolutionized text mining and processing and shed new light on structuring massive text data and building a framework for integrated (i.e., structured and unstructured) data management and analysis. In this tutorial, we will focus on the recently developed text mining approaches empowered by PLMs that can work without relying on heavy human annotations. We will present an organized picture of how a set of weakly supervised methods explore the power of PLMs to structure text data, with the following outline: (1) an introduction to pre-trained language models that serve as new tools for our tasks, (2) mining topic structures: unsupervised and seed-guided methods for topic discovery from massive text corpora, (3) mining document structures: weakly supervised methods for text classification, (4) mining entity structures: distantly supervised and weakly supervised methods for phrase mining, named entity recognition, taxonomy construction, and structured knowledge graph construction, and (5) towards an integrated information processing paradigm.

1 BACKGROUND, GOALS, AND DURATION

The massive text data available on the Web, social media, news, scientific literature, government reports, and other information sources contain rich knowledge that can potentially benefit a wide variety of information processing tasks, and they can be potentially structured and analyzed by extended database technologies. For example, one can conduct entity recognition and concept ontology construction on a large collection of scientific papers and extract the factual knowledge for knowledge base construction and subsequent analysis. How to effectively leverage the unstructured massive text data for downstream applications has remained an important and active research question for the past few decades. Recently, pre-trained language models (PLMs) such as BERT [6] have revolutionized the text mining field and brought new inspirations to structuring text data. To be specific, the following paradigm is usually adopted: pre-training neural architectures on large-scale text corpora obtained from the world knowledge (e.g., a combination of Wikipedia, books, scientific corpora, and web content), and then transferring their representations to task-specific data. By doing so, the knowledge encoded in the world corpora can be effectively leveraged to enhance …

© 2023 Copyright held by the owner/author(s). Published in Proceedings of the 26th International Conference on Extending Database Technology (EDBT), 28th March-31st March, 2023, ISBN 978-3-89318-092-9 on OpenProceedings.org.

Pages: 851-854

Citations: 0

A Formal Design Framework for Practical Property Graph Schema Languages

Advances in database technology : proceedings. International Conference on Extending Database Technology Pub Date : 2023-01-01 DOI: 10.48786/edbt.2023.40

Nimo Beeren, G. Fletcher

Citations: 0