Advances in database technology : proceedings. International Conference on Extending Database Technology: Latest Articles

Topio Marketplace: Search and Discovery of Geospatial Data
Andra Ionescu, A. Alexandridou, Leonidas Ikonomou, Kyriakos Psarakis, Kostas Patroumpas, Georgios Chatzigeorgakidis, Dimitrios Skoutas, Spiros Athanasiou, Rihan Hai, Asterios Katsifodimos
{"title":"Topio Marketplace: Search and Discovery of Geospatial Data","authors":"Andra Ionescu, A. Alexandridou, Leonidas Ikonomou, Kyriakos Psarakis, Kostas Patroumpas, Georgios Chatzigeorgakidis, Dimitrios Skoutas, Spiros Athanasiou, Rihan Hai, Asterios Katsifodimos","doi":"10.48786/edbt.2023.73","DOIUrl":"https://doi.org/10.48786/edbt.2023.73","url":null,"abstract":"The increasing need for data trading has created a high demand for data marketplaces. These marketplaces require a set of value-added services, such as advanced search and discovery, that have been proposed in the database research community for years, but are yet to be put to practice. In this paper we propose to demonstrate the Topio Marketplace, an open-source data market platform that facilitates the search, exploration, discovery and augmentation of data assets. To support filtering, searching and discovery of data assets, we developed methods to extract and visualise a variety of metadata, as well as methods to discover related assets and mechanisms to augment them. This paper aims at presenting these methods with a real deployment of the Topio marketplace, comprising hundreds of open and proprietary datasets.","PeriodicalId":88813,"journal":{"name":"Advances in database technology : proceedings. International Conference on Extending Database Technology","volume":"114 1","pages":"819-822"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77595192","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
FLIRT: A Fast Learned Index for Rolling Time frames
Guang Yang, Liang Liang, A. Hadian, T. Heinis
{"title":"FLIRT: A Fast Learned Index for Rolling Time frames","authors":"Guang Yang, Liang Liang, A. Hadian, T. Heinis","doi":"10.48786/edbt.2023.19","DOIUrl":"https://doi.org/10.48786/edbt.2023.19","url":null,"abstract":"Efficiently managing and querying sliding windows is a key component in stream processing systems. Conventional index structures such as the B+Tree are not efficient for handling a stream of time-series data, where the data is very dynamic, and the indexes must be updated on a continuous basis. Stream processing structures such as queues can accommodate large volumes of updates (enqueue and dequeue); however, they are not efficient for fast retrieval. This paper proposes FLIRT, a parameter-free index structure that manages a sliding window over a high-velocity stream of data and simultaneously supports efficient range queries on the sliding window. FLIRT uses learned indexing to reduce the lookup time. This is enabled by organising the incoming stream of time-series data into linearly predictable segments, allowing fast queue operations such as enqueue, dequeue, and search. We further boost the search performance by introducing two multithreaded versions of FLIRT for different query workloads. Experimental results show up to 7× speedup over conventional indexes, 8× speedup over queues, and up to 109× speedup over learned indexes.","PeriodicalId":88813,"journal":{"name":"Advances in database technology : proceedings. International Conference on Extending Database Technology","volume":"39 1","pages":"234-246"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85503539","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Smart Derivative Contracts in DatalogMTL
Andrea Colombo, Luigi Bellomarini, S. Ceri, Eleonora Laurenza
{"title":"Smart Derivative Contracts in DatalogMTL","authors":"Andrea Colombo, Luigi Bellomarini, S. Ceri, Eleonora Laurenza","doi":"10.48786/edbt.2023.65","DOIUrl":"https://doi.org/10.48786/edbt.2023.65","url":null,"abstract":"","PeriodicalId":88813,"journal":{"name":"Advances in database technology : proceedings. International Conference on Extending Database Technology","volume":"1 1","pages":"773-781"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89342053","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
GAM Forest Explanation
C. Lucchese, S. Orlando, R. Perego, Alberto Veneri
{"title":"GAM Forest Explanation","authors":"C. Lucchese, S. Orlando, R. Perego, Alberto Veneri","doi":"10.48786/edbt.2023.14","DOIUrl":"https://doi.org/10.48786/edbt.2023.14","url":null,"abstract":"Most accurate machine learning models unfortunately produce black-box predictions, for which it is impossible to grasp the internal logic that leads to a specific decision. Unfolding the logic of such black-box models is of increasing importance, especially when they are used in sensitive decision-making processes. In this work we focus on forests of decision trees, which may include hundreds to thousands of decision trees to produce accurate predictions. Such complexity raises the need of developing explanations for the predictions generated by large forests. We propose a post hoc explanation method of large forests, named GAM-based Explanation of Forests (GEF), which builds a Generalized Additive Model (GAM) able to explain, both locally and globally, the impact on the predictions of a limited set of features and feature interactions. We evaluate GEF over both synthetic and real-world datasets and show that GEF can create a GAM model with high fidelity by analyzing the given forest only and without using any further information, not even the initial training dataset.","PeriodicalId":88813,"journal":{"name":"Advances in database technology : proceedings. International Conference on Extending Database Technology","volume":"35 1","pages":"171-182"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80735581","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Fast and Efficient Update Handling for Graph H2TAP
M. Jibril, Hani Al-Sayeh, Alexander Baumstark, K. Sattler
{"title":"Fast and Efficient Update Handling for Graph H2TAP","authors":"M. Jibril, Hani Al-Sayeh, Alexander Baumstark, K. Sattler","doi":"10.48786/edbt.2023.60","DOIUrl":"https://doi.org/10.48786/edbt.2023.60","url":null,"abstract":"Offloading graph analytics to GPU yields significant performance speedups. In heterogeneous hybrid transactional/analytical graph processing (graph H2TAP), where each graph workload type is executed on the most suitable processor, transactions are executed on a CPU-based main graph and analytics are executed on a GPU-optimized graph replica. The problem that arises, as a result, is that updates by transactions on the main graph have to be particularly handled with respect to the graph replica. In this paper, we present a fast and efficient approach to this update handling problem, based on a delta store optimized for graphs. The delta store is a differential graph store that captures the transactional updates, which are later propagated to the graph replica so that analytical queries are executed on the most recently committed version of the graph in accordance with freshness requirements. Our approach ensures consistency between the main graph and the replica. Our evaluation shows the performance advantage of our approach over existing HTAP approaches.","PeriodicalId":88813,"journal":{"name":"Advances in database technology : proceedings. International Conference on Extending Database Technology","volume":"38 1","pages":"723-736"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75037683","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
EGG-SynC: Exact GPU-parallelized Grid-based Clustering by Synchronization
Jakob Rødsgaard Jørgensen, I. Assent
{"title":"EGG-SynC: Exact GPU-parallelized Grid-based Clustering by Synchronization","authors":"Jakob Rødsgaard Jørgensen, I. Assent","doi":"10.48786/edbt.2023.16","DOIUrl":"https://doi.org/10.48786/edbt.2023.16","url":null,"abstract":"Clustering by synchronization (SynC) is a clustering method that is motivated by the natural phenomenon of synchronization and is based on the Kuramoto model. The idea is to iteratively drag similar objects closer to each other until they have synchronized. SynC has been adapted to solve several well-known data mining tasks such as subspace clustering, hierarchical clustering, and streaming clustering. This shows that the SynC model is very versatile. Sadly, SynC has an O(T × n² × d) complexity, which makes it impractical for larger datasets. E.g., Chen et al. [8] show runtimes of more than 10 hours for just n = 70,000 data points, but improve this to just above one hour by using R-Trees in their method FSynC. Both are still impractical in real-life scenarios. Furthermore, SynC uses a termination criterion that brings no guarantees that the points have synchronized but instead just stops when most points are close to synchronizing. In this paper, our contributions are manifold. We propose a new termination criterion that guarantees that all points have synchronized. To achieve a much-needed reduction in runtime, we propose a strategy to summarize partitions of the data into a grid structure, a GPU-friendly grid structure to support this and neighborhood queries, and a GPU-parallelized algorithm for clustering by synchronization (EGG-SynC) that utilizes these ideas. Furthermore, we provide an extensive evaluation against state-of-the-art showing 2 to 3 orders of magnitude speedup compared to SynC and FSynC.","PeriodicalId":88813,"journal":{"name":"Advances in database technology : proceedings. International Conference on Extending Database Technology","volume":"27 1","pages":"195-207"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75089729","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Desis: Efficient Window Aggregation in Decentralized Networks
W. Yue, Lawrence Benson, T. Rabl
{"title":"Desis: Efficient Window Aggregation in Decentralized Networks","authors":"W. Yue, Lawrence Benson, T. Rabl","doi":"10.48786/edbt.2023.52","DOIUrl":"https://doi.org/10.48786/edbt.2023.52","url":null,"abstract":"Stream processing is widely applied in industry as well as in research to process unbounded data streams. In many use cases, specific data streams are processed by multiple continuous queries. Current systems group events of an unbounded data stream into bounded windows to produce results of individual queries in a timely fashion. For multiple concurrent queries, multiple concurrent and usually overlapping windows are generated. To reduce redundant computations and share partial results, state-of-the-art solutions divide windows into slices and then share the results of those slices. However, this is only applicable for queries with the same aggregation function and window measure, as in the case of overlaps for sliding windows. For multiple queries on the same stream with different aggregation functions and window measures, partial results cannot be shared. Furthermore, data streams are produced from devices that are distributed in large decentralized networks. Current systems cannot process queries on decentralized data streams efficiently. All queries in a decentralized network are either computed centrally or processed individually without exploiting partial results across queries. We present Desis, a stream processing system that can efficiently process multiple stream aggregation queries. We propose an aggregation engine that can share partial results between multiple queries with different window types, measures, and aggregation functions. In decentralized networks, Desis moves computation to data sources and shares overlapping computation as early as possible between queries. Desis outperforms existing solutions by orders of magnitude in throughput when processing multiple queries and can scale to millions of queries. In a decentralized setup, Desis can save up to 99% of network traffic and scale performance linearly.","PeriodicalId":88813,"journal":{"name":"Advances in database technology : proceedings. International Conference on Extending Database Technology","volume":"2 1","pages":"618-631"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78974443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Demonstrating Interactive SPARQL Formulation through Positive and Negative Examples and Feedback
Akritas Akritidis, Yannis Tzitzikas
{"title":"Demonstrating Interactive SPARQL Formulation through Positive and Negative Examples and Feedback","authors":"Akritas Akritidis, Yannis Tzitzikas","doi":"10.48786/edbt.2023.71","DOIUrl":"https://doi.org/10.48786/edbt.2023.71","url":null,"abstract":"The formulation of structured queries in Knowledge Graphs is a challenging task since it presupposes familiarity with the syntax of the query language and the contents of the knowledge graph. To alleviate this problem, for enabling plain users to formulate SPARQL queries, and advanced users to formulate queries with less effort, in this paper we introduce a novel method for \"SPARQL by Example\". According to this method the user points to positive/negative entities, the system computes one query that describes these entities, and then the user refines the query interactively by providing positive/negative feedback on entities and suggested constraints. We shall demonstrate SPARQL-QBE, a tool that implements this approach, and we will briefly refer to the results of a task-based evaluation with users that provided positive evidence about the usability of the approach.","PeriodicalId":88813,"journal":{"name":"Advances in database technology : proceedings. International Conference on Extending Database Technology","volume":"49 1","pages":"811-814"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86139878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Learning over Sets for Databases
Angjela Davitkova, Damjan Gjurovski, S. Michel
{"title":"Learning over Sets for Databases","authors":"Angjela Davitkova, Damjan Gjurovski, S. Michel","doi":"10.48786/edbt.2024.07","DOIUrl":"https://doi.org/10.48786/edbt.2024.07","url":null,"abstract":"In this work, we consider using deep learning models over a collection of sets to replace traditional approaches utilized in database systems. We propose solutions for data indexing, membership queries, and cardinality estimation. Unlike relational data, learned models over sets need to be permutation invariant and able to deal with variable set sizes. The proposed models are based on the DeepSets architecture and include per-element compression to achieve acceptable accuracy with modest model sizes. We further suggest a hybrid structure with bounded error guarantees using guided learning to mitigate the inherent challenges when working with set data. We outline challenges and opportunities when dealing with set data and demonstrate the suitability of the models through extensive experimental evaluation with one synthetic and two real-world datasets.","PeriodicalId":88813,"journal":{"name":"Advances in database technology : proceedings. International Conference on Extending Database Technology","volume":"45 1","pages":"68-80"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72821433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Patched Multi-Key Partitioning for Robust Query Performance
Steffen Kläbe, K. Sattler
{"title":"Patched Multi-Key Partitioning for Robust Query Performance","authors":"Steffen Kläbe, K. Sattler","doi":"10.48786/edbt.2023.26","DOIUrl":"https://doi.org/10.48786/edbt.2023.26","url":null,"abstract":"Data partitioning is the key for parallel query processing in modern analytical database systems. Choosing the right partitioning key for a given dataset is a difficult task and crucial for query performance. Real world data warehouses contain a large amount of tables connected in complex schemes resulting in an overwhelming amount of partition key candidates. In this paper, we present the approach of patched multi-key partitioning, allowing to define multiple partition keys simultaneously without data replication. The key idea is to map the relational table partitioning problem to a graph partition problem in order to use existing graph partitioning algorithms to find connectivity components in the data and maintain exceptions (patches) to the partitioning separately. We show that patched multi-key partitioning offers opportunities for achieving robust query performance, i.e. reaching reasonably good performance for many queries instead of optimal performance for only a few queries.","PeriodicalId":88813,"journal":{"name":"Advances in database technology : proceedings. International Conference on Extending Database Technology","volume":"9 1","pages":"324-336"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74353789","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0