Advances in database technology : proceedings. International Conference on Extending Database Technology: latest articles, page 6

GraphTempo: An aggregation framework for evolving graphs

Advances in database technology : proceedings. International Conference on Extending Database Technology Pub Date : 2023-01-01 DOI: 10.48786/edbt.2023.18

Evangelia Tsoukanara, Georgia Koloniari, E. Pitoura

Citations: 3

Workload-Aware Query Recommendation Using Deep Learning

Advances in database technology : proceedings. International Conference on Extending Database Technology Pub Date : 2023-01-01 DOI: 10.48786/edbt.2023.05

E. Y. Lai, Zainab Zolaktaf, Mostafa Milani, Omar AlOmeir, Jianhao Cao, R. Pottinger

Citations: 0

Spatial Structure-Aware Road Network Embedding via Graph Contrastive Learning

Advances in database technology : proceedings. International Conference on Extending Database Technology Pub Date : 2023-01-01 DOI: 10.48786/edbt.2023.12

Yanchuan Chang, E. Tanin, Xin Cao, Jianzhong Qi

Abstract: Road networks are widely used as a fundamental structure in urban transportation studies. In recent years, with more research leveraging deep learning to solve conventional transportation problems, how to obtain robust road network representations (i.e., embeddings) applicable for a wide range of applications became a fundamental need. Existing studies mainly adopt graph embedding methods. Such methods, however, foremost learn the topological correlations of road networks but ignore the spatial structure (i.e., spatial correlations) which are also important in applications such as querying similar trajectories. Besides, most studies learn task-specific embeddings in a supervised manner such that the embeddings are sub-optimal when being used for new tasks. It is inefficient to store or learn dedicated embeddings for every different task in a large transportation system. To tackle these issues, we propose a model named SARN to learn generic and task-agnostic road network embeddings based on self-supervised contrastive learning. We present (i) a spatial similarity matrix to help learn the spatial correlations of the roads, (ii) a sampling strategy based on the spatial structure of a road network to form self-supervised training samples, and (iii) a two-level loss function to guide SARN to learn embeddings based on both local and global contrasts of similar and dissimilar road segments. Experimental results on three downstream tasks over real-world road networks show that SARN outperforms state-of-the-art self-supervised models consistently and achieves comparable (or even better) performance to supervised models.

Volume: 46 1, Pages: 144-156

Citations: 6

A new PET for Data Collection via Forms with Data Minimization, Full Accuracy and Informed Consent

Advances in database technology : proceedings. International Conference on Extending Database Technology Pub Date : 2023-01-01 DOI: 10.48786/edbt.2024.08

N. Anciaux, S. Frittella, Baptiste Joffroy, Benjamin Nguyen, Guillaume Scerri

Abstract: The advent of privacy laws and principles such as data minimization and informed consent are supposed to protect citizens from over-collection of personal data. Nevertheless, current processes, mainly through filling forms are still based on practices that lead to over-collection. Indeed, any citizen wishing to apply for a benefit (or service) will transmit all their personal data involved in the evaluation of the eligibility criteria. The resulting problem of over-collection affects millions of individuals, with considerable volumes of information collected. If this problem of compliance concerns both public and private organizations (e.g., social services, banks, insurance companies), it is because it faces non-trivial issues, which hinder the implementation of data minimization by developers. In this paper, we propose a new modeling approach that enables data minimization and informed choices for the users, for any decision problem modeled using classical logic, which covers a wide range of practical cases. Our data minimization solution uses game theoretic notions to explain and quantify the privacy payoff for the user. We show how our algorithms can be applied to practical cases study as a new PET for minimal, fully accurate (all due services must be preserved) and informed data collection.

Volume: 57 1, Pages: 81-93
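To make the idea of data minimization over a logical eligibility rule concrete, here is a minimal Python sketch. It is not the authors' algorithm (the abstract describes a game-theoretic approach that is not reproduced here). It assumes, purely for illustration, that the rule is written in disjunctive normal form and that a user proves eligibility by disclosing the attributes of one satisfied conjunct. The function name `smallest_satisfied_conjuncts` and the attribute names are hypothetical.

```python
def smallest_satisfied_conjuncts(rule_dnf, user_data):
    """Return the smallest attribute sets whose disclosure satisfies one conjunct.

    rule_dnf  : list of conjuncts, each a dict {attribute: required boolean value}
    user_data : dict {attribute: boolean value} held by the user
    Assumption: disclosing every attribute of one satisfied conjunct is enough
    to prove eligibility. This is an illustrative simplification.
    """
    satisfied = [
        frozenset(conjunct)
        for conjunct in rule_dnf
        if all(user_data.get(attr) == value for attr, value in conjunct.items())
    ]
    if not satisfied:
        return []  # the user is not eligible under any conjunct
    smallest = min(len(attrs) for attrs in satisfied)
    # Deduplicate and order deterministically by sorted attribute names.
    return sorted({attrs for attrs in satisfied if len(attrs) == smallest}, key=sorted)


# Hypothetical benefit rule: over 65, OR (low income AND resident),
# OR (disabled AND resident).
RULE = [
    {"age_over_65": True},
    {"low_income": True, "resident": True},
    {"disabled": True, "resident": True},
]
```

A user who is over 65 would only need to disclose `age_over_65`, even if the other attributes also hold; a form that asks for all four attributes up front collects more than the decision requires.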

Citations: 1

Density-Based Geometry Compression for LiDAR Point Clouds

Advances in database technology : proceedings. International Conference on Extending Database Technology Pub Date : 2023-01-01 DOI: 10.48786/edbt.2023.30

Xibo Sun, Qiong Luo

Abstract: LiDAR (Light Detection and Ranging) sensors produce 3D point clouds that capture the surroundings, and these data are used in applications such as autonomous driving, traffic monitoring, and remote surveys. LiDAR point clouds are usually compressed for efficient transmission and storage. However, to achieve a high compression ratio, existing work often sacrifices the geometric accuracy of the data, which hurts the effectiveness of downstream applications. Therefore, we propose a system that achieves a high compression ratio while preserving geometric accuracy. In our method, we first perform density-based clustering to distinguish the dense points from the sparse ones, because they are suitable for different compression methods. The clustering algorithm is optimized for our purpose and its parameter values are set to preserve accuracy. We then compress the dense points with an octree, and organize the sparse ones into polylines to reduce the redundancy. We further propose to compress the sparse points on the polylines by their spherical coordinates considering the properties of both the LiDAR sensors and the real-world scenes. Finally, we design suitable schemes to compress the remaining sparse points not on any polyline. Experimental results on DBGC, our prototype system, show that our scheme compressed large-scale real-world datasets by up to 19 times with an error bound under 0.02 meters for scenes of thousands of cubic meters. This result, together with the fast compression speed of DBGC, demonstrates the online compression of LiDAR data with high accuracy. Our source code is publicly available at https://github.com/RapidsAtHKUST/DBGC.

Volume: 44 1, Pages: 378-390
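The abstract mentions compressing sparse points on polylines by their spherical coordinates. The Python sketch below only illustrates the two generic building blocks such a scheme might rest on: a Cartesian-to-spherical conversion, and delta coding of quantized values so that neighbouring points yield small integers. It is not taken from DBGC; the function names and the `step` quantization parameter are assumptions made for this example.

```python
import math


def to_spherical(x, y, z):
    """Convert Cartesian (x, y, z) to (radius, azimuth, elevation), angles in radians."""
    r = math.sqrt(x * x + y * y + z * z)
    azimuth = math.atan2(y, x)
    # Clamp the ratio to guard against floating-point values just outside [-1, 1].
    elevation = math.asin(max(-1.0, min(1.0, z / r))) if r > 0 else 0.0
    return r, azimuth, elevation


def to_cartesian(r, azimuth, elevation):
    """Inverse of to_spherical."""
    x = r * math.cos(elevation) * math.cos(azimuth)
    y = r * math.cos(elevation) * math.sin(azimuth)
    z = r * math.sin(elevation)
    return x, y, z


def quantize(value, step):
    """Map a real value to an integer grid index; the rounding error is at most step / 2."""
    return round(value / step)


def delta_encode(values):
    """Keep the first value and the successive differences."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Rebuild the original sequence from delta_encode output."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out
```

For instance, `delta_encode([100, 101, 103, 103])` returns `[100, 1, 2, 0]`: consecutive points at similar ranges produce small differences, which a downstream entropy coder can store in fewer bits than the raw values.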

Citations: 0

Exploration of Approaches for In-Database ML

Advances in database technology : proceedings. International Conference on Extending Database Technology Pub Date : 2023-01-01 DOI: 10.48786/edbt.2023.25

Steffen Kläbe, Stefan Hagedorn, K. Sattler

Abstract: Database systems are no longer used only for the storage of plain structured data and basic analyses. An increasing role is also played by the integration of ML models, e.g., neural networks with specialized frameworks, and their use for classification or prediction. However, using such models on data stored in a database system might require downloading the data and performing the computations outside. In this paper, we evaluate approaches for integrating the ML inference step as a special query operator - the ModelJoin. We explore several options for this integration on different abstraction levels: relational representation of the models as well as SQL queries for inference, the use of UDFs, the use of APIs to existing ML runtimes and a native implementation of the ModelJoin as a query operator supporting both CPU and GPU execution. Our evaluation results show that integrating ML runtimes over APIs perform similarly to a native operator while being generic to support arbitrary model types. The solution of relational representation and SQL queries is most portable and works well for smaller inputs without any changes needed in the database engine.

Volume: 17 1, Pages: 311-323
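One of the options listed in the abstract is a relational representation of the model combined with SQL queries for inference. The Python sketch below shows what that could look like for the simplest possible case, a single linear layer, using the standard-library `sqlite3` module. It is not the paper's ModelJoin operator; the table names (`model_weights`, `inputs`), column names, and the function `predict_in_sql` are hypothetical.

```python
import sqlite3


def predict_in_sql(weights, bias, rows):
    """Score rows with a linear model stored as a table, using only SQL.

    weights : dict {feature name: weight}
    bias    : float added to every score
    rows    : dict {row_id: {feature name: value}}
    Returns a dict {row_id: score}.
    """
    conn = sqlite3.connect(":memory:")
    try:
        # The model is one relation: one tuple per (feature, weight).
        conn.execute(
            "CREATE TABLE model_weights (feature TEXT PRIMARY KEY, weight REAL)"
        )
        # The data is stored in long format: one tuple per (row, feature, value).
        conn.execute("CREATE TABLE inputs (row_id INTEGER, feature TEXT, value REAL)")
        conn.executemany(
            "INSERT INTO model_weights VALUES (?, ?)", list(weights.items())
        )
        conn.executemany(
            "INSERT INTO inputs VALUES (?, ?, ?)",
            [(rid, f, v) for rid, feats in rows.items() for f, v in feats.items()],
        )
        # Inference is a join on the feature name followed by a grouped sum,
        # i.e. a dot product per input row.
        cursor = conn.execute(
            "SELECT i.row_id, SUM(i.value * w.weight) + ? AS score "
            "FROM inputs AS i JOIN model_weights AS w ON w.feature = i.feature "
            "GROUP BY i.row_id ORDER BY i.row_id",
            (bias,),
        )
        return dict(cursor.fetchall())
    finally:
        conn.close()
```

Because the whole computation is a join plus an aggregate, it runs on an unmodified engine, which matches the portability point made in the abstract. Deeper models would need one such step per layer plus an activation expression, which is where larger inputs can become costly.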

Citations: 2

New Trends in Time Series Anomaly Detection

Advances in database technology : proceedings. International Conference on Extending Database Technology Pub Date : 2023-01-01 DOI: 10.48786/edbt.2023.80

Paul Boniol, John Paparrizos, Themis Palpanas

Citations: 2

COVIDKG.ORG - a Web-scale COVID-19 Interactive, Trustworthy Knowledge Graph, Constructed and Interrogated for Bias using Deep-Learning

Advances in database technology : proceedings. International Conference on Extending Database Technology Pub Date : 2023-01-01 DOI: 10.48786/edbt.2023.63

Bhimesh Kandibedala, A. Pyayt, Nick Piraino, Chris Caballero, M. Gubanov

Citations: 0

WedgeBlock: An Off-Chain Secure Logging Platform for Blockchain Applications

Advances in database technology : proceedings. International Conference on Extending Database Technology Pub Date : 2023-01-01 DOI: 10.48786/edbt.2023.45

Abhishek A. Singh, Yinan Zhou, Mohammad Sadoghi, S. Mehrotra, Sharad Sharma, Faisal Nawab

Citations: 0

Streaming Weighted Sampling over Join Queries

Advances in database technology : proceedings. International Conference on Extending Database Technology Pub Date : 2023-01-01 DOI: 10.48786/edbt.2023.24

Michael Shekelyan, Graham Cormode, Qingzhi Ma, A. Shanghooshabad, P. Triantafillou

Abstract: Join queries are a fundamental database tool, capturing a range of tasks that involve linking heterogeneous data sources. However, with massive table sizes, it is often impractical to keep these in memory, and we can only take one or few streaming passes over them. Moreover, building out the full join result (e.g., linking heterogeneous data sources along quasi-identifiers) can lead to a combinatorial explosion of results due to many-to-many links. Random sampling is a natural tool to boil this oversized result down to a representative subset with well-understood statistical properties, but turns out to be a challenging task due to the combinatorial nature of the sampling domain. Existing techniques in the literature focus solely on the setting with tabular data residing in main memory, and do not address aspects such as stream operation, weighted sampling and more general join operators that are urgently needed in a modern data processing context. The main contribution of this work is to meet these needs with more lightweight practical approaches. First, a bijection between the sampling problem and a graph problem is introduced to support weighted sampling and common join operators. Second, the sampling techniques are refined to minimise the number of streaming passes. Third, techniques are presented to deal with very large tables under limited memory. Finally, the proposed techniques are compared to existing approaches that rely on database indices and the results indicate substantial memory savings, reduced runtimes for ad-hoc queries and competitive amortised runtimes. All pertinent code and data can be found at:

Volume: 17 1, Pages: 298-310
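For readers unfamiliar with one-pass weighted sampling, the Python sketch below shows a classic generic technique: weighted reservoir sampling without replacement in the style of Efraimidis and Spirakis (A-Res). It operates on a single stream of weighted items and does not handle joins, so it is background for the problem setting rather than the method proposed in the paper. The function name and signature are assumptions made for this example.

```python
import heapq
import random


def weighted_reservoir_sample(stream, k, rng=None):
    """Draw up to k items from a stream of (item, weight) pairs in one pass.

    Each item gets the key u ** (1 / weight), with u uniform in [0, 1);
    the k items with the largest keys form the sample. Memory use is O(k).
    Items with non-positive weight are never selected.
    """
    if k <= 0:
        return []
    rng = rng or random.Random()
    heap = []  # min-heap of (key, arrival index, item); the index breaks key ties
    for index, (item, weight) in enumerate(stream):
        if weight <= 0:
            continue
        key = rng.random() ** (1.0 / weight)
        entry = (key, index, item)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif key > heap[0][0]:
            # The new key beats the smallest key in the reservoir: swap it in.
            heapq.heapreplace(heap, entry)
    return [item for _, _, item in heap]
```

The stream is consumed exactly once and only `k` entries are kept at any time, which is the kind of memory budget the abstract has in mind when tables are too large to hold in memory. Sampling from a join result is harder because the weight of a tuple depends on how many join partners it has, which is not known when the tuple first streams by.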

Citations: 0