Advances in database technology : proceedings. International Conference on Extending Database Technology: Latest Articles

A Formal Design Framework for Practical Property Graph Schema Languages
Nimo Beeren, G. Fletcher
{"title":"A Formal Design Framework for Practical Property Graph Schema Languages","authors":"Nimo Beeren, G. Fletcher","doi":"10.48786/edbt.2023.40","DOIUrl":"https://doi.org/10.48786/edbt.2023.40","url":null,"abstract":"Graph databases are increasingly receiving attention from industry and academia, due in part to their flexibility; a schema is often not required. However, schemas can significantly benefit query optimization, data integrity, and documentation. There currently does not exist a formal framework which captures the design space of state-of-the-art schema solutions. We present a formal design framework for property graph schema languages based on first-order logic rules, which balances expressivity and practicality. We show how this framework can be adapted to integrate a core set of constraints common in conceptual data modeling methods. To demonstrate practical feasibility, this model is implemented using graph queries for modern graph database systems, which we evaluate through a controlled experiment. We find that validation time scales linearly with the size of the data, while only using unoptimized straightforward implementations.","PeriodicalId":88813,"journal":{"name":"Advances in database technology : proceedings. International Conference on Extending Database Technology","volume":"19 1","pages":"478-484"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81536390","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Data Narration for the People: Challenges and Opportunities
S. Amer-Yahia, Patrick Marcel, Verónika Peralta
{"title":"Data Narration for the People: Challenges and Opportunities","authors":"S. Amer-Yahia, Patrick Marcel, Verónika Peralta","doi":"10.48786/edbt.2023.82","DOIUrl":"https://doi.org/10.48786/edbt.2023.82","url":null,"abstract":"Data narration is the process of telling stories with insights extracted from data. It is an instance of data science [4] where the pipeline focuses on data collection and exploration, answering questions, structuring answers, and finally presenting them to stakeholders [16, 17]. This tutorial reviews the challenges and opportunities of the full and semi-automation of these steps. In doing so, it draws from the extensive literature in data narration, data exploration and data visualization. In particular, we point out key theoretical and practical contributions in each domain such as next-step recommendation and policy learning for data exploration, insight interestingness and evaluation frameworks, and the crafting of data stories for the people who will exploit them. We also identify topics that are still worth investigating, such as the inclusion of different stakeholders’ profiles in designing data pipelines with the goal of providing data narration for all.","PeriodicalId":88813,"journal":{"name":"Advances in database technology : proceedings. International Conference on Extending Database Technology","volume":"56 1","pages":"855-858"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84774846","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Mining Structures from Massive Texts by Exploring the Power of Pre-trained Language Models
Yu Zhang, Yunyi Zhang, Jiawei Han
{"title":"Mining Structures from Massive Texts by Exploring the Power of Pre-trained Language Models","authors":"Yu Zhang, Yunyi Zhang, Jiawei Han","doi":"10.48786/edbt.2023.81","DOIUrl":"https://doi.org/10.48786/edbt.2023.81","url":null,"abstract":"Technologies for handling massive structured or semi-structured data have been researched extensively in database communities. However, the real-world data are largely in the form of unstructured text, posing a great challenge to their management and analysis as well as their integration with semi-structured databases. Recent developments of deep learning methods and large pre-trained language models (PLMs) have revolutionized text mining and processing and shed new light on structuring massive text data and building a framework for integrated (i.e., structured and unstructured) data management and analysis. In this tutorial, we will focus on the recently developed text mining approaches empowered by PLMs that can work without relying on heavy human annotations. We will present an organized picture of how a set of weakly supervised methods explore the power of PLMs to structure text data, with the following outline: (1) an introduction to pre-trained language models that serve as new tools for our tasks, (2) mining topic structures: unsupervised and seed-guided methods for topic discovery from massive text corpora, (3) mining document structures: weakly supervised methods for text classification, (4) mining entity structures: distantly supervised and weakly supervised methods for phrase mining, named entity recognition, taxonomy construction, and structured knowledge graph construction, and (5) towards an integrated information processing paradigm. 1 BACKGROUND, GOALS, AND DURATION The massive text data available on the Web, social media, news, scientific literature, government reports, and other information sources contain rich knowledge that can potentially benefit a wide variety of information processing tasks, and they can be potentially structured and analyzed by extended database technologies. For example, one can conduct entity recognition and concept ontology construction on a large collection of scientific papers and extract the factual knowledge for knowledge base construction and subsequent analysis. How to effectively leverage the unstructured massive text data for downstream applications has remained an important and active research question for the past few decades. Recently, pre-trained language models (PLMs) such as BERT [6] have revolutionized the text mining field and brought new inspirations to structuring text data. To be specific, the following paradigm is usually adopted: pre-training neural architectures on large-scale text corpora obtained from the world knowledge (e.g., a combination of Wikipedia, books, scientific corpora, and web content), and then transferring their representations to task-specific data. By doing so, the knowledge encoded in the world corpora can be effectively leveraged to enhance © 2023 Copyright held by the owner/author(s). Published in Proceedings of the 26th International Conference on Extending Database Technology (EDBT), 28th March-31st March, 2023, ISBN 978-3-89318-092-9 on OpenProceedings.org.","PeriodicalId":88813,"journal":{"name":"Advances in database technology : proceedings. International Conference on Extending Database Technology","volume":"108 1","pages":"851-854"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75928134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
RDF-Analytics: Interactive Analytics over RDF Knowledge Graphs
Maria-Evangelia Papadaki, Yannis Tzitzikas
{"title":"RDF-Analytics: Interactive Analytics over RDF Knowledge Graphs","authors":"Maria-Evangelia Papadaki, Yannis Tzitzikas","doi":"10.48786/edbt.2023.70","DOIUrl":"https://doi.org/10.48786/edbt.2023.70","url":null,"abstract":"The formulation of structured queries in knowledge graphs is a challenging task that presupposes familiarity with the syntax of the query language and the contents of the knowledge graph. To alleviate this difficulty in this paper we introduce RDF-ANALYTICS, a novel system that enables plain users to formulate analytic queries over complex, i.e. not necessarily star-schema based, RDF knowledge graphs. To come up with an intuitive interface, we leverage the familiarity of users with Faceted Search (FS) systems, i.e. we extend FS with actions that enable users to formulate analytic queries, too. Distinctive characteristics of the approach is the ability to include arbitrarily long paths in the analytic query (accompanied with count information), interactive formulation of HAVING restrictions, the support of both Faceted Search (i.e. the locating of the desired resources in a faceted search manner) and analytic queries, and the ability to formulate nested analytic queries. Finally, we present the results of a preliminary task-based evaluation with users, which are very promising.","PeriodicalId":88813,"journal":{"name":"Advances in database technology : proceedings. International Conference on Extending Database Technology","volume":"323 1","pages":"807-810"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76296786","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Data Provenance for SHACL
Thomas Delva, Maxim Jakubowski
{"title":"Data Provenance for SHACL","authors":"Thomas Delva, Maxim Jakubowski","doi":"10.48786/edbt.2023.23","DOIUrl":"https://doi.org/10.48786/edbt.2023.23","url":null,"abstract":"In constraint languages for RDF graphs, such as ShEx and SHACL, constraints on nodes and their properties are known as “shapes”. Using SHACL, we propose in this paper the notion of neighborhood of a node 𝑣 satisfying a given shape in a graph 𝐺. This neighborhood is a subgraph of 𝐺, and provides data provenance of 𝑣 for the given shape. We establish a correctness property for the obtained provenance mechanism, by proving that neighborhoods adhere to the Sufficiency requirement articulated for provenance semantics for database queries. As an additional benefit, neighborhoods allow a novel use of shapes: the extraction of a subgraph from an RDF graph, the so-called shape fragment. We compare shape fragments with SPARQL queries. We discuss implementation strategies for computing neighborhoods, and present initial experiments demonstrating that our ideas are feasible.","PeriodicalId":88813,"journal":{"name":"Advances in database technology : proceedings. International Conference on Extending Database Technology","volume":"16 1","pages":"285-297"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90343818","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Stitcher: Learned Workload Synthesis from Historical Performance Footprints
Chengcheng Wan, Yiwen Zhu, Joyce Cahoon, Wenjing Wang, K. Lin, Sean Liu, Raymond Truong, Neetu Singh, Alexandra Ciortea, Konstantinos Karanasos, Subru Krishnan
{"title":"Stitcher: Learned Workload Synthesis from Historical Performance Footprints","authors":"Chengcheng Wan, Yiwen Zhu, Joyce Cahoon, Wenjing Wang, K. Lin, Sean Liu, Raymond Truong, Neetu Singh, Alexandra Ciortea, Konstantinos Karanasos, Subru Krishnan","doi":"10.48786/edbt.2023.33","DOIUrl":"https://doi.org/10.48786/edbt.2023.33","url":null,"abstract":"Database benchmarking and workload replay have been widely used to drive system design, evaluate workload performance, determine product evolution, and guide cloud migration. However, they both suffer from some key limitations: the former fails to capture the variety and complexity of production workloads; the latter requires access to user data, queries, and machine specifications, deeming it inapplicable in the face of user privacy concerns. Here we introduce our vision of learned workload synthesis to overcome these issues: given the performance profile of a customer workload (e.g., CPU/memory counters), synthesize a new workload that yields the same performance profile when executed on a range of hardware/software configurations. We present Stitcher as a first step towards realizing this vision, which synthesizes workloads by combining pieces from standard benchmarks. We believe that our vision will spark new research avenues in database workload replay.","PeriodicalId":88813,"journal":{"name":"Advances in database technology : proceedings. International Conference on Extending Database Technology","volume":"108 1","pages":"417-423"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91107488","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Understanding crowd energy consumption behaviors
X. Liu, Xu Cheng, Yanyan Yang, Huan Huo, Yongping Liu, P. S. Nielsen
{"title":"Understanding crowd energy consumption behaviors","authors":"X. Liu, Xu Cheng, Yanyan Yang, Huan Huo, Yongping Liu, P. S. Nielsen","doi":"10.48786/edbt.2023.68","DOIUrl":"https://doi.org/10.48786/edbt.2023.68","url":null,"abstract":"Understanding crowd behavior is crucial for energy demand-side management. In this paper, we employ the fluid dynamics concept potential flow to model the energy demand shift patterns of the crowd in both temporal and spatial dimensions. To facilitate the use of the proposed method, we implement a visual analysis platform that allows users to interactively explore and interpret the shift patterns. The effectiveness of the proposed method will be evaluated through a hands-on experience with a real case study during the conference demonstration.","PeriodicalId":88813,"journal":{"name":"Advances in database technology : proceedings. International Conference on Extending Database Technology","volume":"6 1","pages":"799-802"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91288051","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Pushing Edge Computing one Step Further: Resilient and Privacy-Preserving Processing on Personal Devices
Ludovic Javet, N. Anciaux, Luc Bouganim, Léo Lamoureux, P. Pucheral
{"title":"Pushing Edge Computing one Step Further: Resilient and Privacy-Preserving Processing on Personal Devices","authors":"Ludovic Javet, N. Anciaux, Luc Bouganim, Léo Lamoureux, P. Pucheral","doi":"10.48786/edbt.2023.77","DOIUrl":"https://doi.org/10.48786/edbt.2023.77","url":null,"abstract":"Can we push Edge computing one step further? This demonstration paper proposes an answer to this question by leveraging the generalization of Trusted Execution Environments at the very edge of the network to enable resilient and privacy-preserving computation on personal devices. Based on preliminary published results, we show that this can drastically change the way distributed processing over personal data is conceived and achieved. The platform presented here demonstrates the pertinence of the approach through execution scenarios integrating heterogeneous secure personal devices.","PeriodicalId":88813,"journal":{"name":"Advances in database technology : proceedings. International Conference on Extending Database Technology","volume":"46 1","pages":"835-838"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90898008","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
REQUIRED: A Tool to Relax Queries through Relaxed Functional Dependencies
Loredana Caruccio, Stefano Cirillo, V. Deufemia, G. Polese, R. Stanzione
{"title":"REQUIRED: A Tool to Relax Queries through Relaxed Functional Dependencies","authors":"Loredana Caruccio, Stefano Cirillo, V. Deufemia, G. Polese, R. Stanzione","doi":"10.48786/edbt.2023.74","DOIUrl":"https://doi.org/10.48786/edbt.2023.74","url":null,"abstract":"Query relaxation aims to relax the query constraints in order to derive some approximate results when the answer set is small. In this demo paper, we present REQUIRED, an automatized, portable, and scalable query relaxation tool leveraging metadata learned from an input dataset. The intuition is to use relationships underlying attribute values to derive a new query whose approximate results still meet the user’s expectations. In particular, REQUIRED exploits relaxed functional dependencies to modify the original query in two different ways: (𝑖) relaxing some query conditions by replacing the equality constraints with ranges and/or collections of admissible values, and (𝑖𝑖) rewriting the original query by replacing some or all the attributes involved in the conditions of the query with attributes related to them. Our demonstration scenarios show that REQUIRED is effective in properly relaxing queries according to the considered strategy.","PeriodicalId":88813,"journal":{"name":"Advances in database technology : proceedings. International Conference on Extending Database Technology","volume":"10 1","pages":"823-826"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86751714","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Efficient Multi-Model Management
Nils Strassenburg, Dominic Kupfer, J. Kowal, T. Rabl
{"title":"Efficient Multi-Model Management","authors":"Nils Strassenburg, Dominic Kupfer, J. Kowal, T. Rabl","doi":"10.48786/edbt.2023.37","DOIUrl":"https://doi.org/10.48786/edbt.2023.37","url":null,"abstract":"Deep learning models are deployed in an increasing number of industrial domains, such as retail and automotive applications. An instance of a model typically performs one specific task, which is why larger software systems use multiple models in parallel. Given that all models in production software have to be managed, this leads to the problem of managing sets of related models, i.e., multi-model management. Existing approaches perform poorly on this task because they are optimized for saving single large models but not for simultaneously saving a set of related models. In this paper, we explore the space of multi-model management by presenting three optimized approaches: (1) A baseline approach that saves full model representations and minimizes the amount of saved metadata. (2) An update approach that reduces the storage consumption compared to the baseline by saving parameter updates instead of full models. (3) A provenance approach that saves model provenance data instead of model parameters. We evaluate the approaches for the multi-model management use cases of managing car battery cell models and image classification models. Our results show that the baseline outperforms existing approaches for save and recover times by more than an order of magnitude and that more sophisticated approaches reduce the storage consumption by up to 99%.","PeriodicalId":88813,"journal":{"name":"Advances in database technology : proceedings. International Conference on Extending Database Technology","volume":"77 1","pages":"457-463"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86764458","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1