Proceedings of the 21st International Database Engineering & Applications Symposium: Latest Papers, Page 2

Secure Range Query Processing over Untrustworthy Cloud Services

Proceedings of the 21st International Database Engineering & Applications Symposium Pub Date : 2017-07-12 DOI: 10.1145/3105831.3105872

T. Tzouramanis

Citations: 2

Evaluating SQL-on-Hadoop for Big Data Warehousing on Not-So-Good Hardware

Proceedings of the 21st International Database Engineering & Applications Symposium Pub Date : 2017-07-12 DOI: 10.1145/3105831.3105842

M. Y. Santos, Carlos A. Costa, João Galvão, Carina Andrade, Bruno Martinho, F. V. Lima, Eduarda Costa

Abstract: Big Data is currently conceptualized as data whose volume, variety or velocity impose significant difficulties on traditional techniques and technologies. Big Data Warehousing is emerging as a new concept for Big Data analytics. In this context, SQL-on-Hadoop systems have gained prominence, providing Structured Query Language (SQL) interfaces and interactive queries on Hadoop. A benchmark based on a denormalized version of TPC-H is used to compare the performance of Hive on Tez, Spark, Presto and Drill. Key contributions of this work include: a direct comparison of a broad set of technologies; unlike previous scientific works, the SQL-on-Hadoop systems were connected to Hive tables instead of raw files; and an understanding of how these systems behave in scenarios with ever-increasing requirements but not-so-good hardware. Besides these benchmark results, the paper also reports findings regarding architecture and infrastructure in SQL-on-Hadoop for Big Data Warehousing, helping practitioners and fostering future research.

Citations: 19

Comparison of Dynamic Itemset Mining Algorithms for Multiple Support Thresholds

Proceedings of the 21st International Database Engineering & Applications Symposium Pub Date : 2017-07-12 DOI: 10.1145/3105831.3105846

Nourhan Abuzayed, B. Ergenç

Citations: 3

Effective Big Data Visualization

Proceedings of the 21st International Database Engineering & Applications Symposium Pub Date : 2017-07-12 DOI: 10.1145/3105831.3105857

Murali Mani, Si Fei

Citations: 10

How Project-management-tools are used in Agile Practice: Benefits, Drawbacks and Potentials

Proceedings of the 21st International Database Engineering & Applications Symposium Pub Date : 2017-07-12 DOI: 10.1145/3105831.3105865

Florian Raith, Ingo Richter, Robert Lindermeier

Abstract: The use of agile methodologies is quite common in distributed software development. To facilitate the sharing of project-relevant information across distributed agile teams, project-management-tools (e.g. Jira, YouTrack) are commonly used. Literature and practice show drawbacks in teamwork and communication when those mainly browser-based tools are used instead of traditional paper-based media. Improvements in this area are possible if we acquire knowledge of how these tools are used in practice and what problems or challenges need to be addressed in the future. Therefore, we conducted an exploratory semi-structured interview study investigating how common project-management-tools are used in the particular phases and meetings of an agile development process. We interviewed five experienced agile coaches who guided projects with different constellations regarding the number of involved agile teams and their locations. As a first result, we summarized and structured our findings according to all steps of an agile process in the form of textual descriptions and diagrams, focusing on the combined usage of digital project-management-tools and traditional paper-based media. As a second outcome, we listed benefits, drawbacks and potential improvements of digital project-management-tools in agile software development from the interviewees' points of view.

Citations: 10

DiPCoDing: A Differentially Private Approach for Correlated Data with Clustering

Proceedings of the 21st International Database Engineering & Applications Symposium Pub Date : 2017-07-12 DOI: 10.1145/3105831.3105861

André L. C. Mendonça, Felipe T. Brito, L. S. Linhares, Javam C. Machado

Abstract: Differential privacy is a model that gives strong privacy guarantees. It was designed to make it difficult to distinguish individuals' records in statistical databases while maximizing data utility. Differential privacy approaches usually assume that database records are sampled independently, i.e., each record of the database is independent of the rest. However, this assumption does not always hold in real-world applications. In this paper we propose DiPCoDing, a novel approach to calculate the correlation between records in statistical databases using clustering. For this purpose, we have considered Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and the Gaussian Mixture Model (GMM). Our method aims to group similar records, which are more likely to be correlated, to reduce the sensitivity of differential privacy and consequently the amount of noise added to the query answer, increasing data utility while providing privacy for correlated data. The experimental results of our approach showed that relative errors and noisy answers are significantly lower than those from existing works.

Citations: 4
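
The DiPCoDing abstract links lower sensitivity to less noise in the query answer. The sketch below illustrates that relationship with the standard Laplace mechanism, whose noise scale is sensitivity divided by epsilon. It is not the authors' method: the clustering step is omitted, the abstract does not say which noise mechanism the paper uses, and the sensitivity values (10 and 2) and all function names are made up for illustration.

```python
import random


def laplace_scale(sensitivity: float, epsilon: float) -> float:
    """Scale b of the Laplace noise: b = sensitivity / epsilon."""
    if sensitivity < 0 or epsilon <= 0:
        raise ValueError("sensitivity must be >= 0 and epsilon must be > 0")
    return sensitivity / epsilon


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one Laplace(0, scale) sample.

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace distribution with that scale.
    """
    if scale == 0:
        return 0.0
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_answer(true_answer: float, sensitivity: float,
                 epsilon: float, rng: random.Random) -> float:
    """Return the query answer perturbed with Laplace noise."""
    return true_answer + laplace_noise(laplace_scale(sensitivity, epsilon), rng)


def mean_abs_error(sensitivity: float, epsilon: float,
                   trials: int = 20000, seed: int = 0) -> float:
    """Empirical mean absolute noise; its expected value equals the scale."""
    rng = random.Random(seed)
    scale = laplace_scale(sensitivity, epsilon)
    return sum(abs(laplace_noise(scale, rng)) for _ in range(trials)) / trials


if __name__ == "__main__":
    # Hypothetical sensitivities: 10 when correlation is bounded pessimistically,
    # 2 when it is bounded more tightly (illustrative values only).
    print(laplace_scale(10, 1.0), laplace_scale(2, 1.0))  # 10.0 2.0
    print(mean_abs_error(10, 1.0), mean_abs_error(2, 1.0))
```

With the same epsilon, the expected absolute noise shrinks in proportion to the sensitivity, which is the utility gain the abstract describes.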

Computing a Deterministic Semantics for P2P Deductive Databases

Proceedings of the 21st International Database Engineering & Applications Symposium Pub Date : 2017-07-12 DOI: 10.1145/3105831.3105837

Luciano Caroprese, E. Zumpano

Abstract: This paper proposes a logic-based framework for data integration and query answering for deductive databases in a P2P environment. It is based on a special interpretation of mapping rules that leads to a declarative semantics for P2P systems defined in terms of preferred weak models. Under this semantics, only facts that do not make the local databases inconsistent can be imported, and the preferred weak models are the consistent scenarios in which peers import, by means of mapping rules, maximal sets of facts not violating (directly or indirectly) integrity constraints. The preferred weak models can be computed by means of a rewriting technique that models a P2P system as a single logic program whose stable models correspond to its preferred weak models. In the general case a P2P system may admit many preferred weak models, and it has been shown that the complexity of their computation is prohibitive. Therefore, the paper looks for a more pragmatic solution, assigning to a P2P system a new and more suitable semantics: the Well Founded Model Semantics. It yields a deterministic model that can be computed in polynomial time. This model is a (partial) stable model obtained by evaluating, with a three-valued semantics, the normal version of the rewriting of the P2P system. Finally, a distributed algorithm for the computation of the well founded model is proposed.

Citations: 4
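
The abstract above relies on the well founded model being deterministic, three-valued and computable in polynomial time. As a rough illustration, the sketch below computes the well founded model of a small ground normal logic program with the standard alternating-fixpoint construction. It does not reproduce the paper's P2P rewriting or its distributed algorithm; the rule encoding, the function names and the example program are assumptions made for this sketch.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# A ground rule: (head, positive body atoms, negated body atoms).
Rule = Tuple[str, FrozenSet[str], FrozenSet[str]]


def rule(head: str, pos: Iterable[str] = (), neg: Iterable[str] = ()) -> Rule:
    """Build the rule `head :- pos..., not neg...`."""
    return (head, frozenset(pos), frozenset(neg))


def gamma(program: List[Rule], assumed_true: Set[str]) -> Set[str]:
    """Least model of the reduct of `program` with respect to `assumed_true`.

    Rules containing `not a` with a in `assumed_true` are dropped; the
    remaining negative literals hold and are removed from the rule bodies.
    """
    reduct = [(h, pos) for h, pos, neg in program if not (neg & assumed_true)]
    derived: Set[str] = set()
    changed = True
    while changed:  # naive bottom-up evaluation of the positive reduct
        changed = False
        for head, pos in reduct:
            if head not in derived and pos <= derived:
                derived.add(head)
                changed = True
    return derived


def well_founded_model(program: List[Rule]) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return the (true, false, undefined) atoms of the well founded model."""
    atoms: Set[str] = set()
    for head, pos, neg in program:
        atoms |= {head} | pos | neg
    true: Set[str] = set()
    while True:
        possible = gamma(program, true)      # overestimate of the true atoms
        new_true = gamma(program, possible)  # underestimate of the true atoms
        if new_true == true:
            break
        true = new_true
    return true, atoms - possible, possible - true


if __name__ == "__main__":
    # p :- not q.   q :- not p.   r :- not s.
    program = [rule("p", neg=["q"]), rule("q", neg=["p"]), rule("r", neg=["s"])]
    t, f, u = well_founded_model(program)
    print(f"true={sorted(t)} false={sorted(f)} undefined={sorted(u)}")
    # true=['r'] false=['s'] undefined=['p', 'q']
```

The set of true atoms only grows between rounds, so the outer loop runs at most once per atom plus one final check, and each call to gamma is polynomial in the program size. The mutually dependent atoms p and q stay undefined instead of splitting the program into several alternative models.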

Constrained Hierarchical Clustering for News Events

Proceedings of the 21st International Database Engineering & Applications Symposium Pub Date : 2017-07-12 DOI: 10.1145/3105831.3105859

R. Florence, Bruno M. Nogueira, R. Marcacini

Abstract: Knowledge discovery from web news events has received great attention in recent years. In practice, this knowledge is a digital representation (virtual world) of various phenomena that occur in our physical world. Hierarchical clustering algorithms are used to organize related events into groups and subgroups according to some similarity measure. The main motivation for this organization is the hypothesis that if the user is interested in a specific event of a certain cluster, then the user may also be interested in other related events of the same cluster. However, existing event clustering methods do not effectively use the different types of information about events, such as temporal information, geographical data, and names of people and organizations. In this paper, we propose the COH-KMeans algorithm (Constrained Hierarchical K-Means), which obtains a hierarchical clustering structure considering certain conditions imposed by the users, for example, events of similar content that occurred in nearby geographic locations or within a predefined time window. A statistical analysis of the experimental results reveals that the constraints incorporated by COH-KMeans yield higher-quality clusters than a state-of-the-art unsupervised hierarchical clustering method. Moreover, we present our tool for exploratory analysis of events and discuss how event clustering can be used to support the decision-making process from the perspective of a Data Analytics System.

Citations: 8

BDgen: A Universal Big Data Generator

Proceedings of the 21st International Database Engineering & Applications Symposium Pub Date : 2017-07-12 DOI: 10.1145/3105831.3105847

Tomás Faltín, Michal Hanzeli, Vojtech Sípek, Jan Skvaril, Dusan Varis, Irena Holubová Mlýnková

Citations: 0

Using a Model-driven Approach in Building a Provenance Framework for Tracking Policy-making Processes in Smart Cities

Proceedings of the 21st International Database Engineering & Applications Symposium Pub Date : 2017-07-12 DOI: 10.1145/3105831.3105849

Barkha Javed, Z. Khan, R. McClatchey

Citations: 2