Proceedings. International Database Engineering and Applications Symposium: Latest Articles, Page 10

Using an inference mechanism for helping the data integration

Proceedings. International Database Engineering and Applications Symposium Pub Date : 2011-09-21 DOI: 10.1145/2076623.2076660

V. Pequeno, J. Pires

Citations: 0

Scrubbing query results from probabilistic databases

Proceedings. International Database Engineering and Applications Symposium Pub Date : 2011-09-21 DOI: 10.1145/2076623.2076634

Jianwen Chen, Ling Feng, Wenwei Xue

Abstract: Queries over probabilistic databases lead to probabilistic results. As the process of arriving at these results is based on underlying data probabilities, we believe involving a user in the loop of query processing and leveraging the user's personal knowledge to deal with uncertain data, will enable the system to scrub (correct) and tailor its probabilistic query results towards a better quality from the perspective of the specific user. In this paper, we propose to open the black box of a probabilistic database query engine, and explain to the user how the engine comes up with the probabilistic query result as well as which uncertain tuples in the database the result is derived from. In this way, the user based on his/her knowledge about uncertain information can not only decide how much confidence to be placed on the query engine, but also help clarify some uncertain information so that the query engine can re-generate an improved query result. Two particular issues associated with such a probabilistic database query framework are addressed: (i) how to interact with a user for answer explanation and uncertainty clarification without bringing much burden to the user, and (ii) how to scrub/correct the query result without incurring much computation overhead to the query engine. Our performance study demonstrates the accuracy effectiveness and computational efficiency achieved by the proposed framework.

Pages: 79-87
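
To make the idea in the abstract above concrete, here is a minimal Python sketch of how a result probability might be explained by its lineage and then re-computed after a user clarifies one uncertain tuple. It assumes a tuple-independent model in which a result exists if at least one tuple in its lineage exists; the function names, tuple identifiers, and probabilities are all hypothetical and are not taken from the paper.

```python
from math import prod


def result_probability(lineage, tuple_probs):
    """Probability that at least one tuple in the lineage exists,
    assuming the tuples are independent (an assumption of this sketch)."""
    return 1.0 - prod(1.0 - tuple_probs[t] for t in lineage)


def clarify(tuple_probs, tuple_id, exists):
    """Return a copy of the probabilities with one tuple resolved by the user."""
    updated = dict(tuple_probs)
    updated[tuple_id] = 1.0 if exists else 0.0
    return updated


# Hypothetical uncertain tuples and the lineage of one query answer.
tuple_probs = {"t1": 0.5, "t2": 0.4, "t3": 0.9}
lineage = ["t1", "t2"]

# Probability before any user feedback.
before = result_probability(lineage, tuple_probs)

# The user states that t1 is wrong; the answer now rests on t2 alone.
after = result_probability(lineage, clarify(tuple_probs, "t1", exists=False))
```

Only the answers whose lineage contains the clarified tuple need to be re-computed, which is one plausible way to keep the correction step cheap.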

Citations: 3

FB-tree: a B+-tree for flash-based SSDs

Proceedings. International Database Engineering and Applications Symposium Pub Date : 2011-09-21 DOI: 10.1145/2076623.2076629

Martin V. Jørgensen, René Bech Rasmussen, Simonas Šaltenis, Carsten Schjønning

Abstract: Due to their many advantages, flash-based SSDs (Solid-State Drives) have become a mainstream alternative to magnetic disks for database servers. Nevertheless, database systems, designed and optimized for magnetic disks, still do not fully exploit all the benefits of the new technology. We propose the FB-tree: a combination of an adapted B+-tree, a storage manager, and a buffer manager, all optimized for modern SSDs. Together the techniques enable writing to SSDs in relatively large blocks, thus achieving greater overall throughput. This is achieved by the out-of-place writing, whereby every time a modified index node is written, it is written to a new address, clustered with some other nodes that are written together. While this constantly frees index nodes, the FB-tree does not introduce any garbage-collection overhead, instead relying on naturally occurring free-space segments of sufficient size. As a consequence, the FB-tree outperforms a regular B+-tree in all scenarios tested. For instance, the throughput of a random workload of 75% updates increases by a factor of three using only two times the space of the B+-tree.

Pages: 34-42
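
The out-of-place writing described in the abstract above can be illustrated with a toy Python storage manager. This is a sketch under stated assumptions, not the FB-tree itself: the class name, the `block_size` parameter, and the in-memory list standing in for SSD addresses are all hypothetical, and the sketch never reuses freed space, whereas the paper relies on naturally occurring free-space segments.

```python
class OutOfPlaceStore:
    """Toy storage manager: modified nodes are buffered and written together
    to fresh addresses; a mapping table tracks each node's latest address."""

    def __init__(self, block_size=4):
        self.block_size = block_size  # nodes written per large block (hypothetical)
        self.log = []                 # simulated SSD address space, append-only
        self.location = {}            # node id -> current address in self.log
        self.dirty = {}               # buffered modified nodes awaiting a write

    def write_node(self, node_id, payload):
        """Buffer a modified node; flush once a full block has accumulated."""
        self.dirty[node_id] = payload
        if len(self.dirty) >= self.block_size:
            self.flush()

    def flush(self):
        """Write all buffered nodes contiguously to new addresses.
        The previous copy of each node is left behind as free space."""
        for node_id, payload in self.dirty.items():
            self.location[node_id] = len(self.log)
            self.log.append(payload)
        self.dirty.clear()

    def read_node(self, node_id):
        """Return the newest version, whether still buffered or already written."""
        if node_id in self.dirty:
            return self.dirty[node_id]
        return self.log[self.location[node_id]]
```

Because a node's address changes on every write, readers always go through the mapping table rather than a fixed page address.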

Citations: 21

A predictable storage model for scalable parallel DW

Proceedings. International Database Engineering and Applications Symposium Pub Date : 2011-09-21 DOI: 10.1145/2076623.2076628

J. Costa, J. Cecílio, P. Martins, P. Furtado

Citations: 5

A family of graph-theory-driven algorithms for managing complex probabilistic graph data efficiently

Proceedings. International Database Engineering and Applications Symposium Pub Date : 2011-09-21 DOI: 10.1145/2076623.2076657

A. Cuzzocrea, Paolo Serafino

Citations: 1

Aggregates and priorities in P2P data management systems

Proceedings. International Database Engineering and Applications Symposium Pub Date : 2011-09-21 DOI: 10.1145/2076623.2076625

Luciano Caroprese, E. Zumpano

Abstract: This paper investigates the data exchange problem among distributed independent sources. It is based on previous works of the authors [11, 12, 14] in which a declarative semantics for P2P systems has been presented and a mechanism to set different degrees of reliability for neighbor peers has been provided. The basic semantics for P2P systems defines the concept of Maximal Weak Models (in [11, 12, 14] these models have been called Preferred Weak Models. In this paper we rename them and use the term Preferred for the subclass of Weak Model defined here) that represent scenarios in which maximal sets of facts not violating integrity constraints are imported into the peers [11, 12]. Previous priority mechanism defined in [14] is rigid in the sense that the preference between conflicting sets of atoms that a peer can import only depends on the priorities associated to the source peers at design time. In this paper we present a different framework that allows to select among different scenarios looking at the properties of data provided by the peers. The framework presented here allows to model concepts like "in the case of conflicting information, it is preferable to import data from the neighbor peer that can provide the maximum number of tuples" or "in the case of conflicting information, it is preferable to import data from the neighbor peer such that the sum of the values of an attribute is minimum" without selecting a-priori preferred peers. To enforce this preference mechanism we enrich the previous P2P framework with aggregate functions and present significant examples showing the flexibility of the new framework.

Pages: 1-7
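
The two preferences quoted in the abstract above can be illustrated with a short Python sketch. The paper expresses them declaratively through aggregate functions over weak models; the imperative code below is only an illustration of the selection criteria, and the peer names, rows, and the `price` attribute are hypothetical.

```python
def prefer_max_tuples(candidates):
    """Pick the neighbor peer whose candidate import set has the most tuples."""
    return max(candidates, key=lambda peer: len(candidates[peer]))


def prefer_min_sum(candidates, attribute):
    """Pick the neighbor peer whose candidate set minimizes the sum of an attribute."""
    return min(
        candidates,
        key=lambda peer: sum(row[attribute] for row in candidates[peer]),
    )


# Hypothetical conflicting import sets offered by two neighbor peers.
candidates = {
    "peer_a": [{"item": "x", "price": 10}, {"item": "y", "price": 30}],
    "peer_b": [{"item": "x", "price": 25}],
}

most_tuples = prefer_max_tuples(candidates)        # peer_a offers 2 tuples
cheapest = prefer_min_sum(candidates, "price")     # peer_b sums to 25
```

Neither function fixes a preferred peer in advance: the choice depends only on the data each peer provides, which is the flexibility the abstract contrasts with design-time priorities.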

Citations: 17

Online outlier detection for data streams

Proceedings. International Database Engineering and Applications Symposium Pub Date : 2011-09-21 DOI: 10.1145/2076623.2076635

Md. Shiblee Sadik, L. Gruenwald

Abstract: Outlier detection is a well established area of statistics but most of the existing outlier detection techniques are designed for applications where the entire dataset is available for random access. A typical outlier detection technique constructs a standard data distribution or model and identifies the deviated data points from the model as outliers. Evidently these techniques are not suitable for online data streams where the entire dataset, due to its unbounded volume, is not available for random access. Moreover, the data distribution in data streams change over time which challenges the existing outlier detection techniques that assume a constant standard data distribution for the entire dataset. In addition, data streams are characterized by uncertainty which imposes further complexity. In this paper we propose an adaptive, online outlier detection technique addressing the aforementioned characteristics of data streams, called Adaptive Outlier Detection for Data Streams (A-ODDS), which identifies outliers with respect to all the received data points as well as temporally close data points. The temporally close data points are selected based on time and change of data distribution. We also present an efficient and online implementation of the technique and a performance study showing the superiority of A-ODDS over existing techniques in terms of accuracy and execution time on a real-life dataset collected from meteorological applications.

Pages: 88-96
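
As a rough illustration of judging a point against both all received data and temporally close data, the Python sketch below combines a running global mean and deviation with a fixed-size sliding window. It is a simplified stand-in and not the A-ODDS algorithm: the deviation test, the `window` and `threshold` parameters, and the fixed window size (the paper selects close points by time and distribution change) are assumptions of this sketch.

```python
from collections import deque
from math import sqrt


class StreamOutlierDetector:
    """Flags a value only if it deviates from both the global history
    and the most recent window of values."""

    def __init__(self, window=50, threshold=3.0):
        self.window = deque(maxlen=window)  # temporally close points
        self.threshold = threshold          # allowed deviations from the mean
        self.n = 0                          # count of all points seen
        self.mean = 0.0                     # running global mean (Welford)
        self.m2 = 0.0                       # running sum of squared deviations

    def _is_far(self, x, mean, std):
        return std > 0 and abs(x - mean) > self.threshold * std

    def update(self, x):
        """Test x against the statistics seen so far, then absorb it."""
        global_std = sqrt(self.m2 / self.n) if self.n > 1 else 0.0
        far_global = self._is_far(x, self.mean, global_std)

        if len(self.window) > 1:
            w_mean = sum(self.window) / len(self.window)
            w_std = sqrt(sum((v - w_mean) ** 2 for v in self.window) / len(self.window))
        else:
            w_mean, w_std = 0.0, 0.0
        far_recent = self._is_far(x, w_mean, w_std)

        # Welford's single-pass update keeps memory constant for the global view.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        self.window.append(x)

        return far_global and far_recent
```

Neither view requires random access to the full stream: the global statistics are updated incrementally and the window holds a bounded number of recent points.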

Citations: 21

Boosting tuple propagation in multi-relational classification

Proceedings. International Database Engineering and Applications Symposium Pub Date : 2011-09-21 DOI: 10.1145/2076623.2076637

Lucantonio Ghionna, G. Greco

Citations: 1

On the expressiveness of generalization rules for XPath query relaxation

Proceedings. International Database Engineering and Applications Symposium Pub Date : 2010-08-16 DOI: 10.1145/1866480.1866504

Bettina Fazzinga, S. Flesca, F. Furfaro

Abstract: The problem of defining suitable rewriting mechanisms for XML query languages to support approximate query answering has received a great deal of attention in the last few years, owing to its practical impact in several scenarios. For instance, in the typical scenario of distributed XML data without a shared data scheme, accomplishing the extraction of the information of interest often requires queries to be rewritten into relaxed ones, in order to adapt them to the schemes adopted in the different sources. In this paper, rewriting systems for a wide fragment of XPath (which is the core of several languages for manipulating XML data) are investigated, and a general form of rewriting rules (namely, generalization rules) is considered, which subsumes the forms adopted in the most well-known rewriting systems. Specifically, the expressiveness of rewriting systems based on this form of rules is characterized: on the one hand, it is shown that rewriting systems based on generalization rules are incomplete w.r.t. containment (thus, traditional rewriting mechanisms do not suffice to rewrite a query into any more general one). On the other hand, it is also shown that the expressiveness of state-of-the-art rewriting systems can be improved by employing rewriting primitives as simple as those traditionally used, which enable any query to be relaxed into every more general one related to it via homomorphism.

Pages: 157-168
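
The abstract above does not list the individual rules, so the Python sketch below shows only one commonly used kind of relaxation as an example: turning a child step (`/`) into a descendant step (`//`), which yields a query that contains the original. The function name is hypothetical, and the sketch assumes a simple absolute path made of child steps only, with no predicates, wildcards, or existing `//` steps.

```python
def axis_relaxations(path):
    """Return every query obtained by relaxing exactly one child step ('/')
    of a simple path into a descendant step ('//')."""
    steps = path.strip("/").split("/")
    relaxed = []
    for i in range(len(steps)):
        separators = ["/"] * len(steps)
        separators[i] = "//"  # relax only the i-th step
        relaxed.append("".join(sep + step for sep, step in zip(separators, steps)))
    return relaxed


relaxed_queries = axis_relaxations("/library/book/title")
```

Every node matched by the original path is also matched by each relaxed path, because a child is always a descendant; applying the function repeatedly produces progressively more general queries.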

Citations: 14

An integrative approach to query optimization in native XML database management systems

Proceedings. International Database Engineering and Applications Symposium Pub Date : 2010-08-16 DOI: 10.1145/1866480.1866491

A. Weiner, T. Härder

Citations: 8