Collaborative data sharing via update exchange and provenance
G. Karvounarakis, Todd J. Green, Z. Ives, V. Tannen
ACM Transactions on Database Systems, published 2013-08-01 (Journal Article)
DOI: 10.1145/2500127 (https://doi.org/10.1145/2500127)
Citations: 45
Abstract
Recent work [Ives et al. 2005] proposed a new class of systems for supporting data sharing among scientific and other collaborations: this new collaborative data sharing system connects heterogeneous logical peers using a network of schema mappings. Each peer has a locally controlled and edited database instance, but wants to incorporate related data from other peers as well. To achieve this, every peer's data and updates propagate along the mappings to the other peers. However, this operation, termed update exchange, is filtered by trust conditions—expressing what data and sources a peer judges to be authoritative—which may cause a peer to reject another's updates. In order to support such filtering, updates carry provenance information.
This article develops methods for realizing such systems: we build upon techniques from data integration, data exchange, incremental view maintenance, and view update to propagate updates along mappings, both to derived and optionally to source instances. We incorporate a novel model for tracking data provenance, such that curators may filter updates based on trust conditions over this provenance. We implement our techniques in a layer above an off-the-shelf RDBMS, and we experimentally demonstrate the viability of these techniques in the Orchestra prototype system.
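The interplay of mappings, provenance, and trust conditions described above can be illustrated with a minimal sketch. It is not Orchestra's implementation: the names `Derivation`, `exchange`, and `accept`, the peer names, and the one-hop identity mappings are all assumptions made for illustration, and the provenance model is reduced to a set of alternative derivations per tuple. A peer accepts a tuple if at least one of its derivations satisfies that peer's trust condition.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Set, Tuple

Row = Tuple[str, ...]
Source = Tuple[str, Row]  # (peer name, row at that peer)


@dataclass(frozen=True)
class Derivation:
    """One way a tuple was produced (hypothetical, simplified provenance)."""
    mapping: str                 # name of the schema mapping that was applied
    sources: FrozenSet[Source]   # source tuples the mapping consumed


# A mapping is (source peer, target peer, row transformation).
Mapping = Tuple[str, str, Callable[[Row], Row]]


def exchange(
    base: Dict[str, Set[Row]], mappings: Dict[str, Mapping]
) -> Dict[Source, Set[Derivation]]:
    """Propagate each base row one hop along every mapping, recording provenance."""
    prov: Dict[Source, Set[Derivation]] = {}
    for name, (src_peer, dst_peer, transform) in mappings.items():
        for row in base.get(src_peer, set()):
            target = (dst_peer, transform(row))
            prov.setdefault(target, set()).add(
                Derivation(name, frozenset({(src_peer, row)}))
            )
    return prov


def accept(
    prov: Dict[Source, Set[Derivation]],
    peer: str,
    trust: Callable[[Derivation], bool],
) -> Set[Row]:
    """Keep the rows arriving at `peer` that have at least one trusted derivation."""
    return {
        row
        for (dst_peer, row), derivations in prov.items()
        if dst_peer == peer and any(trust(d) for d in derivations)
    }


# Assumed scenario: peers A and B both publish to peer C; C trusts only A.
base = {
    "A": {("g1", "kinase")},
    "B": {("g1", "kinase"), ("g2", "unknown")},
}
mappings = {
    "m_AC": ("A", "C", lambda r: r),
    "m_BC": ("B", "C", lambda r: r),
}


def trust_only_a(d: Derivation) -> bool:
    # Trust condition over provenance: every source tuple must come from A.
    return all(src_peer == "A" for src_peer, _ in d.sources)


prov = exchange(base, mappings)
accepted = accept(prov, "C", trust_only_a)
print(accepted)  # {('g1', 'kinase')}
```

The tuple `("g1", "kinase")` reaches C along both mappings, so it has two derivations and is accepted because one of them comes from A. The tuple `("g2", "unknown")` has only a derivation through B and is rejected. Keeping all alternative derivations, rather than a single source, is what lets the trust decision be made per peer after propagation.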
About the journal:
Heavily used in both academic and corporate R&D settings, ACM Transactions on Database Systems (TODS) is a key publication for computer scientists working in data abstraction, data modeling, and designing data management systems. Topics include storage and retrieval, transaction management, distributed and federated databases, semantics of data, intelligent databases, and operations and algorithms relating to these areas. In this rapidly changing field, TODS provides insights into the thoughts of the best minds in database R&D.