{"title":"Session details: Research session 19: semi-structured data management","authors":"J. Shanmugasundaram","doi":"10.1145/3257467","DOIUrl":"https://doi.org/10.1145/3257467","url":null,"abstract":"","PeriodicalId":344093,"journal":{"name":"Proceedings of the 2009 ACM SIGMOD International Conference on Management of data","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129888011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enabling enterprise mashups over unstructured text feeds with InfoSphere MashupHub and SystemT","authors":"David E. Simmen, Frederick Reiss, Yunyao Li, Suresh Thalamati","doi":"10.1145/1559845.1559999","DOIUrl":"https://doi.org/10.1145/1559845.1559999","url":null,"abstract":"Enterprise mashup scenarios often involve feeds derived from data created primarily for eye consumption, such as email, news, calendars, blogs, and web feeds. These data sources can test the capabilities of current data mashup products, as the attributes needed to perform join, aggregation, and other operations are often buried within unstructured feed text. Information extraction technology is a key enabler in such scenarios, using annotators to convert unstructured text into structured information that can facilitate mashup operations. Our demo presents the integration of SystemT, an information extraction system from IBM Research, with IBM's InfoSphere MashupHub. We show how to build domain-specific annotators with SystemT's declarative rule language, AQL, and how to use these annotators to combine structured and unstructured information in an enterprise mashup.","PeriodicalId":344093,"journal":{"name":"Proceedings of the 2009 ACM SIGMOD International Conference on Management of data","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125301835","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PRIMA: archiving and querying historical data with evolving schemas","authors":"H. J. Moon, C. Curino, MyungWon Ham, C. Zaniolo","doi":"10.1145/1559845.1559970","DOIUrl":"https://doi.org/10.1145/1559845.1559970","url":null,"abstract":"Schema evolution poses serious challenges in historical data management. Traditionally, historical data have been archived either by (i) migrating them into the current schema version, which is well understood by users but compromises archival quality, or (ii) maintaining them under the schema version in which they were originally created, which preserves perfect archival quality but forces users to formulate queries against complex histories of evolving schemas. The PRIMA system achieves the best of both approaches by (i) archiving historical data under the schema version under which they were originally created, and (ii) letting users express temporal queries using the current schema version. Thus, in PRIMA, the system rewrites the queries to the (potentially many) pertinent versions of the evolving schema. Moreover, the system offers automatic documentation of the schema history, and allows users to pose temporal queries over the metadata history itself. The proposed demonstration highlights the system features using both a synthetic educational running example and real-life evolution histories (schemas and data), which include hundreds of schema versions from Wikipedia and Ensembl. The demonstration offers a thorough walk-through of the system features and a hands-on system testing phase, where the audience is invited to interact directly with the advanced query interface of PRIMA.","PeriodicalId":344093,"journal":{"name":"Proceedings of the 2009 ACM SIGMOD International Conference on Management of data","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127588284","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Simplifying XML schema: effortless handling of nondeterministic regular expressions","authors":"G. Bex, W. Gelade, W. Martens, F. Neven","doi":"10.1145/1559845.1559922","DOIUrl":"https://doi.org/10.1145/1559845.1559922","url":null,"abstract":"Whether beloved or despised, XML Schema is currently the only industrially accepted schema language for XML and is unlikely to become obsolete any time soon. Nevertheless, many nontransparent restrictions unnecessarily complicate the design of XSDs. For instance, complex content models in XML Schema are constrained by the infamous unique particle attribution (UPA) constraint. In formal language theoretic terms, this constraint restricts content models to deterministic regular expressions. As the latter constitute a semantic notion with no known simple syntactic characterization, it is very difficult for non-expert users to understand exactly when and why content models do or do not violate UPA. In this paper, we therefore investigate solutions that relieve users of the burden of UPA by automatically transforming nondeterministic expressions into concise deterministic ones defining the same language or constituting good approximations. The presented techniques facilitate XSD construction by reducing the design task closer to the complexity of the modeling task. In addition, our algorithms can serve as a plug-in for any model management tool that supports export to the XML Schema format.","PeriodicalId":344093,"journal":{"name":"Proceedings of the 2009 ACM SIGMOD International Conference on Management of data","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127642162","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Keyword search on structured and semi-structured data","authors":"Yi Chen, Wei Wang, Ziyang Liu, Xuemin Lin","doi":"10.1145/1559845.1559966","DOIUrl":"https://doi.org/10.1145/1559845.1559966","url":null,"abstract":"Empowering users to access databases using simple keywords can relieve the users from the steep learning curve of mastering a structured query language and understanding complex and possibly fast evolving data schemas. In this tutorial, we give an overview of the state-of-the-art techniques for supporting keyword search on structured and semi-structured data, including query result definition, ranking functions, result generation and top-k query processing, snippet generation, result clustering, query cleaning, performance optimization, and search quality evaluation. Various data models will be discussed, including relational data, XML data, graph-structured data, data streams, and workflows. We also discuss applications that are built upon keyword search, such as keyword based database selection, query generation, and analytical processing. Finally we identify the challenges and opportunities of future research to advance the field.","PeriodicalId":344093,"journal":{"name":"Proceedings of the 2009 ACM SIGMOD International Conference on Management of data","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130019947","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Search your memory! - an associative memory based desktop search system","authors":"Jidong Chen, Hang Guo, Wentao Wu, Chunxin Xie","doi":"10.1145/1559845.1559992","DOIUrl":"https://doi.org/10.1145/1559845.1559992","url":null,"abstract":"We present XSearcher, an associative memory based desktop search system, which exploits associations by creating semantic links among personal desktop resources from explicit and implicit user activities. With these links, associations among memory fragments can be built or rebuilt in a user's brain during a search. The personalized ranking scheme uses these links together with a user's personal preferences to rank results by both relevance and importance. XSearcher enhances traditional keyword based search systems since it is closer to the way that human associative memory works.","PeriodicalId":344093,"journal":{"name":"Proceedings of the 2009 ACM SIGMOD International Conference on Management of data","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130180239","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Vispedia: on-demand data integration for interactive visualization and exploration","authors":"Bryan Chan, Justin Talbot, Leslie Wu, Nathan Sakunkoo, Mike Cammarano, P. Hanrahan","doi":"10.1145/1559845.1560003","DOIUrl":"https://doi.org/10.1145/1559845.1560003","url":null,"abstract":"Wikipedia is an example of the large, collaborative, semi-structured data sets emerging on the Web. Typically, before these data sets can be used, they must be transformed into structured tables via data integration. We present Vispedia, a Web-based visualization system which incorporates data integration into an iterative, interactive data exploration and analysis process. This reduces the upfront cost of using heterogeneous data sets like Wikipedia. Vispedia is driven by a keyword-query-based integration interface implemented using a fast graph search. The search occurs interactively over DBpedia's semantic graph of Wikipedia, without depending on the existence of a structured ontology. This combination of data integration and visualization enables a broad class of non-expert users to more effectively use the semi-structured data available on the Web.","PeriodicalId":344093,"journal":{"name":"Proceedings of the 2009 ACM SIGMOD International Conference on Management of data","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127895086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient type-ahead search on relational data: a TASTIER approach","authors":"Guoliang Li, S. Ji, Chen Li, Jianhua Feng","doi":"10.1145/1559845.1559918","DOIUrl":"https://doi.org/10.1145/1559845.1559918","url":null,"abstract":"Existing keyword-search systems in relational databases require users to submit a complete query to compute answers. Often users feel \"left in the dark\" when they have limited knowledge about the data, and have to use a try-and-see approach for modifying queries and finding answers. In this paper we propose a novel approach to keyword search in the relational world, called Tastier. A Tastier system can bring instant gratification to users by supporting type-ahead search, which finds answers \"on the fly\" as the user types in query keywords. A main challenge is how to achieve a high interactive speed for large amounts of data in multiple tables, so that a query can be answered efficiently within milliseconds. We propose efficient index structures and algorithms for finding relevant answers on-the-fly by joining tuples in the database. We devise a partition-based method to improve query performance by grouping highly relevant tuples and pruning irrelevant tuples efficiently. We also develop a technique to answer a query efficiently by predicting the highly relevant complete queries for the user. We have conducted a thorough experimental evaluation of the proposed techniques on real data sets to demonstrate the efficiency and practicality of this new search paradigm.","PeriodicalId":344093,"journal":{"name":"Proceedings of the 2009 ACM SIGMOD International Conference on Management of data","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121182796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ROX: run-time optimization of XQueries","authors":"Riham Abdel Kader, P. Boncz, S. Manegold, M. van Keulen","doi":"10.1145/1559845.1559910","DOIUrl":"https://doi.org/10.1145/1559845.1559910","url":null,"abstract":"Optimization of complex XQueries combining many XPath steps and joins is currently hindered by the absence of good cardinality estimation and cost models for XQuery. Additionally, the state of the art of even relational query optimization still struggles to cope with cost model estimation errors that increase with plan size, as well as with the effect of correlated joins and selections. In this research, we propose to radically depart from the traditional path of separating the query compilation and query execution phases, by having the optimizer execute, materialize partial results, and use sampling-based estimation techniques to observe the characteristics of intermediates. The proposed technique takes as input a join graph where the edges are either equi-joins or XPath steps, and the execution environment provides value- and structural-join algorithms, as well as structural and value-based indices. While run-time optimization with sampling removes many of the vulnerabilities of classical optimizers, it brings its own challenges in keeping resource usage under control, both for the materialization of intermediates and for the cost of plan exploration using sampling. Our approach deals with these issues by limiting the run-time search space to so-called \"zero-investment\" algorithms, for which sampling can be guaranteed to be strictly linear in sample size. All operators and XML value indices used by ROX for sampling have the zero-investment property. We perform extensive experimental evaluation on large XML datasets, showing that our run-time query optimizer finds good query plans in a robust fashion and has limited run-time overhead.","PeriodicalId":344093,"journal":{"name":"Proceedings of the 2009 ACM SIGMOD International Conference on Management of data","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116393487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Taming the storage dragon: the adventures of hoTMaN","authors":"Shahram Ghandeharizadeh, A. Goodney, Chetan Sharma, Chris Bissell, Felipe Cariño, Naveen Nannapaneni, Alex Wergeles, Aber Whitcomb","doi":"10.1145/1559845.1559949","DOIUrl":"https://doi.org/10.1145/1559845.1559949","url":null,"abstract":"HoTMaN (HoT-standby MaNager) is a joint project between MySpace and the USC Database Laboratory to design and develop a tool that ensures 24x7 uptime and eases the administration of terabytes of storage sitting underneath hundreds of database servers. The HoTMaN tool's innovation and uniqueness is that it can, with a few clicks, perform operational tasks that otherwise require hundreds of keyboard strokes by \"trusted trained\" experts. With HoTMaN, MySpace can within minutes migrate the relational database(s) of a failed server to a hot standby. A process that could take over an hour and had a high potential for human error is now performed reliably. A database internal to HoTMaN captures all virtual disk, volume, and file configurations associated with each SQL Server and the candidate hot-standby servers to which SQL Server processing could be migrated. HoTMaN is deployed in production, and its current operational benefits include: (i) enhanced availability of data, and (ii) planned maintenance and patching.","PeriodicalId":344093,"journal":{"name":"Proceedings of the 2009 ACM SIGMOD International Conference on Management of data","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122565703","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}