22nd International Conference on Data Engineering (ICDE'06): Latest Papers

How to Determine a Good Multi-Programming Level for External Scheduling
22nd International Conference on Data Engineering (ICDE'06) Pub Date : 2006-04-03 DOI: 10.1109/ICDE.2006.78
Bianca Schroeder, Mor Harchol-Balter, A. Iyengar, E. Nahum, A. Wierman
Abstract: Scheduling/prioritization of DBMS transactions is important for many applications that rely on database backends. A convenient way to achieve scheduling is to limit the number of transactions within the database, maintaining most of the transactions in an external queue, which can be ordered as desired by the application. While external scheduling has many advantages in that it doesn’t require changes to internal resources, it is also difficult to get right in that its performance depends critically on the particular multiprogramming limit used (the MPL), i.e. the number of transactions allowed into the database. If the MPL is too low, throughput will suffer, since not all DBMS resources will be utilized. On the other hand, if the MPL is too high, there is insufficient control on scheduling. The question of how to adjust the MPL to achieve both goals simultaneously is an open problem, not just for databases but in system design in general. Herein we study this problem in the context of transactional workloads, both via extensive experimentation and queueing-theoretic analysis. We find that the two most critical factors in adjusting the MPL are the number of resources that the workload utilizes and the variability of the transactions’ service demands. We develop a feedback-based controller, augmented by queueing-theoretic models, for automatically adjusting the MPL. Finally, we apply our methods to the specific problem of external prioritization of transactions. We find that external prioritization can be nearly as effective as internal prioritization, without any negative consequences, when the MPL is set appropriately.
Citations: 127
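To make the external-scheduling idea in the abstract above concrete, here is a minimal Python sketch: at most `mpl` transactions are inside the DBMS at once, and the rest wait in an external priority queue. The class name `ExternalScheduler`, its methods, and the "lower number means higher priority" convention are illustrative assumptions, not the authors' implementation. The paper's feedback controller for tuning the MPL is not modeled here.

```python
import heapq
import itertools


class ExternalScheduler:
    """Admit at most `mpl` transactions to the DBMS; queue the rest by priority.

    Hypothetical helper for illustration only.
    """

    def __init__(self, mpl):
        self.mpl = mpl                  # multiprogramming limit
        self.in_dbms = set()            # transactions currently inside the DBMS
        self._queue = []                # external queue of (priority, seq, txn)
        self._seq = itertools.count()   # tie-breaker: FIFO within one priority

    def submit(self, txn, priority):
        """Queue a transaction, then admit whatever fits under the MPL."""
        heapq.heappush(self._queue, (priority, next(self._seq), txn))
        return self._dispatch()

    def complete(self, txn):
        """Mark a transaction as finished and fill the freed slot."""
        self.in_dbms.remove(txn)
        return self._dispatch()

    def _dispatch(self):
        """Move queued transactions into the DBMS while slots are free."""
        admitted = []
        while self._queue and len(self.in_dbms) < self.mpl:
            _, _, txn = heapq.heappop(self._queue)
            self.in_dbms.add(txn)
            admitted.append(txn)
        return admitted
```

With `mpl=2`, a high-priority transaction that arrives while both slots are busy is admitted ahead of earlier low-priority arrivals as soon as a slot frees up. Raising `mpl` lets more transactions into the DBMS, where this external queue can no longer reorder them, which is the trade-off the abstract describes.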
Query Selection Techniques for Efficient Crawling of Structured Web Sources
22nd International Conference on Data Engineering (ICDE'06) Pub Date : 2006-04-03 DOI: 10.1109/ICDE.2006.124
Ping Wu, Ji-Rong Wen, Huan Liu, Wei-Ying Ma
Abstract: High-quality structured data from structured Web sources is invaluable for many applications. Hidden Web databases are not directly crawlable by Web search engines and are only accessible through Web query forms or via Web service interfaces. Recent research efforts have focused on understanding these Web query forms. A critical but still largely unresolved question is: how to efficiently acquire the structured information inside Web databases through iteratively issuing meaningful queries? In this paper we focus on the central issue of enabling efficient Web database crawling through query selection, i.e. how to select good queries to rapidly harvest data records from Web databases. We model each structured Web database as a distinct attribute-value graph. Under this theoretical framework, the database crawling problem is transformed into a graph traversal one that follows "relational" links. We show that finding an optimal query selection plan is equivalent to finding a Minimum Weighted Dominating Set of the corresponding database graph, a well-known NP-Complete problem. We propose a suite of query selection techniques aiming at optimizing the query harvest rate. Extensive experimental evaluations over real Web sources and simulations over controlled database servers validate the effectiveness of our techniques and provide insights for future efforts in this
Citations: 152
MONDRIAN: Annotating and Querying Databases through Colors and Blocks
22nd International Conference on Data Engineering (ICDE'06) Pub Date : 2006-04-03 DOI: 10.1109/ICDE.2006.102
Floris Geerts, Anastasios Kementsietsidis, D. Milano
Abstract: Annotations play a central role in the curation of scientific databases. Despite their importance, data formats and schemas are not designed to manage the increasing variety of annotations. Moreover, DBMSs often lack support for storing and querying annotations. Furthermore, annotations and data are only loosely coupled. This paper introduces an annotation-oriented data model for the manipulation and querying of both data and annotations. In particular, the model allows for the specification of annotations on sets of values and for effectively querying the information on their association. We use the concept of a block to represent an annotated set of values. Different colors applied to the blocks represent different annotations. We introduce a color query language for our model and prove it to be both complete (it can express all possible queries over the class of annotated databases) and minimal (all the algebra operators are primitive). We present MONDRIAN, a prototype implementation of our annotation mechanism, and we conduct experiments that investigate the set of parameters which influence the evaluation cost for color queries.
Citations: 109
cgmOLAP: Efficient Parallel Generation and Querying of Terabyte Size ROLAP Data Cubes
22nd International Conference on Data Engineering (ICDE'06) Pub Date : 2006-04-03 DOI: 10.1109/ICDE.2006.32
Ying Chen, A. Rau-Chaplin, F. Dehne, Todd Eavis, D. Green, E. Sithirasenan
Abstract: We present the cgmOLAP server, the first fully functional parallel OLAP system able to build data cubes at a rate of more than 1 Terabyte per hour. cgmOLAP incorporates a variety of novel approaches for the parallel computation of full cubes, partial cubes, and iceberg cubes as well as new parallel cube indexing schemes. The cgmOLAP system consists of an application interface, a parallel query engine, a parallel cube materialization engine, metadata and cost model repositories, and shared server components that provide uniform management of I/O, memory, communications, and disk resources.
Citations: 20
Approximately Processing Multi-granularity Aggregate Queries over Data Streams
22nd International Conference on Data Engineering (ICDE'06) Pub Date : 2006-04-03 DOI: 10.1109/ICDE.2006.22
Shouke Qin, Weining Qian, Aoying Zhou
Abstract: Aggregate monitoring over data streams is attracting more and more attention in the research community due to its broad potential applications. Existing methods suffer from two problems: 1) the aggregate functions that can be monitored are restricted to first-order statistics or to functions that are monotonic with respect to the window size; 2) only a limited number of granularities and time scales can be monitored over a stream, so some interesting patterns might be missed, and users might be misled by the incomplete picture of how the current data streams are changing. These two problems impede the development of online mining techniques over data streams, and a breakthrough is needed. In this paper, we employ fractal analysis to enable the monitoring of both monotonic and non-monotonic aggregates on time-changing data streams. The monotonicity property of aggregate monitoring is revealed, and a monotonic search space is built to decrease the time overhead for accessing the synopsis from O(m) to O(log m), where m is the number of windows to be monitored. With the help of a novel inverted histogram, the statistical summary is compressed to fit in limited main memory, so that high aggregates on windows of any length can be detected accurately and efficiently online. Theoretical analysis shows that the space and time complexity bounds of this method are relatively low, while experimental results demonstrate the applicability and efficiency of the proposed algorithm in different application settings.
Citations: 22
Answering Imprecise Queries over Autonomous Web Databases
22nd International Conference on Data Engineering (ICDE'06) Pub Date : 2006-04-03 DOI: 10.1109/ICDE.2006.20
Ullas Nambiar, S. Kambhampati
Abstract: Current approaches for answering queries with imprecise constraints require user-specific distance metrics and importance measures for attributes of interest - metrics that are hard to elicit from lay users. We present AIMQ, a domain- and user-independent approach for answering imprecise queries over autonomous Web databases. We developed methods for query relaxation that use approximate functional dependencies. We also present an approach to automatically estimate the similarity between values of categorical attributes. Experimental results demonstrating the robustness, efficiency and effectiveness of AIMQ are presented. Results of a preliminary user study demonstrating the high precision of the AIMQ system are also provided.
Citations: 66
Robust Cardinality and Cost Estimation for Skyline Operator
22nd International Conference on Data Engineering (ICDE'06) Pub Date : 2006-04-03 DOI: 10.1109/ICDE.2006.131
S. Chaudhuri, Nilesh N. Dalvi, R. Kaushik
Abstract: Incorporating the skyline operator inside the relational engine requires solving the cardinality estimation and the cost estimation problem, hitherto unaddressed. We propose robust techniques to estimate the cardinality and the computational cost of Skyline, and through an empirical comparison, show that our technique is substantially more effective than traditional approaches. Finally, we show through an implementation in Microsoft SQL Server that skyline queries can substantially benefit from our techniques.
Citations: 157
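As a reminder of what the skyline operator mentioned in the abstract above returns, the following Python sketch computes a skyline with a naive quadratic scan. The function names and the "smaller is better in every dimension" convention are assumptions made for illustration. This shows only the operator itself, not the paper's cardinality or cost estimation techniques.

```python
def dominates(p, q):
    """True if p is no worse than q in every dimension and strictly better in one.

    Assumes smaller values are better in all dimensions.
    """
    no_worse = all(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better


def skyline(points):
    """Return the points that no other point dominates (naive O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical data: (price, distance to beach) for five hotels.
hotels = [(50, 8), (80, 2), (60, 9), (100, 1), (90, 3)]
print(skyline(hotels))  # prints [(50, 8), (80, 2), (100, 1)]
```

The size of the returned list is the cardinality that a query optimizer would need to estimate before running the operator.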
LB-Index: A Multi-Resolution Index Structure for Images
22nd International Conference on Data Engineering (ICDE'06) Pub Date : 2006-04-03 DOI: 10.1109/ICDE.2006.85
Vebjorn Ljosa, Arnab Bhattacharya, Ambuj K. Singh
Abstract: In many domains, the similarity between two images depends on the spatial locations of their features. The earth mover’s distance (EMD), first proposed by Werman et al. [8], measures such similarity. It yields higher-quality image retrieval results than the Lp-norm, quadratic-form distance, and Jeffrey divergence [6], and has also been used for similarity search on contours [3], melodies [7], and graphs [2].
Citations: 8
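To illustrate why the earth mover's distance in the abstract above is sensitive to where features are located, here is a small Python sketch for the simplest case: two one-dimensional histograms with equal total mass and unit spacing between adjacent bins. In that special case the EMD equals the sum of absolute differences of the cumulative sums. The function name `emd_1d` is an assumption, and images generally need the two-dimensional formulation, which this sketch does not cover.

```python
from itertools import accumulate


def emd_1d(h1, h2):
    """Earth mover's distance between two 1-D histograms.

    Assumes equal length, equal total mass, and a ground distance of 1
    between adjacent bins.
    """
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    if abs(sum(h1) - sum(h2)) > 1e-9:
        raise ValueError("histograms must have the same total mass")
    # The running surplus after each bin is the mass that must cross to the next bin.
    surplus = accumulate(a - b for a, b in zip(h1, h2))
    return sum(abs(s) for s in surplus)
```

For example, `emd_1d([1, 0, 0], [0, 1, 0])` is 1 while `emd_1d([1, 0, 0], [0, 0, 1])` is 2, whereas a bin-by-bin L1 comparison gives 2 in both cases and so cannot tell a near shift from a far one.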
Cluster Hull: A Technique for Summarizing Spatial Data Streams
22nd International Conference on Data Engineering (ICDE'06) Pub Date : 2006-04-03 DOI: 10.1109/ICDE.2006.38
J. Hershberger, Nisheeth Shrivastava, S. Suri
Abstract: Recently there has been a growing interest in detecting patterns and analyzing trends in data that are generated continuously, often delivered in some fixed order and at a rapid rate, in the form of a data stream [5, 6]. When the stream consists of spatial data, its geometric "shape" can convey important qualitative aspects of the data set more effectively than many numerical statistics. In a stream setting, where the data must be constantly discarded and compressed, special care must be taken to ensure that the compressed summary faithfully captures the overall shape of the point distribution. We propose a novel scheme, ClusterHulls, to represent the shape of a stream of two-dimensional points. Our scheme is particularly useful when the input contains clusters with widely varying shapes and sizes, and the boundary shape, orientation, or volume of those clusters may be important in the analysis.
Citations: 6
Supporting Keyword Columns with Ontology-based Referential Constraints in DBMS
22nd International Conference on Data Engineering (ICDE'06) Pub Date : 2006-04-03 DOI: 10.1109/ICDE.2006.151
E. Chong, Souripriya Das, G. Eadon, Jagannathan Srinivasan
Abstract: Keywords are typically used to qualify rows in a table. However, the fact that a keyword denotes a concept, which belongs to a specific knowledge domain, is not semantically enforced in current database systems. This paper proposes defining an ontology-based referential constraint for such keyword columns. A query on the ontology, specified as part of the referential constraint, is used to identify the domain for the keyword column. Furthermore, since the ontology may evolve, causing changes to the domain of the keyword column, the paper proposes the use of ontology-based transformation functions to either automatically evolve the values in the keyword column or recommend refinements for them. Also, queries on a keyword column can perform semantic match, that is, match a keyword to related terms based on the associated ontology. Thus, the proposed approach of semantically connecting keyword columns to ontologies 1) enhances semantic data integrity, 2) facilitates evolution of keyword columns with the referenced ontology, and 3) enables semantic match queries on keyword columns.
Citations: 5