22nd International Conference on Data Engineering (ICDE'06): Latest Publications

How to Determine a Good Multi-Programming Level for External Scheduling

22nd International Conference on Data Engineering (ICDE'06) Pub Date : 2006-04-03 DOI: 10.1109/ICDE.2006.78

Bianca Schroeder, Mor Harchol-Balter, A. Iyengar, E. Nahum, A. Wierman

Abstract: Scheduling/prioritization of DBMS transactions is important for many applications that rely on database backends. A convenient way to achieve scheduling is to limit the number of transactions within the database, maintaining most of the transactions in an external queue, which can be ordered as desired by the application. While external scheduling has many advantages in that it doesn’t require changes to internal resources, it is also difficult to get right in that its performance depends critically on the particular multiprogramming limit used (the MPL), i.e., the number of transactions allowed into the database. If the MPL is too low, throughput will suffer, since not all DBMS resources will be utilized. On the other hand, if the MPL is too high, there is insufficient control on scheduling. The question of how to adjust the MPL to achieve both goals simultaneously is an open problem, not just for databases but in system design in general. Herein we study this problem in the context of transactional workloads, both via extensive experimentation and queueing-theoretic analysis. We find that the two most critical factors in adjusting the MPL are the number of resources that the workload utilizes and the variability of the transactions’ service demands. We develop a feedback-based controller, augmented by queueing-theoretic models, for automatically adjusting the MPL. Finally, we apply our methods to the specific problem of external prioritization of transactions. We find that external prioritization can be nearly as effective as internal prioritization, without any negative consequences, when the MPL is set appropriately.
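The trade-off described in the abstract above (too low an MPL costs throughput, too high an MPL weakens scheduling control) can be illustrated with a very small feedback loop. This is only a sketch under stated assumptions, not the controller from the paper: the function names `adjust_mpl` and `toy_throughput`, the 5% tolerated throughput loss, and the saturating toy workload are all hypothetical.

```python
def adjust_mpl(measure_throughput, baseline, start=1, max_mpl=128, max_loss=0.05):
    """Raise the MPL step by step until the measured throughput is within
    `max_loss` of `baseline` (the throughput observed with no MPL limit).

    Assumption: `measure_throughput(mpl)` runs the workload at the given MPL
    and returns transactions per second. It is a placeholder, not a real API.
    """
    mpl = start
    while mpl < max_mpl and measure_throughput(mpl) < (1.0 - max_loss) * baseline:
        mpl += 1  # feedback step: throughput still too low, admit one more transaction
    return mpl


def toy_throughput(mpl):
    # Hypothetical workload: throughput grows linearly until 8 concurrent
    # transactions saturate the (imaginary) resources, then stays flat.
    return 10.0 * min(mpl, 8)


# The unlimited-MPL baseline is approximated here by a very large MPL.
chosen_mpl = adjust_mpl(toy_throughput, baseline=toy_throughput(128))
print(chosen_mpl)
```

The loop stops at the lowest MPL that meets the throughput target, which keeps as many transactions as possible in the external queue where the application can reorder them.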

Citations: 127

Query Selection Techniques for Efficient Crawling of Structured Web Sources

22nd International Conference on Data Engineering (ICDE'06) Pub Date : 2006-04-03 DOI: 10.1109/ICDE.2006.124

Ping Wu, Ji-Rong Wen, Huan Liu, Wei-Ying Ma

Abstract: The high-quality, structured data from structured Web sources is invaluable for many applications. Hidden Web databases are not directly crawlable by Web search engines and are only accessible through Web query forms or via Web service interfaces. Recent research efforts have focused on understanding these Web query forms. A critical but still largely unresolved question is how to efficiently acquire the structured information inside Web databases by iteratively issuing meaningful queries. In this paper we focus on the central issue of enabling efficient Web database crawling through query selection, i.e., how to select good queries to rapidly harvest data records from Web databases. We model each structured Web database as a distinct attribute-value graph. Under this theoretical framework, the database crawling problem is transformed into a graph traversal one that follows "relational" links. We show that finding an optimal query selection plan is equivalent to finding a Minimum Weighted Dominating Set of the corresponding database graph, a well-known NP-Complete problem. We propose a suite of query selection techniques aiming at optimizing the query harvest rate. Extensive experimental evaluations over real Web sources and simulations over controlled database servers validate the effectiveness of our techniques and provide insights for future efforts in this
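Since the abstract above reduces optimal query selection to a Minimum Weighted Dominating Set, a standard greedy heuristic for that problem gives a feel for the reduction. This is a generic textbook-style approximation, not one of the paper's query selection techniques. The function name, the tiny attribute-value graph, and the uniform per-query costs are hypothetical and chosen only for illustration.

```python
def greedy_weighted_dominating_set(adjacency, weight):
    """Greedy heuristic: repeatedly pick the node with the lowest cost per
    newly covered node until every node is either chosen or adjacent to a
    chosen node.

    adjacency: dict mapping node -> set of neighbouring nodes (undirected)
    weight:    dict mapping node -> positive cost of issuing that query
    """
    uncovered = set(adjacency)
    chosen = []
    while uncovered:
        # How many still-uncovered nodes would each candidate cover?
        gains = {v: len(({v} | adjacency[v]) & uncovered) for v in adjacency}
        best = min(
            (v for v in adjacency if gains[v] > 0),
            key=lambda v: (weight[v] / gains[v], str(v)),  # name breaks ties
        )
        chosen.append(best)
        uncovered -= {best} | adjacency[best]
    return chosen


# Hypothetical attribute-value graph: two values are linked when they
# co-occur in some record of the (imaginary) Web database.
graph = {
    "Canon": {"camera", "printer"},
    "Nikon": {"camera"},
    "camera": {"Canon", "Nikon"},
    "printer": {"Canon"},
}
cost = {value: 1.0 for value in graph}  # assumed uniform query cost
queries = greedy_weighted_dominating_set(graph, cost)
print(queries)
```

Each selected value stands for one query to issue. A value counts as covered once it appears in the results of some issued query, which is what adjacency to a chosen node models here.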

Citations: 152

MONDRIAN: Annotating and Querying Databases through Colors and Blocks

22nd International Conference on Data Engineering (ICDE'06) Pub Date : 2006-04-03 DOI: 10.1109/ICDE.2006.102

Floris Geerts, Anastasios Kementsietsidis, D. Milano

Abstract: Annotations play a central role in the curation of scientific databases. Despite their importance, data formats and schemas are not designed to manage the increasing variety of annotations. Moreover, DBMSs often lack support for storing and querying annotations. Furthermore, annotations and data are only loosely coupled. This paper introduces an annotation-oriented data model for the manipulation and querying of both data and annotations. In particular, the model allows for the specification of annotations on sets of values and for effectively querying the information on their association. We use the concept of a block to represent an annotated set of values. Different colors applied to the blocks represent different annotations. We introduce a color query language for our model and prove it to be both complete (it can express all possible queries over the class of annotated databases) and minimal (all the algebra operators are primitive). We present MONDRIAN, a prototype implementation of our annotation mechanism, and we conduct experiments that investigate the set of parameters which influence the evaluation cost for color queries.

Citations: 109

cgmOLAP: Efficient Parallel Generation and Querying of Terabyte Size ROLAP Data Cubes

22nd International Conference on Data Engineering (ICDE'06) Pub Date : 2006-04-03 DOI: 10.1109/ICDE.2006.32

Ying Chen, A. Rau-Chaplin, F. Dehne, Todd Eavis, D. Green, E. Sithirasenan

Citations: 20

Approximately Processing Multi-granularity Aggregate Queries over Data Streams

22nd International Conference on Data Engineering (ICDE'06) Pub Date : 2006-04-03 DOI: 10.1109/ICDE.2006.22

Shouke Qin, Weining Qian, Aoying Zhou

Abstract: Aggregate monitoring over data streams is attracting more and more attention in the research community due to its broad potential applications. Existing methods suffer from two problems: 1) the aggregate functions that can be monitored are restricted to first-order statistics or to functions that are monotonic with respect to the window size; 2) only a limited number of granularities and time scales can be monitored over a stream, so some interesting patterns might be neglected, and users might be misled by the incomplete changing profile of current data streams. These two problems impede the development of online mining techniques over data streams, and some kind of breakthrough is needed. In this paper, we employ the powerful tool of fractal analysis to enable the monitoring of both monotonic and non-monotonic aggregates on time-changing data streams. The monotony property of aggregate monitoring is revealed, and a monotonic search space is built to decrease the time overhead for accessing the synopsis from O(m) to O(log m), where m is the number of windows to be monitored. With the help of a novel inverted histogram, the statistical summary is compressed to fit in limited main memory, so that high aggregates on windows of any length can be detected accurately and efficiently online. Theoretical analysis shows that the space and time complexity bounds of this method are relatively low, while experimental results demonstrate the applicability and efficiency of the proposed algorithm in different application settings.

Citations: 22

Answering Imprecise Queries over Autonomous Web Databases

22nd International Conference on Data Engineering (ICDE'06) Pub Date : 2006-04-03 DOI: 10.1109/ICDE.2006.20

Ullas Nambiar, S. Kambhampati

Citations: 66

Robust Cardinality and Cost Estimation for Skyline Operator

22nd International Conference on Data Engineering (ICDE'06) Pub Date : 2006-04-03 DOI: 10.1109/ICDE.2006.131

S. Chaudhuri, Nilesh N. Dalvi, R. Kaushik

Citations: 157

LB-Index: A Multi-Resolution Index Structure for Images

22nd International Conference on Data Engineering (ICDE'06) Pub Date : 2006-04-03 DOI: 10.1109/ICDE.2006.85

Vebjorn Ljosa, Arnab Bhattacharya, Ambuj K. Singh

Citations: 8

Cluster Hull: A Technique for Summarizing Spatial Data Streams

22nd International Conference on Data Engineering (ICDE'06) Pub Date : 2006-04-03 DOI: 10.1109/ICDE.2006.38

J. Hershberger, Nisheeth Shrivastava, S. Suri

Citations: 6

Supporting Keyword Columns with Ontology-based Referential Constraints in DBMS

22nd International Conference on Data Engineering (ICDE'06) Pub Date : 2006-04-03 DOI: 10.1109/ICDE.2006.151

E. Chong, Souripriya Das, G. Eadon, Jagannathan Srinivasan

Citations: 5