Proceedings of the 2009 ACM SIGMOD International Conference on Management of data: latest papers

AIDE: ad-hoc intents detection engine over query logs

Proceedings of the 2009 ACM SIGMOD International Conference on Management of data Pub Date : 2009-06-29 DOI: 10.1145/1559845.1559990

Yunliang Jiang, Hui-Ting Yang, K. Chang, Yi-Shin Chen

Citations: 0

Optimizing I/O-intensive transactions in highly interactive applications

Proceedings of the 2009 ACM SIGMOD International Conference on Management of data Pub Date : 2009-06-29 DOI: 10.1145/1559845.1559927

M. Sharaf, Panos K. Chrysanthis, Alexandros Labrinidis, C. Amza

Abstract: The performance provided by an interactive online database system is typically measured in terms of meeting certain pre-specified Service Level Agreements (SLAs), with expected transaction latency being the most commonly used type of SLA. This form of SLA acts as a soft deadline for each transaction, and user satisfaction can be measured in terms of minimizing tardiness, that is, the deviation from SLA. This objective is further complicated for I/O-intensive transactions, where the storage system becomes the performance bottleneck. Moreover, common I/O scheduling policies employed by the Operating System with a goal of improving I/O throughput or average latency may run counter to optimizing per-transaction performance since the Operating System is typically oblivious to the application high-level SLA specifications. In this paper, we propose a new SLA-aware policy for scheduling I/O requests of database transactions. Our proposed policy synergistically combines novel deadline-aware scheduling policies for database transactions with features of Operating System scheduling policies designed for improving I/O throughput. This enables our proposed policy to dynamically adapt to workload and consistently provide the best performance.
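The notion of tardiness in the abstract above can be made concrete with a small sketch. It is not the policy proposed in the paper: the `Txn` fields, the single-device service model, and the earliest-deadline-first (EDF) ordering are all assumptions chosen for illustration. The sketch only shows how a deadline-oblivious order can accumulate tardiness that a deadline-aware order avoids.

```python
from dataclasses import dataclass

@dataclass
class Txn:
    name: str           # hypothetical transaction label
    arrival: float      # time the transaction arrives
    sla_latency: float  # SLA: expected latency, used as a soft deadline
    io_time: float      # assumed I/O service time on a single device

def tardiness(finish: float, deadline: float) -> float:
    """Deviation from the SLA: zero if the deadline is met."""
    return max(0.0, finish - deadline)

def total_tardiness(order):
    """Serve transactions one at a time in the given order and sum tardiness."""
    clock = 0.0
    total = 0.0
    for t in order:
        clock = max(clock, t.arrival) + t.io_time
        total += tardiness(clock, t.arrival + t.sla_latency)
    return total

txns = [Txn("A", 0, 10, 6), Txn("B", 0, 4, 3), Txn("C", 0, 20, 2)]

# Arrival order, oblivious to the SLAs.
fifo = total_tardiness(txns)
# Earliest deadline first: a classic deadline-aware ordering (an assumption here).
edf = total_tardiness(sorted(txns, key=lambda t: t.arrival + t.sla_latency))

print(fifo, edf)  # 5.0 0.0
```

In arrival order, transaction B finishes at time 9 against a deadline of 4, so it is 5 time units late; serving B first meets all three deadlines.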

Citations: 15

The design of the Force.com multitenant internet application development platform

Proceedings of the 2009 ACM SIGMOD International Conference on Management of data Pub Date : 2009-06-29 DOI: 10.1145/1559845.1559942

C. Weissman, Steve Bobrowski

Citations: 248

Fast and dynamic OLAP exploration using UDFs

Proceedings of the 2009 ACM SIGMOD International Conference on Management of data Pub Date : 2009-06-29 DOI: 10.1145/1559845.1559989

Zhibo Chen, C. Ordonez, Carlos Garcia-Alvarado

Citations: 10

Entity resolution with iterative blocking

Proceedings of the 2009 ACM SIGMOD International Conference on Management of data Pub Date : 2009-06-29 DOI: 10.1145/1559845.1559870

Steven Euijong Whang, David Menestrina, G. Koutrika, M. Theobald, H. Garcia-Molina

Abstract: Entity Resolution (ER) is the problem of identifying which records in a database refer to the same real-world entity. An exhaustive ER process involves computing the similarities between pairs of records, which can be very expensive for large datasets. Various blocking techniques can be used to enhance the performance of ER by dividing the records into blocks in multiple ways and only comparing records within the same block. However, most blocking techniques process blocks separately and do not exploit the results of other blocks. In this paper, we propose an iterative blocking framework where the ER results of blocks are reflected to subsequently processed blocks. Blocks are now iteratively processed until no block contains any more matching records. Compared to simple blocking, iterative blocking may achieve higher accuracy because reflecting the ER results of blocks to other blocks may generate additional record matches. Iterative blocking may also be more efficient because processing a block now saves the processing time for other blocks. We implement a scalable iterative blocking system and demonstrate that iterative blocking can be more accurate and efficient than blocking for large datasets.
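The idea described in the abstract above, merging records inside one block and letting the merged record trigger new matches in other blocks, can be sketched naively as follows. This is not the authors' system: the match rule (shared phone or email), the blocking criteria (ZIP code and name initial), and the sample records are hypothetical, and the loop simply rebuilds all blocks after each pass instead of using any of the paper's optimizations.

```python
from itertools import combinations

def merge(a, b):
    """Union the value sets of two records, field by field."""
    return {f: a.get(f, set()) | b.get(f, set()) for f in a.keys() | b.keys()}

def matches(a, b):
    """Hypothetical match rule: the records share a phone number or an email."""
    return any(a.get(f, set()) & b.get(f, set()) for f in ("phone", "email"))

def block_keys(rec):
    """Hypothetical blocking criteria: ZIP code and first letter of the name."""
    keys = {("zip", z) for z in rec.get("zip", set())}
    keys |= {("initial", n[0]) for n in rec.get("name", set())}
    return keys

def iterative_blocking(records):
    # Each cluster id maps to (original record ids, merged record).
    clusters = {i: ({i}, rec) for i, rec in enumerate(records)}
    changed = True
    while changed:  # repeat until no block yields a new match
        changed = False
        blocks = {}
        for cid, (_, rec) in clusters.items():
            for key in block_keys(rec):
                blocks.setdefault(key, []).append(cid)
        for members in blocks.values():
            for x, y in combinations(members, 2):
                if x in clusters and y in clusters and matches(clusters[x][1], clusters[y][1]):
                    # The merged record replaces x and is seen by later blocks.
                    ids = clusters[x][0] | clusters[y][0]
                    clusters[x] = (ids, merge(clusters[x][1], clusters[y][1]))
                    del clusters[y]
                    changed = True
    return sorted(sorted(ids) for ids, _ in clusters.values())

records = [
    {"name": {"John Doe"}, "phone": {"555-1234"}, "zip": {"94305"}},
    {"name": {"Doe, J."}, "phone": {"555-1234"}, "email": {"jdoe@example.org"}, "zip": {"94305"}},
    {"name": {"John D."}, "email": {"jdoe@example.org"}, "zip": {"10001"}},
]
print(iterative_blocking(records))  # [[0, 1, 2]]
```

Records 0 and 2 share the name-initial block but do not match directly. Only after records 0 and 1 merge in the ZIP block does the merged record carry the email that matches record 2, which is the kind of additional match the abstract attributes to iterative blocking.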

Citations: 246

A comparison of approaches to large-scale data analysis

Proceedings of the 2009 ACM SIGMOD International Conference on Management of data Pub Date : 2009-06-29 DOI: 10.1145/1559845.1559865

Andrew Pavlo, Erik Paulson, A. Rasin, D. Abadi, D. DeWitt, S. Madden, M. Stonebraker

Abstract: There is currently considerable enthusiasm around the MapReduce (MR) paradigm for large-scale data analysis [17]. Although the basic control flow of this framework has existed in parallel SQL database management systems (DBMS) for over 20 years, some have called MR a dramatically new computing model [8, 17]. In this paper, we describe and compare both paradigms. Furthermore, we evaluate both kinds of systems in terms of performance and development complexity. To this end, we define a benchmark consisting of a collection of tasks that we have run on an open source version of MR as well as on two parallel DBMSs. For each task, we measure each system's performance for various degrees of parallelism on a cluster of 100 nodes. Our results reveal some interesting trade-offs. Although the process to load data into and tune the execution of parallel DBMSs took much longer than the MR system, the observed performance of these DBMSs was strikingly better. We speculate about the causes of the dramatic performance difference and consider implementation concepts that future systems should take from both kinds of architectures.

Citations: 1244

Session details: Industrial session 6: industrial directions

Proceedings of the 2009 ACM SIGMOD International Conference on Management of data Pub Date : 2009-06-29 DOI: 10.1145/3257475

Mehul A. Shah

Citations: 0

Detecting and resolving unsound workflow views for correct provenance analysis

Proceedings of the 2009 ACM SIGMOD International Conference on Management of data Pub Date : 2009-06-29 DOI: 10.1145/1559845.1559903

Peng Sun, Ziyang Liu, S. Davidson, Yi Chen

Citations: 22

Combining keyword search and forms for ad hoc querying of databases

Proceedings of the 2009 ACM SIGMOD International Conference on Management of data Pub Date : 2009-06-29 DOI: 10.1145/1559845.1559883

E. Chu, A. Baid, Xiaoyong Chai, A. Doan, J. Naughton

Citations: 159

Session details: Special invited session on human-computer interaction with information

Proceedings of the 2009 ACM SIGMOD International Conference on Management of data Pub Date : 2009-06-29 DOI: 10.1145/3257476

Jeffrey S. Pierce

Citations: 0