The RoadRunner framework for efficient and scalable processing of big data

Proceedings of the 19th Panhellenic Conference on Informatics. Pub Date: 2015-10-01. DOI: 10.1145/2801948.2801963

C. Doulkeridis, Akrivi Vlachou, Panagiotis Nikitopoulos, Panagiotis Tampakis, Mei Saouk


Citations: 1

Abstract



In this paper, we present the overall architecture of RoadRunner, a Hadoop-based framework that enhances the efficiency of rank-aware query processing by introducing various optimizations to Hadoop without changing its internal operation. RoadRunner focuses on a specific class of queries that involve ranking, such as top-k queries and top-k joins, as well as on the closely related preference-aware queries, such as skyline queries. For this class of queries, we identify improvements at various stages of MapReduce processing, which result in improved performance without sacrificing scalability. We describe the RoadRunner framework, along with its individual modules and their roles, and we demonstrate the merits of the proposed framework through showcase query examples.
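The abstract names the query classes but does not detail RoadRunner's optimizations, so the following is only a generic sketch of how top-k and skyline queries are commonly phrased in a MapReduce style. It is not RoadRunner's implementation. The function names (`map_local_topk`, `reduce_global_topk`, `local_skyline`) and the in-memory "partitions" standing in for Hadoop input splits are assumptions made for illustration. The idea shown is early pruning: each map task emits only its local top-k (or local skyline), so the reduce step merges far fewer records.

```python
import heapq
from typing import Iterable, List, Sequence, Tuple

Record = Tuple[str, float]   # (record id, score); higher score ranks first
Point = Tuple[float, ...]    # skyline point; smaller is better in every dimension


def map_local_topk(partition: Iterable[Record], k: int) -> List[Record]:
    """Map side (hypothetical): keep only the k best records of one partition."""
    return heapq.nlargest(k, partition, key=lambda r: r[1])


def reduce_global_topk(local_results: Iterable[List[Record]], k: int) -> List[Record]:
    """Reduce side (hypothetical): merge the local top-k lists into the global top-k."""
    merged = [r for part in local_results for r in part]
    return heapq.nlargest(k, merged, key=lambda r: r[1])


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """p dominates q if p is no worse in all dimensions and strictly better in one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def local_skyline(points: Iterable[Point]) -> List[Point]:
    """Block-nested-loop skyline: keep the points that no other point dominates."""
    window: List[Point] = []
    for p in points:
        if any(dominates(q, p) for q in window):
            continue  # p is dominated, so it cannot be in the skyline
        window = [q for q in window if not dominates(p, q)]
        window.append(p)
    return window


if __name__ == "__main__":
    partitions = [
        [("a", 0.9), ("b", 0.1), ("c", 0.5)],
        [("d", 0.7), ("e", 0.95)],
    ]
    print(reduce_global_topk([map_local_topk(p, 2) for p in partitions], 2))

    splits = [[(1.0, 5.0), (3.0, 3.0)], [(2.0, 2.0), (5.0, 1.0)]]
    # The global skyline equals the skyline of the union of the local skylines.
    candidates = [pt for s in splits for pt in local_skyline(s)]
    print(local_skyline(candidates))
```

Both queries can be pruned locally for the same reason: a record outside its partition's top-k, or dominated within its partition, cannot appear in the global answer.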
