C. Doulkeridis, Akrivi Vlachou, Panagiotis Nikitopoulos, Panagiotis Tampakis, Mei Saouk
Proceedings of the 19th Panhellenic Conference on Informatics, published 2015-10-01. DOI: 10.1145/2801948.2801963 (https://doi.org/10.1145/2801948.2801963)
The RoadRunner framework for efficient and scalable processing of big data
In this paper, we present the overall architecture of RoadRunner, a Hadoop-based framework that improves the efficiency of rank-aware query processing by introducing optimizations to Hadoop without changing its internal operation. RoadRunner focuses on a specific class of queries that involve ranking, such as top-k queries and top-k joins, as well as on the closely related preference-aware queries, such as skyline queries. For this class of queries, we identify improvements at various stages of MapReduce processing, which result in improved performance without sacrificing scalability. We describe the RoadRunner framework, along with its individual modules and their roles, and we demonstrate the merits of the proposed framework by means of showcase query examples.
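
To make the query classes named in the abstract concrete, the following is a minimal sketch of a top-k query and a skyline query evaluated in a two-phase, MapReduce-like fashion (a local step per partition, then a global merge). It is written in plain Python rather than against Hadoop, and it is not RoadRunner's implementation: the function names, the sum-based scoring function, and the convention that smaller attribute values are better are all assumptions made for illustration.

```python
import heapq
from typing import Iterable, List, Tuple

Record = Tuple[float, ...]


def score(record: Record) -> float:
    """Hypothetical scoring function (assumption): lower sum is better."""
    return sum(record)


def local_top_k(partition: Iterable[Record], k: int) -> List[Record]:
    """Map-like step: keep only the k best-scoring records of one partition."""
    return heapq.nsmallest(k, partition, key=score)


def global_top_k(partitions: Iterable[Iterable[Record]], k: int) -> List[Record]:
    """Reduce-like step: merge the per-partition candidates into the final top-k."""
    candidates = [r for part in partitions for r in local_top_k(part, k)]
    return heapq.nsmallest(k, candidates, key=score)


def dominates(a: Record, b: Record) -> bool:
    """a dominates b if a is no worse in every dimension and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(records: Iterable[Record]) -> List[Record]:
    """Block-nested-loop skyline: the records not dominated by any other record."""
    result: List[Record] = []
    for r in records:
        if any(dominates(s, r) for s in result):
            continue  # r is dominated by a current skyline record, discard it
        # r survives: drop any current skyline records that r dominates
        result = [s for s in result if not dominates(r, s)]
        result.append(r)
    return result


def global_skyline(partitions: Iterable[Iterable[Record]]) -> List[Record]:
    """Compute a local skyline per partition, then the skyline of their union."""
    local = [skyline(part) for part in partitions]
    return skyline(r for part in local for r in part)


if __name__ == "__main__":
    # Two toy partitions of 2-dimensional records (made-up data).
    parts = [
        [(1, 9), (5, 4), (8, 8)],
        [(2, 3), (9, 2), (7, 7)],
    ]
    print(global_top_k(parts, k=2))
    print(global_skyline(parts))
```

In both cases the local step discards records that cannot appear in the final answer, so less data reaches the merge step. This pruning idea is a generic one for rank-aware processing and is shown here only to illustrate why such queries are amenable to optimization at several stages of MapReduce processing.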