Achieving Performance and Programmability for MapReduce(-Like) Frameworks
Jiayang Guo, G. Agrawal
2018 IEEE 25th International Conference on High Performance Computing (HiPC), 2018-12-01
DOI: 10.1109/HiPC.2018.00043 (https://doi.org/10.1109/HiPC.2018.00043)
Programmability and performance are often considered alternatives in the context of HPC programming systems. For example, general-purpose frameworks like MPI are associated with high performance, and though MapReduce and similar frameworks have demonstrated high programmability, it is also well accepted that they fall short in terms of performance. Providing abstractions that maintain both high programmability and high performance remains an open question. In this paper, we introduce two different variations of the original MapReduce API. We demonstrate efficient implementations of the three APIs, focusing on how the API differences impact middleware implementation, and examine the resulting performance. Furthermore, to understand how application characteristics impact the relative performance of the three systems, we develop and validate a performance model. Overall, we show that a MapReduce-like API that requires only a small additional effort from programmers can provide high performance, outperforming Spark significantly.
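For readers unfamiliar with the baseline, the original MapReduce programming pattern that the abstract refers to can be sketched as below. This is a single-process illustration of the generic map, group-by-key, and reduce steps only. It is not the authors' middleware, and it does not represent either of the two API variations, which the abstract does not describe. The names `map_reduce`, `word_count_map`, and `word_count_reduce` are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical helper: a sequential stand-in for the generic MapReduce pattern.
# It is an assumption for illustration, not the API from the paper.
def map_reduce(
    records: Iterable[str],
    map_fn: Callable[[str], List[Tuple[str, int]]],
    reduce_fn: Callable[[str, List[int]], int],
) -> Dict[str, int]:
    # Map phase: each record emits zero or more (key, value) pairs.
    # Group phase: values are collected per key, which is the role
    # a distributed framework's shuffle would play.
    groups: Dict[str, List[int]] = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    # Reduce phase: all values for a key are combined into one result.
    return {key: reduce_fn(key, values) for key, values in groups.items()}

def word_count_map(line: str) -> List[Tuple[str, int]]:
    # Emit (word, 1) for every whitespace-separated word.
    return [(word, 1) for word in line.split()]

def word_count_reduce(key: str, values: List[int]) -> int:
    # Sum the per-occurrence counts for one word.
    return sum(values)

counts = map_reduce(["a b a", "b c"], word_count_map, word_count_reduce)
```

In this sketch the programmer writes only the two small functions, which is the programmability argument. Every emitted pair is stored before it is reduced, which is one commonly cited source of overhead in this style of API.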