On the core affinity and file upload performance of Hadoop

Joong-Yeon Cho, Hyun-Wook Jin, Min Lee, K. Schwan
{"title":"Hadoop的核心亲和力和文件上传性能","authors":"Joong-Yeon Cho, Hyun-Wook Jin, Min Lee, K. Schwan","doi":"10.1145/2534645.2534651","DOIUrl":null,"url":null,"abstract":"The MapReduce programming model is introduced for big-data processing, where the data nodes perform both data storing and computation. Thus, we need to understand different resource requirements of data storing and computation tasks and schedule these efficiently over multi-core processors. The core affinity defines mapping between a set of cores and a given task. The core affinity can be decided based on resource requirements of a task because this largely affects the efficiency of computation, memory, and I/O resource utilization. In this paper, we analyze the impact of core affinity on the file upload performance of Hadoop Distributed File System (HDFS). Our study can provide the insight into the process scheduling issues on big-data processing systems. We also suggest a framework for dynamic core affinity based on our observations and show that a preliminary implementation can improve the throughput more than 40% compared with default Linux system.","PeriodicalId":166804,"journal":{"name":"International Symposium on Design and Implementation of Symbolic Computation Systems","volume":"23 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"On the core affinity and file upload performance of Hadoop\",\"authors\":\"Joong-Yeon Cho, Hyun-Wook Jin, Min Lee, K. Schwan\",\"doi\":\"10.1145/2534645.2534651\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The MapReduce programming model is introduced for big-data processing, where the data nodes perform both data storing and computation. Thus, we need to understand different resource requirements of data storing and computation tasks and schedule these efficiently over multi-core processors. The core affinity defines mapping between a set of cores and a given task. The core affinity can be decided based on resource requirements of a task because this largely affects the efficiency of computation, memory, and I/O resource utilization. In this paper, we analyze the impact of core affinity on the file upload performance of Hadoop Distributed File System (HDFS). Our study can provide the insight into the process scheduling issues on big-data processing systems. 
We also suggest a framework for dynamic core affinity based on our observations and show that a preliminary implementation can improve the throughput more than 40% compared with default Linux system.\",\"PeriodicalId\":166804,\"journal\":{\"name\":\"International Symposium on Design and Implementation of Symbolic Computation Systems\",\"volume\":\"23 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-11-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Symposium on Design and Implementation of Symbolic Computation Systems\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2534645.2534651\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Symposium on Design and Implementation of Symbolic Computation Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2534645.2534651","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 3

Abstract

The MapReduce programming model was introduced for big-data processing, where data nodes perform both data storing and computation. We therefore need to understand the different resource requirements of data-storing and computation tasks and schedule them efficiently over multi-core processors. Core affinity defines the mapping between a set of cores and a given task. It can be decided based on a task's resource requirements, because this largely affects the efficiency of computation, memory, and I/O resource utilization. In this paper, we analyze the impact of core affinity on the file upload performance of the Hadoop Distributed File System (HDFS). Our study provides insight into process scheduling issues in big-data processing systems. Based on our observations, we also suggest a framework for dynamic core affinity and show that a preliminary implementation can improve throughput by more than 40% compared with a default Linux system.
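Core affinity as used above, the binding of a task to a particular set of cores, is exposed on Linux through the sched_setaffinity(2) system call. The sketch below is not from the paper; the core numbers are arbitrary illustration. It pins the calling process to two cores and reads the resulting mask back:

```c
/* Minimal sketch: pin the calling process to cores 0 and 1 on Linux.
 * Core numbers are illustrative, not taken from the paper. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);   /* allow core 0 */
    CPU_SET(1, &set);   /* allow core 1 */

    /* pid 0 means "the calling process" */
    if (sched_setaffinity(0, sizeof(set), &set) == -1) {
        perror("sched_setaffinity");
        return EXIT_FAILURE;
    }

    /* verify by reading the affinity mask back */
    cpu_set_t got;
    CPU_ZERO(&got);
    if (sched_getaffinity(0, sizeof(got), &got) == -1) {
        perror("sched_getaffinity");
        return EXIT_FAILURE;
    }
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &got))
            printf("pinned to core %d\n", cpu);
    return EXIT_SUCCESS;
}
```

A dynamic core-affinity framework of the kind the abstract proposes would issue such calls at runtime, for example steering HDFS I/O threads and computation threads onto different cores according to their observed resource usage; that pairing is our reading of the abstract, not a detail it states.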