Evaluation and Analysis of Capacity Scheduler and Fair Scheduler in Hadoop Framework on Big Data Technology

Muhammad Salman, Diyanatul Husna, Adhitya Wicaksono, A. A. P. Ratna

International Conference on Artificial Intelligence and Virtual Reality, 2018-11-23
DOI: 10.1145/3293663.3293680
Abstract
Apache Hadoop is an open-source framework that implements MapReduce; it is scalable, reliable, and fault tolerant. Scheduling is an important process in Hadoop MapReduce because the scheduler is responsible for allocating resources to running applications based on resource capacity, queues, running tasks, and the number of users. Scaling a single-node Hadoop cluster to multiple nodes can improve HDFS performance, but it is quite costly. The scheduler assigns work according to resource requirements such as memory, CPU, disk, and network. The most common goal of a scheduling algorithm is to minimize task completion time. Hadoop scheduling is an independent, pluggable module, so users can design or configure their own scheduler to match an application's actual needs and meet specific business requirements. This research analyzes the characteristics of the Capacity Scheduler and the Fair Scheduler.
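As a minimal sketch (not taken from the paper), the snippet below illustrates the pluggable-scheduler idea the abstract describes: the active scheduler (Capacity or Fair) is selected cluster-wide in yarn-site.xml via yarn.resourcemanager.scheduler.class, and an individual MapReduce job can then be directed to a named queue with the standard mapreduce.job.queuename property. The queue name "research" is a hypothetical example and would have to be defined in capacity-scheduler.xml or fair-scheduler.xml on the cluster.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class QueueAwareJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // The scheduler itself is chosen cluster-wide in yarn-site.xml via
        // yarn.resourcemanager.scheduler.class, e.g.
        //   org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
        //   org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler

        // Direct this job to a specific queue; "research" is a hypothetical queue
        // that must already exist in the cluster's scheduler configuration.
        conf.set("mapreduce.job.queuename", "research");

        Job job = Job.getInstance(conf, "queue-aware-job");
        job.setJarByClass(QueueAwareJob.class);
        // Mapper/Reducer classes are omitted (identity map/reduce by default);
        // this sketch only shows how a job is placed into a scheduler queue.
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Under the Capacity Scheduler each queue receives a guaranteed share of cluster capacity, while under the Fair Scheduler running applications converge toward equal (or weighted) shares over time; the same queue-placement property applies in both cases.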