TaskTracker Aware Scheduling for Hadoop MapReduce
Jisha S. Manjaly, Varghese S. Chooralil
2013 Third International Conference on Advances in Computing and Communications, published 2013-08-29
DOI: 10.1109/ICACC.2013.103 (https://doi.org/10.1109/ICACC.2013.103)

Hadoop is a framework for processing large amounts of data in parallel, built on the Hadoop Distributed File System (HDFS) and the MapReduce framework. Job scheduling is an important process in Hadoop MapReduce. Hadoop ships with three schedulers: FIFO, Fair, and Capacity. The scheduler is now a pluggable component of the Hadoop MapReduce framework. When jobs depend on an external service such as a database or a web service, tasks may fail because that service becomes overloaded. In this scenario, Hadoop has to re-run the failed tasks in other slots. To address this issue, TaskTracker-aware scheduling is introduced. This scheduler lets users configure a maximum load per TaskTracker in the job configuration itself. Once the load on a TaskTracker reaches the threshold set for a job, the algorithm does not assign further tasks of that job to it, rather than letting a task run and fail. The scheduler also lets users select, in the job configuration, which TaskTrackers each job may use.
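
The two per-job controls described above (a load threshold per TaskTracker and a list of permitted TaskTrackers) can be illustrated with a minimal sketch. This is not the authors' implementation and does not use Hadoop's API. It assumes that "load" means the number of tasks of a given job currently running on a TaskTracker. The names `JobConfig`, `TaskTracker`, `max_tasks_per_tracker`, `allowed_trackers`, `can_assign`, and `assign` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class JobConfig:
    """Hypothetical per-job settings; these are not real Hadoop configuration keys."""
    job_id: str
    max_tasks_per_tracker: int                   # assumed load threshold per TaskTracker
    allowed_trackers: Optional[Set[str]] = None  # None means any TaskTracker may be used


@dataclass
class TaskTracker:
    """Toy stand-in for a TaskTracker that counts running tasks per job."""
    name: str
    running: Dict[str, int] = field(default_factory=dict)  # job_id -> running task count

    def load_for(self, job_id: str) -> int:
        return self.running.get(job_id, 0)


def can_assign(tracker: TaskTracker, job: JobConfig) -> bool:
    """Return True if the tracker may take one more task of this job."""
    # Respect the per-job selection of TaskTrackers, if one was given.
    if job.allowed_trackers is not None and tracker.name not in job.allowed_trackers:
        return False
    # Refuse the task once the per-job load threshold has been reached.
    return tracker.load_for(job.job_id) < job.max_tasks_per_tracker


def assign(tracker: TaskTracker, job: JobConfig) -> bool:
    """Assign one task if permitted; otherwise leave it pending instead of letting it fail."""
    if not can_assign(tracker, job):
        return False
    tracker.running[job.job_id] = tracker.load_for(job.job_id) + 1
    return True
```

With a threshold of 2 and `allowed_trackers={"tt1"}`, two calls to `assign` on `tt1` succeed and a third returns `False`, while any call on a tracker named `tt2` is refused outright. In this sketch a refused task simply stays pending until a later scheduling round; how the real scheduler handles pending tasks is not stated in the abstract.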