{"title":"一种在虚拟化环境下发现Hadoop map reduce最优性能的最适合因子的方法","authors":"Solaimurugan Vellaipandiyan, V. Srikrishnan","doi":"10.1109/ICCIC.2014.7238471","DOIUrl":null,"url":null,"abstract":"Map Reduce pioneered by Google is mainly employed in Big Data analytics. In Map Reduce environment, most of the algorithms are re-used for mining the data. Prediction of execution time and system overhead of MapReduce job is very vital, from which performance shall be ascertained. Cloud computing is widely used as a computing platform in business and academic communities. Performance plays a major role, when user runs an application in the cloud. User may want to estimate the application execution time (latency) before submitting a Task or a Job. Hadoop clusters are deployed on Cloud environment performing the experiment. System overhead is determined by running Map Reduce job over Hadoop Clusters. While performing the experiment, metrics such as network I/O, CPU, Swap utilization, Time to complete the job and RSS, VSZ were captured and evaluated in order to diagnose, how performance of Hadoop is influenced by reconstructing the block size and split size with respect to block size.","PeriodicalId":187874,"journal":{"name":"2014 IEEE International Conference on Computational Intelligence and Computing Research","volume":"81 6","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"An approach to discover the best-fit factors for the optimal performance of Hadoop map reduce in virtualized environment\",\"authors\":\"Solaimurugan Vellaipandiyan, V. Srikrishnan\",\"doi\":\"10.1109/ICCIC.2014.7238471\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Map Reduce pioneered by Google is mainly employed in Big Data analytics. In Map Reduce environment, most of the algorithms are re-used for mining the data. 
Prediction of execution time and system overhead of MapReduce job is very vital, from which performance shall be ascertained. Cloud computing is widely used as a computing platform in business and academic communities. Performance plays a major role, when user runs an application in the cloud. User may want to estimate the application execution time (latency) before submitting a Task or a Job. Hadoop clusters are deployed on Cloud environment performing the experiment. System overhead is determined by running Map Reduce job over Hadoop Clusters. While performing the experiment, metrics such as network I/O, CPU, Swap utilization, Time to complete the job and RSS, VSZ were captured and evaluated in order to diagnose, how performance of Hadoop is influenced by reconstructing the block size and split size with respect to block size.\",\"PeriodicalId\":187874,\"journal\":{\"name\":\"2014 IEEE International Conference on Computational Intelligence and Computing Research\",\"volume\":\"81 6\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2014 IEEE International Conference on Computational Intelligence and Computing Research\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICCIC.2014.7238471\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 IEEE International Conference on Computational Intelligence and Computing 
Research","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCIC.2014.7238471","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
An approach to discover the best-fit factors for the optimal performance of Hadoop map reduce in virtualized environment
MapReduce, pioneered by Google, is mainly employed in Big Data analytics; in a MapReduce environment, many existing algorithms are reused for mining data. Predicting the execution time and system overhead of a MapReduce job is vital, since overall performance can be ascertained from these estimates. Cloud computing is widely used as a computing platform in both business and academic communities, and performance plays a major role when a user runs an application in the cloud: the user may want to estimate the application's execution time (latency) before submitting a task or job. In this work, Hadoop clusters are deployed in a cloud environment, and system overhead is determined by running MapReduce jobs over those clusters. During the experiments, metrics such as network I/O, CPU and swap utilization, job completion time, RSS, and VSZ were captured and evaluated in order to diagnose how Hadoop's performance is influenced by reconfiguring the block size, and the split size relative to the block size.
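The block-size/split-size interplay the abstract refers to can be illustrated with a small sketch. In Hadoop's `FileInputFormat`, the effective split size is `max(minSize, min(maxSize, blockSize))`, and the number of map tasks is roughly the input size divided by that split size. The helper names below are illustrative (not from the paper); the formula follows standard Hadoop 2.x behavior:

```python
# Sketch of how Hadoop derives the input split size from the HDFS block
# size and the configured min/max split sizes, and how that determines
# the approximate number of map tasks for one input file.

def split_size(block_size, min_size=1, max_size=2**63 - 1):
    """Effective split size, mirroring FileInputFormat.computeSplitSize:
    max(minSize, min(maxSize, blockSize))."""
    return max(min_size, min(max_size, block_size))

def num_splits(file_size, block_size, min_size=1, max_size=2**63 - 1):
    """Approximate number of map tasks launched for one input file."""
    size = split_size(block_size, min_size, max_size)
    return -(-file_size // size)  # ceiling division

MB = 1024 * 1024
# A 1 GB file with 128 MB blocks yields 8 splits (8 map tasks) by default:
print(num_splits(1024 * MB, 128 * MB))                     # 8
# Capping the split size at 64 MB doubles the map tasks (finer parallelism,
# but more per-task scheduling overhead):
print(num_splits(1024 * MB, 128 * MB, max_size=64 * MB))   # 16
```

This is the trade-off the experiments probe: larger blocks and splits mean fewer, longer map tasks (less scheduling overhead, less parallelism), while smaller splits increase parallelism at the cost of more task startup overhead, which shows up in the CPU, memory (RSS/VSZ), and completion-time metrics the authors capture.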