Balraj Singh, H. Verma, Gulshan Kumar, Hye-jin Kim
{"title":"一种基于组件分析和hamming距离的Hadoop集群内容拆分和合并算法","authors":"Balraj Singh, H. Verma, Gulshan Kumar, Hye-jin Kim","doi":"10.1504/ijtpm.2019.10025765","DOIUrl":null,"url":null,"abstract":"Distributed storage and processing of dataset of big data have become an integrated component of data science. With the technology progress towards the Internet of Things (IoTs), big data becomes more important. Therefore, processing of such data needs utmost concern for the ease of availability and accuracy. Various research has been executed till date for the efficient use of splitting and merging of content in the processing of data. But, somehow they lack in the generation of proper clusters in Hadoop. In this paper, we have shown an efficient approach of using splitting and merging process of data processing. We have used component analysis and hamming distance to generate thee clusters depending on the split values which is novel in this domain of work. The experimented results of our proposed approach provide better efficiency in term of discrete clusters and time consumption.","PeriodicalId":55889,"journal":{"name":"International Journal of Technology, Policy and Management","volume":"1 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2019-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"An improved content splitting and merging algorithm for Hadoop clusters using component analysis and hamming distance\",\"authors\":\"Balraj Singh, H. Verma, Gulshan Kumar, Hye-jin Kim\",\"doi\":\"10.1504/ijtpm.2019.10025765\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Distributed storage and processing of dataset of big data have become an integrated component of data science. With the technology progress towards the Internet of Things (IoTs), big data becomes more important. Therefore, processing of such data needs utmost concern for the ease of availability and accuracy. 
Various research has been executed till date for the efficient use of splitting and merging of content in the processing of data. But, somehow they lack in the generation of proper clusters in Hadoop. In this paper, we have shown an efficient approach of using splitting and merging process of data processing. We have used component analysis and hamming distance to generate thee clusters depending on the split values which is novel in this domain of work. The experimented results of our proposed approach provide better efficiency in term of discrete clusters and time consumption.\",\"PeriodicalId\":55889,\"journal\":{\"name\":\"International Journal of Technology, Policy and Management\",\"volume\":\"1 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-12-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Technology, Policy and Management\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1504/ijtpm.2019.10025765\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"Business, Management and Accounting\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Technology, Policy and Management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1504/ijtpm.2019.10025765","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"Business, Management and Accounting","Score":null,"Total":0}
An improved content splitting and merging algorithm for Hadoop clusters using component analysis and hamming distance
Distributed storage and processing of big data have become an integral component of data science. As technology progresses towards the Internet of Things (IoT), big data grows in importance, so its processing demands close attention to availability and accuracy. Considerable research to date has addressed the efficient splitting and merging of content during data processing, but existing approaches fall short in generating proper clusters in Hadoop. In this paper, we present an efficient approach to the splitting and merging stages of data processing. We use component analysis and Hamming distance to generate the clusters based on the split values, which is novel in this domain. Experimental results show that our approach improves efficiency in terms of cluster separation and time consumption.
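The abstract gives no implementation details, so the following is only a minimal sketch of the general idea it names: reduce records with component analysis, binarize the reduced representation, and assign records to clusters by Hamming distance. Every function name, the toy data, the median-threshold binarization, and the seeding of cluster centers are assumptions for illustration, not the authors' method.

```python
import numpy as np

def pca_reduce(X, k=1):
    # Center the data and project onto the top-k principal components (via SVD).
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def binarize(Z):
    # Threshold each reduced dimension at its median to get binary codes.
    # (An assumed binarization step; the paper does not specify one.)
    return (Z > np.median(Z, axis=0)).astype(int)

def hamming(a, b):
    # Hamming distance between two equal-length binary code vectors.
    return int(np.sum(a != b))

def assign_clusters(codes, centers):
    # Assign each record's code to the nearest center by Hamming distance.
    return [min(range(len(centers)), key=lambda c: hamming(code, centers[c]))
            for code in codes]

# Toy data: six records, four features, forming two loose groups.
X = np.array([[1.0, 0.9, 0.1, 0.0],
              [0.9, 1.1, 0.0, 0.1],
              [1.1, 1.0, 0.2, 0.1],
              [0.0, 0.1, 1.0, 0.9],
              [0.1, 0.0, 0.9, 1.1],
              [0.2, 0.1, 1.1, 1.0]])
codes = binarize(pca_reduce(X, k=1))
centers = [codes[0], codes[3]]          # hypothetical seed centers
labels = assign_clusters(codes, centers)
```

Here the first principal component separates the two groups of records, so the binary codes of the first three rows agree with each other and differ from those of the last three, and the Hamming-distance assignment recovers the two groups. The number of components `k` and the choice of seed centers are tunable assumptions.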
Journal description:
IJTPM is a refereed international journal that provides a professional and scholarly forum in the emerging field of decision making and problem solving in the integrated area of technology policy and management at the operational, organisational and public policy levels.