An improved content splitting and merging algorithm for Hadoop clusters using component analysis and Hamming distance

JCR: Q4, Business, Management and Accounting

International Journal of Technology, Policy and Management. Pub Date: 2019-12-07. DOI: 10.1504/ijtpm.2019.10025765

Balraj Singh, H. Verma, Gulshan Kumar, Hye-jin Kim


Citations: 0

Abstract



Distributed storage and processing of big data sets have become an integral component of data science. As technology progresses towards the Internet of Things (IoT), big data becomes more important. Processing such data therefore demands close attention to ease of availability and accuracy. Various studies to date have examined the efficient use of content splitting and merging in data processing, but they fall short in generating proper clusters in Hadoop. In this paper, we present an efficient approach to the splitting and merging process in data processing. We use component analysis and Hamming distance to generate the clusters depending on the split values, which is novel in this domain of work. The experimental results of our proposed approach show better efficiency in terms of discrete clusters and time consumption.
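The abstract does not spell out how the clusters are built, so the following is only a minimal sketch of the general idea of grouping items by Hamming distance, not the authors' algorithm. It assumes each split has already been encoded as a fixed-length bit string; the function names `hamming_distance` and `group_by_hamming`, the greedy first-fit grouping rule, and the `threshold` parameter are all hypothetical choices made for illustration.

```python
from typing import List


def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length inputs")
    return sum(x != y for x, y in zip(a, b))


def group_by_hamming(codes: List[str], threshold: int) -> List[List[str]]:
    """Greedy first-fit grouping (an assumption, not the paper's method).

    Each code joins the first group whose representative (its first
    member) is within `threshold` bits; otherwise it starts a new group.
    """
    groups: List[List[str]] = []
    for code in codes:
        for group in groups:
            if hamming_distance(code, group[0]) <= threshold:
                group.append(code)
                break
        else:
            # No existing group was close enough.
            groups.append([code])
    return groups


if __name__ == "__main__":
    # Hypothetical bit-string encodings of five splits.
    splits = ["10110", "10111", "00001", "00011", "11110"]
    print(group_by_hamming(splits, threshold=1))
    # [['10110', '10111', '11110'], ['00001', '00011']]
```

With a threshold of 1, the five sample codes fall into two groups. A real implementation would also need a way to derive the encodings from the data (the role the abstract appears to give to component analysis), which is left out here.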


Source journal

International Journal of Technology, Policy and Management. Subject area: Business, Management and Accounting (all)

CiteScore

1.00

Self-citation rate

0.00%

Number of articles published

About the journal: IJTPM is a refereed international journal that provides a professional and scholarly forum in the emerging field of decision making and problem solving in the integrated area of technology policy and management at the operational, organisational and public policy levels.