An improved content splitting and merging algorithm for Hadoop clusters using component analysis and hamming distance

Q4 Business, Management and Accounting
Balraj Singh, H. Verma, Gulshan Kumar, Hye-jin Kim
DOI: 10.1504/ijtpm.2019.10025765
URL: https://doi.org/10.1504/ijtpm.2019.10025765
Journal: International Journal of Technology, Policy and Management
Publication date: 2019-12-07
Publication type: Journal Article
Citations: 0

Abstract

Distributed storage and processing of big data sets have become an integral component of data science. As technology progresses towards the Internet of Things (IoT), big data becomes more important. Processing such data therefore demands close attention to availability and accuracy. Various studies to date have addressed the efficient use of content splitting and merging in data processing, but they fall short in generating proper clusters in Hadoop. In this paper, we present an efficient approach to the splitting and merging process in data processing. We use component analysis and Hamming distance to generate the clusters depending on the split values, which is novel in this domain. The experimental results show that the proposed approach provides better efficiency in terms of discrete clusters and time consumption.
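The abstract does not describe how the Hamming distance is applied, so the following is only a minimal, generic sketch of the measure itself and of one naive way to group items with it. It assumes each split is represented by an equal-length binary signature; the function names, the greedy grouping rule, and the threshold are hypothetical illustrations, not the authors' algorithm.

```python
from typing import List


def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires equal-length inputs")
    return sum(x != y for x, y in zip(a, b))


def group_by_hamming(signatures: List[str], threshold: int) -> List[List[str]]:
    """Greedy grouping (an assumption, not the paper's method).

    Each signature joins the first group whose representative (its first
    member) lies within `threshold`; otherwise it starts a new group.
    """
    groups: List[List[str]] = []
    for sig in signatures:
        for group in groups:
            if hamming_distance(sig, group[0]) <= threshold:
                group.append(sig)
                break
        else:
            # No existing group was close enough.
            groups.append([sig])
    return groups


if __name__ == "__main__":
    # Hypothetical 4-bit signatures standing in for split values.
    splits = ["1100", "1101", "0011", "0111"]
    print(group_by_hamming(splits, threshold=1))
```

A lower threshold yields more, tighter groups; a higher one merges more signatures into each group. How the paper combines this measure with component analysis is not stated in the abstract.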
Source journal
International Journal of Technology, Policy and Management
Subject area: Business, Management and Accounting (all)
CiteScore: 1.00
Self-citation rate: 0.00%
Articles published: 24
Journal description: IJTPM is a refereed international journal that provides a professional and scholarly forum in the emerging field of decision making and problem solving in the integrated area of technology policy and management at the operational, organisational and public policy levels.