Addressing Challenges of Hadoop for BIG Data Analysis

2018 International Conference on Communication, Computing and Internet of Things (IC3IoT). Pub Date: 2018-02-01. DOI: 10.1109/IC3IOT.2018.8668136

R. Khullar, Tushar Sharma, T. Choudhury, R. Mittal


Citations: 2

Abstract

Data has become a necessary part of every individual, industry, economy, business function, and organization. Industries, machines, and institutions of all kinds are expanding their analytical data in the digital world at a very high rate. As these data sets grow, selecting the relevant information becomes a laborious task. The on-command, on-demand nature of the digital universe has therefore given rise to a category of data called Big Data, named for its sheer velocity, volume, and variety. The term is used to distinguish datasets whose size exceeds the ability of database software tools to manage, evaluate, and store. Big Data poses distinctive computational and analytical challenges, including measurement errors, scalability and storage bottlenecks, and noise accumulation. Because of these characteristics, such data are stored in a distributed file system, the Hadoop Distributed File System (HDFS). However, Hadoop is fairly complex. Because Hadoop is new to many users, this paper discusses the important challenges and issues faced during data mining and deployment of the file system. The aim of this paper is to make users comfortable with Hadoop.

