Job starvation avoidance with alleviation of data skewness in Big Data infrastructure

Sankari Subbiah, S. Mala, S. Nayagam
Published in: 2017 2nd International Conference on Computing and Communications Technologies (ICCCT), 2017-02-01
DOI: 10.1109/ICCCT2.2017.7972264 (https://doi.org/10.1109/ICCCT2.2017.7972264)
Citations: 3

Abstract

With demand for big data surging, Hadoop is a cloud-based platform that has been heavily promoted as a solution to the business world's big data problems. Jobs over large data sets are executed in parallel through MapReduce in the Hadoop cluster. The completion time of a job depends on the slowest-running task in that job: if one particular task (the "delayer") takes longer to finish, the entire job is prolonged. An inequality in the amount of data allocated to each individual task is referred to as data skewness. An efficient dynamic data splitting approach on Hadoop, called the Hybrid scheduler, monitors samples while batch jobs are running and allocates resources to slaves depending on the complexity of the data and the time taken for processing. In this paper, the effectiveness of web swarming is showcased using Hadoop in Distributed Denial of Service (DDoS) attack detection scenarios on web servers. Query processing, which is done through MapReduce in traditional Hadoop clusters, is replaced by the proposed Block chain query processing algorithm. This improves the execution time of the assigned task in the proposed system and mitigates data skewness. The main aim of this paper is to avoid job starvation, thereby minimizing response time during processing and mitigating data skewness in the existing system.
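The relationship between data skewness and job completion time described in the abstract can be illustrated with a minimal sketch. This is not the paper's Hybrid scheduler or its query processing algorithm; the function names, the split sizes, and the uniform per-record cost below are all hypothetical, chosen only to show that a job ends when its slowest task ends and that evening out the splits shortens it.

```python
from typing import List

# Hypothetical uniform processing cost, in milliseconds per record.
MS_PER_RECORD = 10


def job_completion_time(task_times: List[int]) -> int:
    """A parallel job finishes only when its slowest task finishes."""
    return max(task_times)


def skew_ratio(split_sizes: List[int]) -> float:
    """Largest split divided by the mean split size; 1.0 means balanced."""
    mean = sum(split_sizes) / len(split_sizes)
    return max(split_sizes) / mean


def rebalance(split_sizes: List[int]) -> List[int]:
    """Redistribute the same total number of records as evenly as possible."""
    total, n = sum(split_sizes), len(split_sizes)
    base, extra = divmod(total, n)
    return [base + 1 if i < extra else base for i in range(n)]


# Illustrative split sizes (records per task); the last task is the straggler.
skewed = [100, 120, 110, 870]
balanced = rebalance(skewed)

skewed_time = job_completion_time([s * MS_PER_RECORD for s in skewed])
balanced_time = job_completion_time([s * MS_PER_RECORD for s in balanced])

print("skew ratio before:", skew_ratio(skewed))
print("skew ratio after: ", skew_ratio(balanced))
print("job time before (ms):", skewed_time)
print("job time after (ms): ", balanced_time)
```

In this toy model the total amount of work is unchanged; only its distribution across tasks differs, which is why the completion time falls once the largest split no longer dominates.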