Soft Real-Time Hadoop Scheduler for Big Data Processing in Smart Cities

Ciprian Barbieru, Florin Pop
Published in: 2016 IEEE 30th International Conference on Advanced Information Networking and Applications (AINA)
Publication date: 2016-03-23
DOI: 10.1109/AINA.2016.122 (https://doi.org/10.1109/AINA.2016.122)
Citations: 16

Abstract

We live in a world where every electronic device generates data, and does so in a variety of ways that follow patterns particular to each device and user. Some users use their phone to browse the Internet on their daily commute, some check it for updates every hour, and some use it constantly throughout the day to accomplish different tasks. Even the same device can be used in a variety of ways, let alone different devices. Besides user-generated data, there is also machine-generated data, which can follow a more predictable pattern, such as nightly backups or scheduled tasks, but usually implies more CPU- or I/O-intensive tasks than the sporadic ones generated by human users. In a context where the size of the analyzed data is constantly increasing and Big Data appears in more and more daily tasks, we need a way to handle all these diverse tasks, which serve a variety of purposes. Some of this data must be analyzed as fast as possible, while in other cases the analysis can be done at the end of the day as part of a batch process. To handle this diversity, we design a soft real-time job scheduler for Hadoop for Big Data processing that addresses the problem of small tasks that need to be executed in real time and, at the same time, accommodates long-running jobs whose completion time is not as strictly defined. The case study supports Smart City applications in which data is gathered, routed, and stored via mobile devices and processed and disseminated via more standard Clouds.
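The abstract does not describe the scheduling policy itself, so the following is only a minimal sketch of one way such a mixed workload could be ordered, not the authors' design. It assumes an earliest-deadline-first rule for small time-sensitive tasks, with batch jobs (no deadline) served in submission order once no deadline-bound task is waiting. The names `Job`, `SoftRealTimeQueue`, and the example job names are all hypothetical, and nothing here uses the Hadoop API.

```python
import heapq
import itertools
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Job:
    """Hypothetical job record; deadline=None marks a batch job."""
    name: str
    deadline: Optional[float] = None  # seconds from now (assumed unit)


class SoftRealTimeQueue:
    """Orders jobs by earliest deadline; batch jobs run last, in FIFO order."""

    def __init__(self) -> None:
        self._heap: List[Tuple[float, int, Job]] = []
        # Monotonic counter: breaks ties so equal keys keep submission order.
        self._seq = itertools.count()

    def submit(self, job: Job) -> None:
        # Batch jobs get an infinite key, so any deadline-bound job precedes them.
        key = float("inf") if job.deadline is None else job.deadline
        heapq.heappush(self._heap, (key, next(self._seq), job))

    def next_job(self) -> Optional[Job]:
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


queue = SoftRealTimeQueue()
queue.submit(Job("nightly-backup"))               # batch
queue.submit(Job("traffic-alert", deadline=2.0))  # time-sensitive
queue.submit(Job("daily-report"))                 # batch
queue.submit(Job("sensor-query", deadline=0.5))   # time-sensitive

order = []
job = queue.next_job()
while job is not None:
    order.append(job.name)
    job = queue.next_job()
print(order)
```

In this sketch the two deadline-bound jobs are dequeued first, earliest deadline first, followed by the batch jobs in the order they were submitted. A steady stream of deadline-bound tasks would starve the batch jobs here; a practical scheduler would need some form of aging or reserved capacity, which this example leaves out.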