Soft Real-Time Hadoop Scheduler for Big Data Processing in Smart Cities
Ciprian Barbieru, Florin Pop
2016 IEEE 30th International Conference on Advanced Information Networking and Applications (AINA), 2016-03-23
DOI: 10.1109/AINA.2016.122 (https://doi.org/10.1109/AINA.2016.122)
Citations: 16
Abstract
We live in a world where every electronic device generates data, and does so in a variety of ways that follow patterns particular to each device and user. Some users use their phone to browse the Internet on their daily commute, some check it for updates every hour, and some use it constantly throughout the day to accomplish different tasks. Even the same device can be used in a variety of ways, let alone different devices. Besides user-generated data, there is also machine-generated data, which can follow a more predictable pattern, like nightly backups or scheduled tasks, but usually implies more CPU- or I/O-intensive tasks than the sporadic ones generated by human users. In a context where the size of the analyzed data is constantly increasing and Big Data appears in more and more daily tasks, we need a way to handle all these diverse tasks that serve a variety of purposes. Some of this data must be analyzed as fast as possible, while in other cases the analysis can be done at the end of the day, as part of a batch process. To handle this diversity, we design a real-time job scheduler in Hadoop for Big Data processing that addresses the problem of small tasks that need to be executed in real time and, at the same time, adjusts for long-running jobs whose completion time is not as strictly defined. The case study is applied as support for Smart City applications whose data is gathered, routed, and stored via mobile devices and processed and diffused via more standard Clouds.