Peter Braun, A. Cuzzocrea, C. Leung, Adam G. M. Pazdor, Joglas Souza, S. Tanbeer
{"title":"Pattern Mining from Big IoT Data with Fog Computing: Models, Issues, and Research Perspectives","authors":"Peter Braun, A. Cuzzocrea, C. Leung, Adam G. M. Pazdor, Joglas Souza, S. Tanbeer","doi":"10.1109/CCGRID.2019.00075","DOIUrl":"https://doi.org/10.1109/CCGRID.2019.00075","url":null,"abstract":"As we are living in the era of big data, huge volumes of a wide variety of complex data, which can be of different levels of veracity, are generated or collected at a high velocity from rich sources of data in various real-life applications. One rich source of such big data is the Internet of Things (IoT), which includes a collection of sensors, smartphones and other mobile devices, wearable devices, as well as other \"things\" that are capable of operating within the existing Internet infrastructure. Embedded in these big data are valuable knowledge and useful information. Hence, the research problem of data mining from big IoT data has drawn the attention of many researchers, as it aims to discover implicit, previously unknown and potentially useful information and knowledge from the data. For instance, frequent pattern mining finds sets of frequently co-occurring items in the IoT domains. Associative classification discovers rules revealing relationships among items within the frequent patterns and their associations with the corresponding class labels. Induction-based classification uses decision trees or random forests to learn from historical big IoT data for classifying or making predictions on new data. Over the past quarter of a century, many serial, distributed, parallel, and MapReduce-based (Hadoop-based and Spark-based) big data mining algorithms have been proposed. These algorithms are run on local computers, distributed and parallel environments, clusters, grids, clouds and/or data centers. In this paper, we review some of these algorithms and discuss issues and research perspectives in mining classification patterns from these big IoT data in fog. 
Our case study on a real-life application shows the feasibility of classifying real-life big IoT data over fog for urban analytics.","PeriodicalId":234571,"journal":{"name":"2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"249 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123583079","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Qingzhi Liu, Long Cheng, T. Ozcelebi, John Murphy, J. Lukkien
{"title":"Deep Reinforcement Learning for IoT Network Dynamic Clustering in Edge Computing","authors":"Qingzhi Liu, Long Cheng, T. Ozcelebi, John Murphy, J. Lukkien","doi":"10.1109/CCGRID.2019.00077","DOIUrl":"https://doi.org/10.1109/CCGRID.2019.00077","url":null,"abstract":"Processing big data generated in large Internet of Things (IoT) networks challenges current techniques. To date, many network clustering approaches have been proposed to improve the performance of data collection in IoT. However, most of them focus on partitioning networks with static topologies, and thus they are not optimal in handling the case with moving objects in the networks. Moreover, to the best of our knowledge, none of them has ever considered the performance of computing in edge servers. To solve these problems, we propose a highly efficient IoT network dynamic clustering solution in edge computing using deep reinforcement learning (DRL). Our approach can both fulfill the data communication requirements from IoT networks and load-balancing requirements from edge servers, and thus provide a great opportunity for future high-performance IoT data analytics. 
We implement our approach using a Deep Q-learning Network (DQN) model, and our preliminary experimental results show that the DQN solution can achieve higher scores in cluster partitioning compared with the current static benchmark solution.","PeriodicalId":234571,"journal":{"name":"2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120982592","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abel Souza, Mohamad Rezaei, E. Laure, Johan Tordsson
{"title":"Hybrid Resource Management for HPC and Data Intensive Workloads","authors":"Abel Souza, Mohamad Rezaei, E. Laure, Johan Tordsson","doi":"10.1109/CCGRID.2019.00054","DOIUrl":"https://doi.org/10.1109/CCGRID.2019.00054","url":null,"abstract":"High Performance Computing (HPC) and Data Intensive (DI) workloads have been executed on separate clusters using different tools for resource and application management. With increasing convergence, where modern applications are composed of both types of jobs in complex workflows, this separation becomes a growing overhead and the need for a common platform increases. Executing both workload classes on the same clusters not only enables hybrid workflows, but can also increase system efficiency, as available hardware often is not fully utilized by applications. While HPC systems are typically managed in a coarse-grained fashion, with exclusive resource allocations, DI systems employ a finer-grained regime, enabling dynamic allocation and control based on application needs. On the path to full convergence, a useful and less intrusive step is a hybrid resource management system allowing the execution of DI applications on top of standard HPC scheduling systems. In this paper, we present the architecture of a hybrid system enabling dual-level scheduling for DI jobs in HPC infrastructures. Our system takes advantage of real-time resource profiling to efficiently co-schedule HPC and DI applications. The architecture is easily extensible to current and new types of distributed applications, allowing efficient combination of hybrid workloads on HPC resources with increased job throughput and higher overall resource utilization. The implementation is based on the Slurm and Mesos resource managers for HPC and DI jobs. 
Experimental evaluations in a real cluster based on a set of representative HPC and DI applications demonstrate that our hybrid architecture improves resource utilization by 20%, with a 12% decrease in queue makespan, while still meeting all deadlines for HPC jobs.","PeriodicalId":234571,"journal":{"name":"2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128097390","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Welcome Message from the CCGRID 2019 Workshop/Tutorial Chairs","authors":"G. Pallis, A. Toosi, B. Sotomayor","doi":"10.1109/ccgrid.2019.00007","DOIUrl":"https://doi.org/10.1109/ccgrid.2019.00007","url":null,"abstract":"Welcome to the 19th Annual IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2019)! At this year’s CCGRID, the conference’s main program is supplemented by a diverse offering of workshops and tutorials that will allow you to explore a variety of topics in greater depth. On Wednesday, May 14th, we are hosting seven workshops that will include paper presentations, keynotes, invited talks, and discussions:","PeriodicalId":234571,"journal":{"name":"2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134311528","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"[Publisher's information]","authors":"","doi":"10.1109/ccgrid.2019.00091","DOIUrl":"https://doi.org/10.1109/ccgrid.2019.00091","url":null,"abstract":"","PeriodicalId":234571,"journal":{"name":"2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117060408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proximity-Aware Traffic Routing in Distributed Fog Computing Platforms","authors":"Alice Fahs, G. Pierre","doi":"10.1109/CCGRID.2019.00062","DOIUrl":"https://doi.org/10.1109/CCGRID.2019.00062","url":null,"abstract":"Container orchestration engines such as Kubernetes do not take into account the geographical location of application replicas when deciding which replica should handle which request. This makes them ill-suited to act as general-purpose fog computing platforms, where the proximity between end users and the replica serving them is essential. We present proxy-mity, a proximity-aware traffic routing system for distributed fog computing platforms. It integrates seamlessly into Kubernetes and provides very simple control mechanisms to allow system administrators to address the necessary trade-off between reducing the user-to-replica latencies and balancing the load equally across replicas. proxy-mity is very lightweight and it can reduce average user-to-replica latencies by as much as 90% while allowing the system administrators to control the level of load imbalance in their system.","PeriodicalId":234571,"journal":{"name":"2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122207408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mukhtiar Bano, Umar Ahmad Qureshi, R. N. B. Rais, M. Tufail, A. Qayyum
{"title":"Miracle: An Agile Colocation Platform for Enabling XaaS Cloud Architecture","authors":"Mukhtiar Bano, Umar Ahmad Qureshi, R. N. B. Rais, M. Tufail, A. Qayyum","doi":"10.1109/CCGRID.2019.00078","DOIUrl":"https://doi.org/10.1109/CCGRID.2019.00078","url":null,"abstract":"Cloud Service Providers (CSPs) offer services to large-scale business organizations using different cloud frameworks, including Platform as a Service (PaaS), Software as a Service (SaaS), and Infrastructure as a Service (IaaS), and follow various business models to realize profitability while addressing the challenges regarding scalability, guaranteed connectivity and security. In order to address such issues, colocation service providers (third-party organizations) lease physical infrastructure services in geographically distributed locations known as colocation points (Co-Lo). Organizations eventually request services available within a given colocation facility. Such a solution is challenging in that not all service offerings are available from a given Co-Lo provider. The proposed platform 'Miracle' addresses the aforementioned challenges by providing agility and cost reduction to organizations aiming to leverage colocation services. Our solution leverages Software Defined Networking along with service and performance awareness technology to virtualize the organization's presence in a Co-Lo. Hence, an organization can be associated or re-associated on the fly with the required Co-Lo according to its service connectivity needs at a given time. 
Our solution may benefit the cloud services industry and its customers while adopting open-source computing technologies such as virtualization, dynamic provisioning environments, and multi-tenancy, resulting in increased efficiency in a profitable and cost-effective manner.","PeriodicalId":234571,"journal":{"name":"2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130683982","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data Driven Priority Scheduling on a Spark Streaming System","authors":"Tobi Ajila, S. Majumdar","doi":"10.1109/CCGRID.2019.00072","DOIUrl":"https://doi.org/10.1109/CCGRID.2019.00072","url":null,"abstract":"Big data has become essential for businesses as it enables companies and organizations to gather insights from their data and use it to determine marketing opportunities, assist decision-making or even to find new business opportunities. Companies spend a great deal of effort collecting large amounts of data, which in some cases must be processed in real-time in order to capitalize on business opportunities. Predicting the expected input load at a given point in time can be very difficult and sometimes impossible. As a result, a great deal of effort is put into creating techniques to address varying input loads. A widely used approach is dynamic resource provisioning, but resource provisioners may not react in time to address the resource shortage which can result in increased processing latencies. This paper presents a priority scheduling technique that can be used in conjunction with dynamic and static resource provisioning. This approach allows users to assign a priority to input data items. The scheduler ensures that higher priority data items are given precedence over lower priority data items. This means that when resources become constrained the higher priority data items receive a greater share of resources and experience lower queueing delays in comparison to low priority items. 
A prototype for the data driven priority scheduler is implemented on the Spark Streaming system.","PeriodicalId":234571,"journal":{"name":"2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114095085","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Oleksii Serhiienko, Panagiotis Gkikopoulos, Josef Spillner
{"title":"Extensible Declarative Management of Cloud Resources across Providers","authors":"Oleksii Serhiienko, Panagiotis Gkikopoulos, Josef Spillner","doi":"10.1109/CCGRID.2019.00087","DOIUrl":"https://doi.org/10.1109/CCGRID.2019.00087","url":null,"abstract":"Tags and labels are annotations on resources in many commercial public cloud models. Little is known about the extent of tagging in commercially relevant settings and there is an absence of automated software to handle tags. We show that by introducing an extensible tag management middleware based on cloud functions, tags can be turned into a powerful declarative means of cloud management. Our universal connector middleware is demonstrated by a typical deployment administration scenario involving both AWS and Google Cloud Platform services.","PeriodicalId":234571,"journal":{"name":"2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)","volume":"85 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114378718","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}