Title: The impact of data locality on the performance of a SaaS cloud with real-time data-intensive applications
Authors: Georgios L. Stavrinides, H. Karatza
Venue: 2017 IEEE/ACM 21st International Symposium on Distributed Simulation and Real Time Applications (DS-RT)
Publication date: 2017-10-18
DOI: 10.1109/DISTRA.2017.8167683
URL: https://doi.org/10.1109/DISTRA.2017.8167683
Citations: 15
Abstract
As cloud computing continues to gain momentum, big data analytics are now offered as Software as a Service (SaaS). Besides the heterogeneity and multi-tenancy of the underlying virtualized environment, scheduling such real-time, data-intensive, embarrassingly parallel applications in a SaaS cloud involves another serious challenge: data locality. Data-aware scheduling policies should therefore be employed to exploit data locality effectively, while also taking into account the other attributes of the workload and the characteristics of the resources. To this end, we investigate via simulation the impact of data locality on the performance of a SaaS cloud in which real-time, data-intensive bags-of-tasks are scheduled dynamically under various data availability conditions. A non-data-aware baseline scheduling policy is compared with two proposed data-aware heuristics to shed light on the effect of data locality awareness on system performance.
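
The abstract does not describe the baseline policy or the two data-aware heuristics in detail, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes a hypothetical model in which each VM has a processing speed and a set of locally cached datasets, each task of a bag-of-tasks needs one input dataset, and remote data is fetched at a fixed, made-up bandwidth. All names (`VM`, `Task`, `BANDWIDTH`, `schedule`) are invented for illustration, and deadlines are left out for brevity. The non-data-aware variant ranks VMs by estimated computation finish time only, while the data-aware variant also counts the input transfer time.

```python
from dataclasses import dataclass, field

# Assumed constant: bandwidth for fetching non-local input data, in MB/s.
BANDWIDTH = 100.0


@dataclass
class VM:
    name: str
    speed: float                # work units processed per second (hypothetical)
    free_at: float = 0.0        # time at which the VM becomes idle
    local_data: set = field(default_factory=set)  # ids of datasets stored locally


@dataclass
class Task:
    name: str
    work: float       # computational demand, in work units
    dataset: str      # id of the required input dataset
    data_size: float  # input size in MB


def transfer_time(task, vm):
    """Time to fetch the task's input; zero if the dataset is already local."""
    return 0.0 if task.dataset in vm.local_data else task.data_size / BANDWIDTH


def finish_time(task, vm, now, data_aware):
    """Estimated finish time; the non-data-aware estimate ignores transfers."""
    start = max(now, vm.free_at)
    fetch = transfer_time(task, vm) if data_aware else 0.0
    return start + fetch + task.work / vm.speed


def schedule(tasks, vms, now=0.0, data_aware=True):
    """Greedily place each task on the VM with the best estimated finish time."""
    placement = {}
    for task in tasks:
        best = min(vms, key=lambda vm: finish_time(task, vm, now, data_aware))
        # Whatever the policy estimated, the real run pays the transfer cost.
        best.free_at = finish_time(task, best, now, data_aware=True)
        best.local_data.add(task.dataset)  # fetched data is now cached locally
        placement[task.name] = (best.name, best.free_at)
    return placement
```

With a fast VM that lacks the data and a slower VM that already holds it, the non-data-aware estimate picks the fast VM and then pays for the transfer, whereas the data-aware estimate can prefer the slower VM when the transfer would dominate. This is one way to see why locality awareness may matter for deadline-constrained workloads.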