{"title":"RLSK: A Job Scheduler for Federated Kubernetes Clusters based on Reinforcement Learning","authors":"Jiaming Huang, C. Xiao, Weigang Wu","doi":"10.1109/IC2E48712.2020.00019","DOIUrl":"https://doi.org/10.1109/IC2E48712.2020.00019","url":null,"abstract":"Job scheduling in clusters is often considered a difficult online decision-making problem, and its solution depends largely on the understanding of the workload and environment. People usually first propose a simple heuristic scheduling algorithm, and then perform repeated and tedious manual tests and adjustments based on the characteristics of the workload to gradually improve the algorithm. In this work, focusing on multi-cluster environments, load balancing and efficient scheduling, we present RLSK, a deep reinforcement learning based job scheduler for adaptively scheduling independent batch jobs among multiple federated cloud computing clusters. By directly specifying high-level scheduling targets, RLSK interacts with the system environment and automatically learns scheduling strategies from experience without assuming any prior knowledge of the underlying multi-cluster environment or human instructions, which avoids people’s tedious testing and tuning work. We implement our scheduler based on Kubernetes, and conduct simulations to evaluate the performance of our design. 
The results show that RLSK can outperform traditional scheduling algorithms.","PeriodicalId":173494,"journal":{"name":"2020 IEEE International Conference on Cloud Engineering (IC2E)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133701067","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On The Scalability of Blockchain Systems","authors":"N. Sohrabi, Z. Tari","doi":"10.1109/IC2E48712.2020.00020","DOIUrl":"https://doi.org/10.1109/IC2E48712.2020.00020","url":null,"abstract":"Blockchain, as a promising solution to develop secure distributed ledgers, has drawn huge attention over the last decade. By introducing a pseudonymous payment model with no central authority, blockchain marked the new generation of online payment systems, known as cryptocurrencies. For most of the existing cryptocurrencies, scalability has become a challenging problem. When dealing with an ever increasing number of users, miners, and transactions, the technology is unable to scale and provide the same performance as centralised systems (e.g. centralised payment systems). Without addressing this fundamental scalability problem, such a promising technology may not be adopted in the mainstream. This paper provides an attempt to analyse the scalability of existing blockchain protocols and look at the major factors affecting scalability, namely throughput and latency. We also describe the HTNZ protocol, a new approach to improve the scalability of Satoshi Nakamoto’s model [1], validated by experimental results. HTNZ introduces two new components, namely, sideBlock and helper. 
SideBlock has a slightly different block structure and increases the number of transactions that can be processed per interval.","PeriodicalId":173494,"journal":{"name":"2020 IEEE International Conference on Cloud Engineering (IC2E)","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124190190","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MDINFERENCE: Balancing Inference Accuracy and Latency for Mobile Applications","authors":"Samuel S. Ogden, Tian Guo","doi":"10.1109/IC2E48712.2020.00010","DOIUrl":"https://doi.org/10.1109/IC2E48712.2020.00010","url":null,"abstract":"Deep Neural Networks are allowing mobile devices to incorporate a wide range of features into user applications. However, the computational complexity of these models makes it difficult to run them effectively on resource-constrained mobile devices. Prior work approached the problem of supporting deep learning in mobile applications by either decreasing model complexity or utilizing powerful cloud servers. These approaches each focus on only a single aspect of mobile inference and thus they often sacrifice overall performance. In this work we introduce a holistic approach to designing mobile deep inference frameworks. We first identify the key goals of accuracy and latency for mobile deep inference and the conditions that must be met to achieve them. We demonstrate our holistic approach through the design of a hypothetical framework called MDINFERENCE. This framework leverages two complementary techniques: a model selection algorithm that chooses from a set of cloud-based deep learning models to improve inference accuracy and an on-device request duplication mechanism to bound latency. Through empirically-driven simulations we show that MDINFERENCE improves aggregate accuracy over static approaches by over 40% without incurring SLA violations. 
Additionally, we show that with a target latency of 250ms, MDINFERENCE increased the aggregate accuracy in 99.74% of cases on faster university networks and 96.84% of cases on residential networks.","PeriodicalId":173494,"journal":{"name":"2020 IEEE International Conference on Cloud Engineering (IC2E)","volume":"100 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116164418","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PERSEUS: Characterizing Performance and Cost of Multi-Tenant Serving for CNN Models","authors":"Matthew LeMay, Shijian Li, Tian Guo","doi":"10.1109/IC2E48712.2020.00014","DOIUrl":"https://doi.org/10.1109/IC2E48712.2020.00014","url":null,"abstract":"Deep learning models are increasingly used for end-user applications, supporting both novel features such as facial recognition, and traditional features, e.g. web search. To accommodate high inference throughput, it is common to host a single pre-trained Convolutional Neural Network (CNN) in dedicated cloud-based servers with hardware accelerators such as Graphics Processing Units (GPUs). However, GPUs can be orders of magnitude more expensive than traditional Central Processing Unit (CPU) servers. These resources could also be under-utilized facing dynamic workloads, which may result in inflated serving costs. One potential way to alleviate this problem is by allowing hosted models to share the underlying resources, which we refer to as multi-tenant inference serving. One of the key challenges is maximizing the resource efficiency for multi-tenant serving given hardware with diverse characteristics, models with unique response time Service Level Agreement (SLA), and dynamic inference workloads. In this paper, we present PERSEUS, a measurement framework that provides the basis for understanding the performance and cost trade-offs of multi-tenant model serving. We implemented PERSEUS in Python atop a popular cloud inference server called Nvidia TensorRT Inference Server. 
Leveraging PERSEUS, we evaluated the inference throughput and cost for serving various models and demonstrated that multi-tenant model serving led to up to 12% cost reduction.","PeriodicalId":173494,"journal":{"name":"2020 IEEE International Conference on Cloud Engineering (IC2E)","volume":"316 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123612162","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}