2020 28th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS): Latest Publications

TrimTuner: Efficient Optimization of Machine Learning Jobs in the Cloud via Sub-Sampling
Pedro Mendes, Maria Casimiro, P. Romano, D. Garlan
DOI: 10.1109/MASCOTS50786.2020.9285971 (https://doi.org/10.1109/MASCOTS50786.2020.9285971)
Published: 2020-11-09
Abstract: This work introduces TrimTuner, the first system for optimizing machine learning jobs in the cloud that exploits sub-sampling techniques to reduce the cost of the optimization process while taking user-specified constraints into account. TrimTuner jointly optimizes the cloud and application-specific parameters and, unlike state-of-the-art work on cloud optimization, avoids training the model on the full training set every time a new configuration is sampled. By leveraging sub-sampling techniques and datasets that are up to 60x smaller than the original one, we show that TrimTuner can reduce the cost of the optimization process by up to 50x. Further, TrimTuner speeds up the recommendation process by 65x with respect to state-of-the-art hyperparameter optimization techniques that use sub-sampling. The reasons for this improvement are twofold: i) a novel domain-specific heuristic that reduces the number of configurations for which the acquisition function has to be evaluated; ii) the adoption of an ensemble of decision trees, which speeds up the recommendation process by one additional order of magnitude.
Citations: 13
COCOA: Cold Start Aware Capacity Planning for Function-as-a-Service Platforms
Alim Ul Gias, G. Casale
DOI: 10.1109/MASCOTS50786.2020.9285966 (https://doi.org/10.1109/MASCOTS50786.2020.9285966)
Published: 2020-07-02
Abstract: Function-as-a-Service (FaaS) has become increasingly popular in the software industry due to the implied cost savings in event-driven workloads and its synergy with DevOps. To size an on-premise FaaS platform, it is important to estimate the CPU and memory capacity required to serve the expected loads. Given the service-level agreements, it is however challenging to take the cold start issue into account during the sizing process. We have investigated the similarity of this problem with the hit rate improvement problem in Time to Live (TTL) caches and concluded that solutions for TTL caches, although potentially applicable, lead to over-provisioning in FaaS. Thus, we propose a novel approach, COCOA, to solve this issue. COCOA uses a queueing-based approach to assess the effect of cold starts on FaaS response times. It also considers different memory consumption values depending on whether the function is idle or in execution. Using an event-driven FaaS simulator, FaasSim, that we have developed, we show that COCOA can reduce over-provisioning by over 70% under some of the workloads we have considered, while satisfying the service-level agreements.
Citations: 22
Effective Elastic Scaling of Deep Learning Workloads
Vaibhav Saxena, K. R. Jayaram, Saurav Basu, Yogish Sabharwal, Ashish Verma
DOI: 10.1109/MASCOTS50786.2020.9285954 (https://doi.org/10.1109/MASCOTS50786.2020.9285954)
Published: 2020-06-24
Abstract: We examine the elastic scaling of Deep Learning (DL) jobs and propose a novel resource allocation strategy for DL training jobs, resulting in improved job run time performance as well as increased cluster utilization. We begin by analyzing DL workloads and exploit the fact that DL jobs can be run with a range of batch sizes without affecting their final accuracy. We formulate an optimization problem that explores a dynamic batch size allocation to individual DL jobs based on their scaling efficiency when running on multiple nodes. We design a fast dynamic-programming-based optimizer to solve this problem in real time to determine jobs that can be scaled up or down, and use this optimizer in an autoscaler to dynamically change the allocated resources and batch sizes of individual DL jobs. We demonstrate empirically that our elastic scaling algorithm can complete up to as many jobs as compared to a strong baseline algorithm that also scales the number of GPUs but does not change the batch size, with average completion times up to faster.
Citations: 9
A Smart Background Scheduler for Storage Systems
Maher Kachmar, D. Kaeli
DOI: 10.1109/MASCOTS50786.2020.9285967 (https://doi.org/10.1109/MASCOTS50786.2020.9285967)
Published: 2020-06-02
Abstract: In today's enterprise storage systems, supported data services such as snapshot delete or drive rebuild can result in tremendous performance overhead if executed inline along with heavy foreground IO, often causing Service Level Objectives (SLOs) to be missed. Typical storage system applications such as Virtual Desktop Infrastructure (VDI) or web services follow a repetitive high/low workload pattern that can be learned and forecasted. We propose a priority-based background scheduler that learns this pattern and allows storage systems to maintain peak performance and meet SLOs while supporting a number of data services. When foreground IO demand intensifies, system resources are dedicated to servicing foreground IO requests, and any background processing that can be deferred is recorded to be processed in future idle cycles, as long as our forecaster predicts that the storage pool has remaining capacity. The smart background scheduler adopts a resource partitioning model that allows both foreground and background IO to execute together as long as foreground IOs are not impacted, harnessing any free cycles to clear background debt. Using traces from VDI and web services applications, we show that our technique can outperform a static policy that sets fixed limits on the deferred background debt, reducing SLO violations from 54.6% (when using a fixed background debt watermark) to only 6.2% when the limit is dynamically adjusted by our smart background scheduler.
Citations: 0
Age of Information in an Overtake-Free Network of Quasi-Reversible Queues
I. Koukoutsidis
DOI: 10.1109/MASCOTS50786.2020.9285958 (https://doi.org/10.1109/MASCOTS50786.2020.9285958)
Published: 2020-05-28
Abstract: We show how to calculate the Age of Information in an overtake-free network of quasi-reversible queues, with exponential exogenous interarrivals of multiple classes of update packets and exponential service times at all nodes. Results are provided for any number of M/M/1 First-Come-First-Served (FCFS) queues in tandem, and for a network with two classes of update packets, entering through different queues in the network and exiting through the same queue. The main takeaway is that in a network with different classes of update packets, individual classes roughly preserve the ages they would achieve if they were alone in the network, except when shared queues become saturated, in which case the ages increase considerably. The results extend to other quasi-reversible queues for which sojourn time distributions are known, such as M/M/c FCFS queues and processor-sharing queues.
Citations: 2