Gecko: Guaranteeing Latency SLO on a Multi-Tenant Distributed Storage System

Z. Leng, D. Jiang, Liuying Ma, Jin Xiong
Published in: 2020 IEEE 26th International Conference on Parallel and Distributed Systems (ICPADS)
Publication date: 2020-12-01
DOI: https://doi.org/10.1109/ICPADS51040.2020.00051
Citations: 1

Abstract

Meeting tail latency Service Level Objectives (SLOs) while achieving high resource utilization is important for distributed storage systems. Recent work adopts strict priority scheduling or constant rate limiting to provide SLO guarantees, but this leaves resources under-utilized. To address this issue, we first analyze the relationship between workload bursts and latency SLOs. Based on burst patterns and latency SLOs, we classify tenants into two categories: Postponement-Tolerable tenants and Postponement-Intolerable tenants. We then explore the opportunity to improve resource utilization by carefully allocating resources to each tenant type. We design the Rate-Limiting-Priority scheduling algorithm to limit the impact of high-priority tenants on low-priority ones. We also propose the Postponement-Aware scheduling algorithm, which allows Postponement-Intolerable tenants to preempt system capacity from Postponement-Tolerable tenants; this helps to increase resource utilization. We propose Gecko, a latency SLO guarantee framework. Gecko guarantees multi-tenant latency SLOs by combining the two proposed scheduling algorithms with an admission control strategy. We evaluate Gecko with real production traces, and the results show that Gecko admits 44% more tenants on average than state-of-the-art techniques while still guaranteeing latency SLOs.
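
The abstract does not spell out how Rate-Limiting-Priority scheduling works internally. As a rough illustration of the general idea only (capping how much a high-priority tenant can crowd out lower-priority ones), the Python sketch below combines a per-tenant token bucket with strict priority. Everything in it is an assumption made for illustration and is not taken from the paper: the `Tenant` fields, the `pick_next` function, the token-bucket mechanism itself, and the fallback rule used when no tenant has tokens left.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Tenant:
    # All fields are hypothetical; the paper's actual tenant model may differ.
    name: str
    priority: int        # lower number = higher priority (assumed convention)
    rate: float          # token refill rate, in requests per second
    burst: float         # token bucket capacity
    tokens: float = 0.0  # tokens currently available
    queue: deque = field(default_factory=deque)  # pending requests, FIFO

    def refill(self, elapsed: float) -> None:
        """Add tokens for `elapsed` seconds, capped at the bucket capacity."""
        self.tokens = min(self.burst, self.tokens + self.rate * elapsed)


def pick_next(tenants):
    """Return (tenant_name, request) for the next dispatch, or None if idle."""
    backlogged = [t for t in tenants if t.queue]
    if not backlogged:
        return None
    # Prefer the highest-priority tenant that is still within its rate limit.
    within_limit = [t for t in backlogged if t.tokens >= 1.0]
    if within_limit:
        chosen = min(within_limit, key=lambda t: t.priority)
        chosen.tokens -= 1.0
    else:
        # Assumed work-conserving fallback: if no tenant has tokens, serve the
        # highest-priority backlog anyway instead of leaving capacity idle.
        chosen = min(backlogged, key=lambda t: t.priority)
    return chosen.name, chosen.queue.popleft()


if __name__ == "__main__":
    hi = Tenant("hi", priority=0, rate=1.0, burst=2.0, tokens=2.0)
    lo = Tenant("lo", priority=1, rate=1.0, burst=2.0, tokens=2.0)
    hi.queue.extend(["h0", "h1", "h2", "h3"])
    lo.queue.extend(["l0", "l1"])
    while (item := pick_next([hi, lo])) is not None:
        print(item)
```

In this sketch, a bursting high-priority tenant is served first only while it has tokens. Once its bucket is empty, a backlogged lower-priority tenant that is still within its own limit goes ahead of it, which is one simple way to bound the interference the abstract describes.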