Tromino: Demand and DRF Aware Multi-Tenant Queue Manager for Apache Mesos Cluster

Pankaj Saha, Angel Beltre, M. Govindaraju
Published in: 2018 IEEE/ACM 11th International Conference on Utility and Cloud Computing (UCC)
Publication date: 2018-12-01
DOI: 10.1109/UCC.2018.00015 (https://doi.org/10.1109/UCC.2018.00015)
Citations: 5

Abstract

Apache Mesos, a two-level resource scheduler, provides resource sharing across multiple users in a multi-tenant cluster environment. Computational resources (e.g., CPU, memory, and disk) are distributed according to the Dominant Resource Fairness (DRF) policy. Mesos frameworks (users) receive resources based on their current usage and are responsible for scheduling their tasks within the allocation. We have observed that multiple frameworks can cause a fairness imbalance in a multi-user environment. For example, a greedy framework that consumes more than its fair share of resources can deny resource fairness to others. The DRF module considers the user with the lowest Dominant Share first when allocating resources. However, the default DRF implementation in the Apache Mesos Master allocation module does not consider the overall resource demands of the tasks queued for each user or framework. This lack of awareness can lead to poor performance: users without any pending tasks may receive more resource offers, while users with a queue of pending tasks can starve because of their high dominant shares. In a multi-tenant environment, cluster managers must understand the characteristics of frameworks and workloads in order to define fairness based not only on resource share but also on resource demand and queue wait time. We have developed a policy-driven queue manager, Tromino, for an Apache Mesos cluster, in which tasks for individual frameworks can be scheduled based on each framework's overall resource demands and current resource consumption. Tromino's awareness of Dominant Share and demand, and scheduling based on these attributes, can reduce (1) the impact of unfairness due to a framework-specific configuration, and (2) unfair waiting time due to higher resource demand in a pending task queue. In the best case, Tromino can significantly reduce the average waiting time of a framework by using the proposed Demand-DRF-aware policy.
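To make the contrast in the abstract concrete, the following is a minimal sketch of a plain DRF ordering next to a demand-aware ordering. It is not Tromino's implementation and does not use any Mesos API: the `Framework` class, the function names, and the rule "skip frameworks whose queue is empty" are all assumptions chosen for illustration, since the abstract does not specify the exact policy.

```python
from dataclasses import dataclass, field


@dataclass
class Framework:
    """Hypothetical stand-in for a Mesos framework (not a Mesos type)."""
    name: str
    allocated: dict                               # resources currently held, e.g. {"cpu": 10}
    pending: list = field(default_factory=list)   # resource demands of queued tasks


def dominant_share(allocated, capacity):
    """Largest fraction of any single cluster resource held by a framework."""
    return max(allocated.get(r, 0.0) / capacity[r] for r in capacity)


def pick_drf(frameworks, capacity):
    """Plain DRF ordering: lowest dominant share first, demand ignored."""
    return min(frameworks, key=lambda f: dominant_share(f.allocated, capacity))


def pick_demand_aware(frameworks, capacity):
    """Assumed demand-aware variant: only frameworks with queued tasks compete."""
    waiting = [f for f in frameworks if f.pending]
    if not waiting:
        return None
    return min(waiting, key=lambda f: dominant_share(f.allocated, capacity))


# Illustrative numbers only; they are not taken from the paper.
capacity = {"cpu": 100.0, "mem": 200.0}
a = Framework("A", {"cpu": 10.0, "mem": 10.0})                      # idle queue
b = Framework("B", {"cpu": 30.0, "mem": 100.0}, [{"cpu": 5.0, "mem": 20.0}])

print(pick_drf([a, b], capacity).name)           # A: lowest dominant share, but nothing to run
print(pick_demand_aware([a, b], capacity).name)  # B: has pending work
```

In this toy setup, framework A holds a dominant share of 0.1 and has no queued tasks, while framework B holds 0.5 and has work waiting. The plain ordering offers resources to A, which is the starvation scenario the abstract describes; the demand-aware ordering offers them to B. A fuller policy would presumably also weigh total queued demand and wait time, which this sketch omits.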