Enterprise Resource Management in Mesos Clusters
Abed Abu Dbai, David Breitgand, G. Gershinsky, A. Glikson, K. Ahmed
Proceedings of the 9th ACM International on Systems and Storage Conference. Published 2016-06-06.
DOI: 10.1145/2928275.2933272 (https://doi.org/10.1145/2928275.2933272)

Enterprise data centers increasingly adopt a cloud-like architecture that runs multiple workloads on a shared pool of resources, reducing the data center footprint and driving down costs. Over the last few years, a number of cluster resource managers have appeared that aim to provide a uniform, technology-neutral substrate for resource representation and management. Examples include Apache YARN, Google Borg and Omega, Apache Mesos, and IBM Platform EGO.

The Apache Mesos project [2] is emerging as a leading open source resource management technology for server clusters. Mesos offers simple yet powerful and flexible APIs, a highly available and fault-tolerant architecture, scalability to large clusters, isolation between tasks using Linux containers, multi-dimensional resource scheduling, the ability to allocate shares of the cluster to roles representing users or user groups, and a clear separation of concerns between the applications (termed frameworks) and the "cluster kernel", which is Mesos itself. The Mesos resource scheduler supports the Dominant Resource Fairness (DRF) [1] scheduling discipline, a generalization of max-min fairness to multiple resource types. DRF harmonizes the execution of workloads with heterogeneous resource demands by applying max-min fairness to each framework's dominant share, that is, the largest fraction of any single resource allocated to that framework.

However, the default Mesos allocation mechanism lacks a number of policy and tenancy capabilities that are important in enterprise deployments. We have investigated the integration of Mesos with the IBM EGO (Enterprise Grid Orchestrator) technology [3], which underpins various high performance computing, analytics and big data clusters in a variety of industry verticals, including financial services, life sciences, manufacturing and electronics. We have designed and implemented an experimental integration prototype and have tested it with SparkBench workloads.
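
To make the DRF discipline mentioned above more concrete, the following Python sketch implements a simplified progressive-filling loop: at each step, the framework with the lowest dominant share is offered one more task. This is only an illustration and not the Mesos allocator. The function name `drf_allocate`, the fixed per-task demand vectors, and the example capacities are all assumptions introduced here, and the sketch deviates from a strict formulation by skipping a framework whose next task no longer fits instead of stopping.

```python
from typing import Dict, List


def drf_allocate(capacity: List[float],
                 demands: Dict[str, List[float]]) -> Dict[str, int]:
    """Return the number of tasks launched per framework.

    capacity: total amount of each resource (assumed strictly positive).
    demands:  hypothetical fixed per-task demand vector of each framework.
    """
    for name, demand in demands.items():
        if len(demand) != len(capacity) or not any(d > 0 for d in demand):
            raise ValueError(f"invalid demand vector for {name!r}")

    used = [0.0] * len(capacity)
    allocated = {name: [0.0] * len(capacity) for name in demands}
    tasks = {name: 0 for name in demands}
    active = set(demands)

    def dominant_share(name: str) -> float:
        # Largest fraction of any single resource held by this framework.
        return max(a / c for a, c in zip(allocated[name], capacity))

    while active:
        # Pick the framework with the lowest dominant share (ties by name).
        name = min(sorted(active), key=dominant_share)
        demand = demands[name]
        if any(u + d > c for u, d, c in zip(used, demand, capacity)):
            # Simplification: this framework cannot grow any further.
            active.remove(name)
            continue
        for i, d in enumerate(demand):
            used[i] += d
            allocated[name][i] += d
        tasks[name] += 1
    return tasks


if __name__ == "__main__":
    # Illustrative cluster: 9 CPUs and 18 GB of memory.
    # Framework A needs <1 CPU, 4 GB> per task, B needs <3 CPUs, 1 GB>.
    print(drf_allocate([9, 18], {"A": [1, 4], "B": [3, 1]}))
    # prints {'A': 3, 'B': 2}
```

In this example both frameworks end with a dominant share of 2/3: A holds 12 of 18 GB of memory, and B holds 6 of 9 CPUs.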
We demonstrate how Mesos can be enriched with new resource policy capabilities required for managing enterprise data centers, such as:
• Capturing the hierarchical structure of an enterprise (organisations, departments, groups, teams, users) by defining the corresponding resource consumer tree;
• A fine-grained resource plan that defines the resource share ratio, ownership, and lending/borrowing policies for each resource consumer;
• A rich set of resource management policies that make use of the hierarchical resource consumer model and provide fairness and isolation to the members of the hierarchy, including the important ability to change allocations dynamically (time-based policy);
• A Web-based GUI providing a centralized console through which the whole cluster is observed and managed. In particular, the cluster-wide resource management policies are applied through this GUI.
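
The resource consumer tree and resource plan described above can be pictured with a small Python sketch. It is a hypothetical model and does not reflect the EGO or Mesos APIs: the `Consumer` class, the `owned_slots` and `allocate_with_borrowing` functions, the single "slot" resource, and the example tree are all assumptions made for illustration. Ownership is split down the tree in proportion to sibling share ratios, and idle owned slots of consumers that are allowed to lend are handed to consumers that are allowed to borrow.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Consumer:
    """A node in a hypothetical resource consumer tree."""
    name: str
    share_ratio: int = 1  # weight relative to sibling consumers
    children: List["Consumer"] = field(default_factory=list)


def owned_slots(node: Consumer, total: float) -> Dict[str, float]:
    """Split `total` slots among leaf consumers by sibling share ratios."""
    if not node.children:
        return {node.name: total}
    weight_sum = sum(child.share_ratio for child in node.children)
    owned: Dict[str, float] = {}
    for child in node.children:
        owned.update(owned_slots(child, total * child.share_ratio / weight_sum))
    return owned


def allocate_with_borrowing(owned: Dict[str, float],
                            demand: Dict[str, float],
                            can_lend: Set[str],
                            can_borrow: Set[str]) -> Dict[str, float]:
    """Give each leaf its demand up to ownership, then lend idle slots."""
    alloc = {n: min(owned[n], demand.get(n, 0.0)) for n in owned}
    # Idle slots of lenders form a shared pool.
    pool = sum(owned[n] - alloc[n] for n in owned if n in can_lend)
    for n in sorted(owned):  # deterministic order; a real policy may differ
        if n in can_borrow and pool > 0:
            extra = min(pool, demand.get(n, 0.0) - alloc[n])
            alloc[n] += extra
            pool -= extra
    return alloc


if __name__ == "__main__":
    org = Consumer("org", children=[
        Consumer("analytics", share_ratio=2, children=[
            Consumer("etl", share_ratio=1),
            Consumer("ml", share_ratio=3),
        ]),
        Consumer("finance", share_ratio=1),
    ])
    owned = owned_slots(org, 120.0)
    print(owned)  # prints {'etl': 20.0, 'ml': 60.0, 'finance': 40.0}
    demand = {"etl": 5, "ml": 90, "finance": 40}
    print(allocate_with_borrowing(owned, demand, {"etl"}, {"ml"}))
    # prints {'etl': 5, 'ml': 75.0, 'finance': 40.0}
```

A time-based policy of the kind listed above could be approximated in this model by swapping in a different set of share ratios at scheduled times and recomputing ownership; the sketch leaves that out.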