Automated Management of Collections of Autonomic Systems

Thomas J. Glazier, D. Garlan, B. Schmerl
Published in: 2020 IEEE International Conference on Autonomic Computing and Self-Organizing Systems (ACSOS)
Publication date: 2020-08-01
DOI: 10.1109/ACSOS49614.2020.00029 (https://doi.org/10.1109/ACSOS49614.2020.00029)
Citations: 2

Abstract

Many applications have taken advantage of cloud-provided autonomic capabilities, commonly auto-scaling, to harness easily available compute capacity to maintain performance against defined quality objectives. This has caused the management complexity of enterprise applications to increase. It is now common for an application to be a collection of autonomic sub-systems. Combining individual autonomic systems to create an application can lead to behaviors that negatively impact the global aggregate utility of the application and in some cases can be conflicting and self-destructive. Commonly, human administrators address these behaviors as part of a design-time analysis of the situation or a run-time mitigation of the undesired effects. However, the task of controlling and mitigating undesirable behaviors is complex and error-prone. To handle the complexity of managing a collection of autonomic systems, we have previously proposed an automated approach to the creation of a higher-level autonomic management system, referred to as a Meta-Manager. In this paper, we improve upon prior work with a more streamlined and understandable formal representation of the approach, expand its capabilities to include global knowledge, and test its potential applicability and effectiveness by managing the complexity of a collection of autonomic systems in a case study of a major outage suffered by the Google Cloud Platform.