Aaron J. Elmore, Sudipto Das, A. Pucher, D. Agrawal, A. E. Abbadi, Xifeng Yan
{"title":"Characterizing tenant behavior for placement and crisis mitigation in multitenant DBMSs","authors":"Aaron J. Elmore, Sudipto Das, A. Pucher, D. Agrawal, A. E. Abbadi, Xifeng Yan","doi":"10.1145/2463676.2465308","DOIUrl":null,"url":null,"abstract":"A multitenant database management system (DBMS) in the cloud must continuously monitor the trade-off between efficient resource sharing among multiple application databases (tenants) and their performance. Considering the scale of \\attn{hundreds to} thousands of tenants in such multitenant DBMSs, manual approaches for continuous monitoring are not tenable. A self-managing controller of a multitenant DBMS faces several challenges. For instance, how to characterize a tenant given its variety of workloads, how to reduce the impact of tenant colocation, and how to detect and mitigate a performance crisis where one or more tenants' desired service level objective (SLO) is not achieved.\n We present Delphi, a self-managing system controller for a multitenant DBMS, and Pythia, a technique to learn behavior through observation and supervision using DBMS-agnostic database level performance measures. Pythia accurately learns tenant behavior even when multiple tenants share a database process, learns good and bad tenant consolidation plans (or packings), and maintains a pertenant history to detect behavior changes. Delphi detects performance crises, and leverages Pythia to suggests remedial actions using a hill-climbing search algorithm to identify a new tenant placement strategy to mitigate violating SLOs. Our evaluation using a variety of tenant types and workloads shows that Pythia can learn a tenant's behavior with more than 92% accuracy and learn the quality of packings with more than 86% accuracy. During a performance crisis, Delphi is able to reduce 99th percentile latencies by 80%, and can consolidate 45% more tenants than a greedy baseline, which balances tenant load without modeling tenant behavior.","PeriodicalId":87344,"journal":{"name":"Proceedings. ACM-SIGMOD International Conference on Management of Data","volume":"9 1","pages":"517-528"},"PeriodicalIF":0.0000,"publicationDate":"2013-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"50","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings. ACM-SIGMOD International Conference on Management of Data","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2463676.2465308","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 50
Abstract
A multitenant database management system (DBMS) in the cloud must continuously monitor the trade-off between efficient resource sharing among multiple application databases (tenants) and their performance. Considering the scale of \attn{hundreds to} thousands of tenants in such multitenant DBMSs, manual approaches for continuous monitoring are not tenable. A self-managing controller of a multitenant DBMS faces several challenges. For instance, how to characterize a tenant given its variety of workloads, how to reduce the impact of tenant colocation, and how to detect and mitigate a performance crisis where one or more tenants' desired service level objective (SLO) is not achieved.
We present Delphi, a self-managing system controller for a multitenant DBMS, and Pythia, a technique to learn behavior through observation and supervision using DBMS-agnostic database level performance measures. Pythia accurately learns tenant behavior even when multiple tenants share a database process, learns good and bad tenant consolidation plans (or packings), and maintains a pertenant history to detect behavior changes. Delphi detects performance crises, and leverages Pythia to suggests remedial actions using a hill-climbing search algorithm to identify a new tenant placement strategy to mitigate violating SLOs. Our evaluation using a variety of tenant types and workloads shows that Pythia can learn a tenant's behavior with more than 92% accuracy and learn the quality of packings with more than 86% accuracy. During a performance crisis, Delphi is able to reduce 99th percentile latencies by 80%, and can consolidate 45% more tenants than a greedy baseline, which balances tenant load without modeling tenant behavior.