Storage device performance prediction with CART models
Mengzhi Wang, Kinman Au, A. Ailamaki, A. Brockwell, C. Faloutsos, G. Ganger
Performance, vol. 33, no. 1, pp. 588-595. Published 2004-06-01. Journal Article.
DOI: 10.1145/1005686.1005743 (https://doi.org/10.1145/1005686.1005743)
Citations: 175
Abstract
Storage device performance prediction is a key element of self-managed storage systems. This paper explores the application of a machine learning tool, CART (classification and regression tree) models, to storage device modeling. Our approach predicts a device's performance as a function of input workloads and requires no knowledge of the device internals. We propose two uses of CART models: one predicts per-request response times and then derives aggregate values; the other predicts aggregate values directly from workload characteristics. After being trained on the device in question, both provide accurate black-box models across a range of test traces from real environments. Experiments show that these models predict the average and 90th percentile response times with a relative error as low as 19% when the training workloads are similar to the testing workloads, and that they interpolate well across different workloads.
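The first use described in the abstract (predict per-request response times, then derive aggregate values) can be illustrated with a minimal sketch. This is not the authors' implementation: the tree is a small hand-written regression tree with squared-error splits, the trace is synthetic, and the feature names (`size_kb`, `seek`), the response-time formula, the tree parameters, and the nearest-rank percentile are all assumptions made for illustration only.

```python
import math
import random

def sse(ys):
    """Sum of squared deviations from the mean (regression-tree impurity)."""
    m = sum(ys) / len(ys)
    return sum((v - m) ** 2 for v in ys)

def fit(X, y, depth=0, max_depth=5, min_leaf=10):
    """Grow a regression tree: a leaf is a float, an inner node is a tuple."""
    if depth == max_depth or len(y) < 2 * min_leaf:
        return sum(y) / len(y)
    best = None  # (error, feature index, threshold)
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X})[1:]:
            left = [v for row, v in zip(X, y) if row[f] < t]
            right = [v for row, v in zip(X, y) if row[f] >= t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            err = sse(left) + sse(right)
            if best is None or err < best[0]:
                best = (err, f, t)
    if best is None:
        return sum(y) / len(y)
    _, f, t = best
    lo = [(row, v) for row, v in zip(X, y) if row[f] < t]
    hi = [(row, v) for row, v in zip(X, y) if row[f] >= t]
    return (f, t,
            fit([r for r, _ in lo], [v for _, v in lo], depth + 1, max_depth, min_leaf),
            fit([r for r, _ in hi], [v for _, v in hi], depth + 1, max_depth, min_leaf))

def predict(tree, row):
    """Walk from the root to a leaf and return the leaf's mean response time."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] < t else right
    return tree

def percentile(values, q):
    """Nearest-rank percentile (an assumed definition, chosen for simplicity)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]

def synthetic_trace(n, rng):
    """Hypothetical workload: request size and seek distance drive latency."""
    X, y = [], []
    for _ in range(n):
        size_kb = rng.choice([4, 8, 16, 32, 64])
        seek = rng.randint(0, 100)
        resp_ms = 0.5 + 0.05 * size_kb + 0.08 * seek + rng.gauss(0, 0.2)
        X.append((size_kb, seek))
        y.append(max(0.1, resp_ms))
    return X, y

rng = random.Random(0)
train_X, train_y = synthetic_trace(400, rng)
test_X, test_y = synthetic_trace(200, rng)

# Train on the "device", predict per request, then derive aggregate values.
tree = fit(train_X, train_y)
pred = [predict(tree, row) for row in test_X]

actual_mean, pred_mean = sum(test_y) / len(test_y), sum(pred) / len(pred)
actual_p90, pred_p90 = percentile(test_y, 90), percentile(pred, 90)

# Relative error of each aggregate: |predicted - actual| / actual.
mean_err = abs(pred_mean - actual_mean) / actual_mean
p90_err = abs(pred_p90 - actual_p90) / actual_p90
print(f"mean relative error: {mean_err:.3f}, p90 relative error: {p90_err:.3f}")
```

The second use in the abstract would instead train on one row per workload interval, with workload characteristics as features and the aggregate value as the target. Because the synthetic test trace here is drawn from the same distribution as the training trace, the sketch only exercises the "similar workloads" case.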