Yunqi Zhang, M. Laurenzano, Jason Mars, Lingjia Tang
{"title":"SMiTe: Precise QoS Prediction on Real-System SMT Processors to Improve Utilization in Warehouse Scale Computers","authors":"Yunqi Zhang, M. Laurenzano, Jason Mars, Lingjia Tang","doi":"10.1109/MICRO.2014.53","DOIUrl":null,"url":null,"abstract":"One of the key challenges for improving efficiency in warehouse scale computers (WSCs) is to improve server utilization while guaranteeing the quality of service (QoS) of latency-sensitive applications. To this end, prior work has proposed techniques to precisely predict performance and QoS interference to identify 'safe' application co-locations. However, such techniques are only applicable to resources shared across cores. Achieving such precise interference prediction on real-system simultaneous multithreading (SMT) architectures has been a significantly challenging open problem due to the complexity introduced by sharing resources within a core. In this paper, we demonstrate through a real-system investigation that the fundamental difference between resource sharing behaviors on CMP and SMT architectures calls for a redesign of the way we model interference. For SMT servers, the interference on different shared resources, including private caches, memory ports, as well as integer and floating-point functional units, do not correlate with each other. This insight suggests the necessity of decoupling interference into multiple resource sharing dimensions. In this work, we propose SMiTe, a methodology that enables precise performance prediction for SMT co-location on real-system commodity processors. With a set of Rulers, which are carefully designed software stressors that apply pressure to a multidimensional space of shared resources, we quantify application sensitivity and contentiousness in a decoupled manner. We then establish a regression model to combine the sensitivity and contentiousness in different dimensions to predict performance interference. Using this methodology, we are able to precisely predict the performance interference in SMT co-location with an average error of 2.80% on SPEC CPU2006 and 1.79% on Cloud Suite. Our evaluation shows that SMiTe allows us to improve the utilization of WSCs by up to 42.57% while enforcing an application's QoS requirements.","PeriodicalId":6591,"journal":{"name":"2014 47th Annual IEEE/ACM International Symposium on Microarchitecture","volume":"267 1","pages":"406-418"},"PeriodicalIF":0.0000,"publicationDate":"2014-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"128","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 47th Annual IEEE/ACM International Symposium on Microarchitecture","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MICRO.2014.53","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 128
Abstract
One of the key challenges for improving efficiency in warehouse scale computers (WSCs) is to improve server utilization while guaranteeing the quality of service (QoS) of latency-sensitive applications. To this end, prior work has proposed techniques to precisely predict performance and QoS interference to identify 'safe' application co-locations. However, such techniques are only applicable to resources shared across cores. Achieving such precise interference prediction on real-system simultaneous multithreading (SMT) architectures has been a significantly challenging open problem due to the complexity introduced by sharing resources within a core. In this paper, we demonstrate through a real-system investigation that the fundamental difference between resource sharing behaviors on CMP and SMT architectures calls for a redesign of the way we model interference. For SMT servers, the interference on different shared resources, including private caches, memory ports, as well as integer and floating-point functional units, do not correlate with each other. This insight suggests the necessity of decoupling interference into multiple resource sharing dimensions. In this work, we propose SMiTe, a methodology that enables precise performance prediction for SMT co-location on real-system commodity processors. With a set of Rulers, which are carefully designed software stressors that apply pressure to a multidimensional space of shared resources, we quantify application sensitivity and contentiousness in a decoupled manner. We then establish a regression model to combine the sensitivity and contentiousness in different dimensions to predict performance interference. Using this methodology, we are able to precisely predict the performance interference in SMT co-location with an average error of 2.80% on SPEC CPU2006 and 1.79% on Cloud Suite. Our evaluation shows that SMiTe allows us to improve the utilization of WSCs by up to 42.57% while enforcing an application's QoS requirements.