{"title":"Improving CC-NUMA performance using Instruction-based Prediction","authors":"S. Kaxiras, J. Goodman","doi":"10.1109/HPCA.1999.744359","DOIUrl":null,"url":null,"abstract":"We propose Instruction-based Prediction as a means to optimize directory based cache coherent NUMA shared memory. Instruction-based prediction is based on observing the behavior of load and store instructions in relation to coherent events and predicting their future behavior. Although this technique is well established in the uniprocessor world, it has not been widely applied for optimizing transparent shared memory. Typically, in this environment, prediction is based on data block access history (address based prediction) in the form of adaptive cache coherence protocols. The advantage of instruction-based prediction is that it requires few hardware resources in the form of small prediction structures per node to match (or exceed) the performance of address based prediction. To show the potential of instruction-based prediction we propose and evaluate three different optimizations: i) a migratory sharing optimization, ii) a wide sharing optimization, and iii) a producer consumer optimization based on speculative execution. With execution driven simulation and a set of nine benchmarks we show that i) for the first two optimizations, instruction-based prediction, using few predictor entries per node, outpaces address based schemes, and (ii) for the producer consumer optimization which uses speculative execution, low mis speculation rates show promise for performance improvements.","PeriodicalId":287867,"journal":{"name":"Proceedings Fifth International Symposium on High-Performance Computer Architecture","volume":"84 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1999-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"85","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings Fifth International Symposium on High-Performance Computer Architecture","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPCA.1999.744359","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 85
Abstract
We propose Instruction-based Prediction as a means to optimize directory based cache coherent NUMA shared memory. Instruction-based prediction is based on observing the behavior of load and store instructions in relation to coherent events and predicting their future behavior. Although this technique is well established in the uniprocessor world, it has not been widely applied for optimizing transparent shared memory. Typically, in this environment, prediction is based on data block access history (address based prediction) in the form of adaptive cache coherence protocols. The advantage of instruction-based prediction is that it requires few hardware resources in the form of small prediction structures per node to match (or exceed) the performance of address based prediction. To show the potential of instruction-based prediction we propose and evaluate three different optimizations: i) a migratory sharing optimization, ii) a wide sharing optimization, and iii) a producer consumer optimization based on speculative execution. With execution driven simulation and a set of nine benchmarks we show that i) for the first two optimizations, instruction-based prediction, using few predictor entries per node, outpaces address based schemes, and (ii) for the producer consumer optimization which uses speculative execution, low mis speculation rates show promise for performance improvements.