Data Prefetching On The HP PA-8000
Vatsa Santhanam, Edward H. Gornish, W. Hsu
Conference Proceedings. The 24th Annual International Symposium on Computer Architecture, 1997-06-01
DOI: 10.1145/264107.264208 (https://doi.org/10.1145/264107.264208)
Citations: 90
Abstract
Memory latency is a major issue for many modern microprocessor-based systems, including the Hewlett-Packard PA-8000. Because of the PA-8000's fast clock rate and wide issue capability, its cache misses are very expensive. The PA-8000 combines out-of-order execution with multiple outstanding memory requests to tolerate memory latency; however, this approach has its limitations. To reduce much of the memory latency penalty, the PA-8000 uses software-based data cache prefetching. In this paper, we discuss the implementation of the data prefetch generation algorithm in the Hewlett-Packard Precision Architecture (HP-PA) compiler. We present performance results for SPECfp95 on a PA-8000 system that show speedups of up to 100% due to data prefetching.
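To make the idea of software-based data prefetching concrete, the following is a minimal, generic sketch in C. It is not the HP-PA compiler's prefetch generation algorithm, which the abstract does not describe. It assumes a GCC- or Clang-compatible compiler that provides `__builtin_prefetch`. The function names and the `PREFETCH_DISTANCE` value are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical prefetch distance, in array elements. A real compiler would
   derive it from the miss latency and the estimated time per iteration. */
#define PREFETCH_DISTANCE 16

/* Baseline: sum an array with no prefetch hints. */
double sum_plain(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Same loop, with a read-prefetch hint issued PREFETCH_DISTANCE elements
   ahead of the element currently being consumed. */
double sum_prefetch(const double *a, size_t n)
{
    assert(a != NULL || n == 0);
    double s = 0.0;
    for (size_t i = 0; i < n; i++) {
        /* Only hint addresses that lie inside the array. */
        if (i + PREFETCH_DISTANCE < n)
            __builtin_prefetch(&a[i + PREFETCH_DISTANCE], 0, 3);
        s += a[i];
    }
    return s;
}
```

Both functions return the same result; the prefetch is only a hint and does not change program semantics. This sketch issues one hint per iteration for simplicity, whereas a compiler would typically aim for about one prefetch per cache line, for example by unrolling the loop.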