M. Pilla, Amarildo T. da Costa, F. França, B. Childers, M. Soffa
Proceedings. 15th Symposium on Computer Architecture and High Performance Computing, 2003-11-10. DOI: 10.1109/CAHPC.2003.1250319 (https://doi.org/10.1109/CAHPC.2003.1250319)
The limits of speculative trace reuse on deeply pipelined processors
Trace reuse improves processor performance by skipping the execution of sequences of redundant instructions. However, many reusable traces do not have all of their inputs ready by the time the reuse test is performed. For these cases, we developed a new technique called reuse through speculation on traces (RST), in which trace inputs may be predicted. We study the limits of RST for modern processors with deep pipelines, as well as the effects of constraining resources on performance. We show that our approach reuses more traces than the nonspeculative trace reuse technique, with speedups of 43% over nonspeculative trace reuse and 57% when memory accesses are reused.
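The difference between a nonspeculative reuse test and a speculative one can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware mechanism evaluated in the paper: the `Trace` record, the `reuse_test` and `validate` functions, and the register names are all hypothetical, and the model assumes that a missing input is predicted to equal the value stored with the trace.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Trace:
    """Hypothetical memoized trace: an input context mapped to its results."""
    start_pc: int
    next_pc: int
    inputs: Dict[str, int]   # register -> value the trace was recorded with
    outputs: Dict[str, int]  # register -> value the trace produced


def reuse_test(trace: Trace, ready_regs: Dict[str, int],
               speculate: bool = False) -> str:
    """Classify a candidate trace as 'reuse', 'speculative', or 'miss'.

    A register whose value is not yet available is absent from ready_regs.
    """
    missing = False
    for reg, recorded in trace.inputs.items():
        if reg not in ready_regs:
            missing = True       # input not ready at reuse-test time
        elif ready_regs[reg] != recorded:
            return "miss"        # a known input differs: never reusable
    if not missing:
        return "reuse"           # all inputs ready and equal
    # Assumption: predict that each missing input equals the recorded value.
    return "speculative" if speculate else "miss"


def validate(trace: Trace, final_regs: Dict[str, int]) -> bool:
    """Check a speculative reuse once every input value is known."""
    return all(final_regs.get(reg) == val for reg, val in trace.inputs.items())


trace = Trace(start_pc=0x100, next_pc=0x120,
              inputs={"r1": 5, "r2": 7}, outputs={"r3": 12})

print(reuse_test(trace, {"r1": 5, "r2": 7}))          # all inputs ready
print(reuse_test(trace, {"r1": 5}))                   # r2 not ready, no speculation
print(reuse_test(trace, {"r1": 5}, speculate=True))   # r2 predicted
print(validate(trace, {"r1": 5, "r2": 8}))            # prediction turns out wrong
```

In this model, the second call shows the case the abstract describes: the trace is reusable but one input is not ready, so a nonspeculative test has to give up. With speculation enabled, the trace is reused tentatively, and `validate` stands in for the later check that decides whether the speculated results can be kept or must be discarded.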