The Impact of the Register Update Unit Size on Multipath Execution
Chao-Chin Wu, Kuan-Chou Lai, En-Hao Liu, Jin-Yuan Chen
2007 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, 2007-09-24
DOI: 10.1109/PACRIM.2007.4313190 (https://doi.org/10.1109/PACRIM.2007.4313190)
Branch prediction is a key mechanism for boosting the performance of a superscalar processor. Although prediction accuracy keeps improving, mispredictions still cause significant performance losses in wide-issue, deeply pipelined superscalar processors. To address this problem, multipath execution has been proposed previously; it executes both paths whenever a low-confidence conditional branch is encountered. However, because instructions from different paths share a single register update unit (RUU), they are interleaved in the RUU. Consequently, when a conditional branch is resolved and the instructions on the wrong paths are squashed, the entries in the resulting holes cannot be reused until they are reclaimed at the commit stage. Since the RUU size is crucial to performance, it is worth asking how much performance can improve if squashed RUU entries are reused as soon as possible. We propose a simple mechanism that achieves this goal with very limited hardware resources, and we present preliminary simulation results.
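
The hole problem described in the abstract can be illustrated with a toy model. The sketch below is not the authors' mechanism, which the abstract does not detail; it only contrasts a baseline RUU, where squashed entries keep their slots until commit, with a hypothetical variant that frees them at squash time. The class `ToyRUU`, the flag `reuse_squashed`, and the helper `fill_and_squash` are names assumed for illustration, and the model ignores execution latency and completion status.

```python
from collections import deque


class ToyRUU:
    """Toy in-order RUU: allocate at the tail, reclaim at the head (commit)."""

    def __init__(self, size, reuse_squashed=False):
        self.size = size
        # Assumption: when True, squashed entries are freed immediately.
        # This stands in for *some* early-reuse scheme, not the paper's.
        self.reuse_squashed = reuse_squashed
        self.entries = deque()

    def free_slots(self):
        return self.size - len(self.entries)

    def dispatch(self, path_id):
        """Allocate one entry for an instruction on the given path."""
        if self.free_slots() == 0:
            return False  # RUU full: dispatch stalls
        self.entries.append({"path": path_id, "squashed": False})
        return True

    def squash(self, path_id):
        """Mark every entry on the wrong path as squashed."""
        for entry in self.entries:
            if entry["path"] == path_id:
                entry["squashed"] = True
        if self.reuse_squashed:
            # Hypothetical early reuse: drop the holes right away.
            self.entries = deque(e for e in self.entries if not e["squashed"])

    def commit(self):
        """Retire the head entry; in the baseline, holes are reclaimed here."""
        return self.entries.popleft() if self.entries else None


def fill_and_squash(reuse_squashed, size=8):
    """Interleave two paths until the RUU is full, then squash path 1."""
    ruu = ToyRUU(size, reuse_squashed=reuse_squashed)
    for i in range(size):
        ruu.dispatch(i % 2)  # paths 0 and 1 alternate, so entries interleave
    ruu.squash(1)
    return ruu.free_slots()


if __name__ == "__main__":
    print("baseline free slots:", fill_and_squash(False))
    print("early-reuse free slots:", fill_and_squash(True))
```

In this model the baseline still reports a full RUU after the squash, because the holes sit between live entries and are only released as the head advances, while the early-reuse variant recovers half of the slots at once.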