Implementation of branch delay in Superscalar processors by reducing branch penalties

2010 IEEE 2nd International Advance Computing Conference (IACC) Pub Date : 2010-03-01 DOI:10.1109/IADCC.2010.5423045

Rubina Khanna, S. Verma, R. Biswas, J. Singh


Citations: 6

Abstract



Branch prediction is crucial to maintaining high performance in modern superscalar processors. Today's superscalar processors achieve high performance by executing multiple independent instructions in parallel. One of the greatest impediments to the performance of wide-issue superscalar processors is the presence of conditional branches. Conditional branches can occur as frequently as one in every 5 or 6 instructions, leading to heavy misprediction penalties in superscalar architectures. The ideal speed-up of a superscalar processor is seldom achieved because of stalls and breaks in the execution stream. These interruptions are caused by data and control hazards, which degrade superscalar processor performance. A branch target buffer (BTB) can reduce the performance penalty of branches in a superscalar processor by predicting the path of the branch and caching information used by the branch. No stalls are encountered if the branch entry is found in the BTB and the prediction is correct. Otherwise, the penalty is at least 2 cycles. This paper proposes an algorithm for superscalar processors based on changing the BTB structure to eliminate the misprediction penalty. It also highlights a problem in the previous BTB algorithm (the nested branches problem) and proposes a solution to it.
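The hit-or-stall behaviour described in the abstract can be illustrated with a minimal sketch. This is not the algorithm the paper proposes: it is a conventional direct-mapped BTB with a 1-bit last-direction predictor, and the names `BTBEntry`, `BranchTargetBuffer`, `total_stalls`, the modulo indexing, and the fixed 2-cycle cost are all assumptions made for illustration. Following the abstract's wording literally, any lookup that is not both a hit and a correct prediction is charged the penalty.

```python
from dataclasses import dataclass, field

# Assumption: the abstract says "at least 2 cycles"; the sketch uses exactly 2.
MISS_PENALTY_CYCLES = 2


@dataclass
class BTBEntry:
    tag: int      # address of the branch that owns this slot
    target: int   # next-instruction address recorded on the last execution
    taken: bool   # last observed direction (1-bit predictor, an assumption)


@dataclass
class BranchTargetBuffer:
    size: int = 16
    entries: dict = field(default_factory=dict)  # slot index -> BTBEntry

    def _index(self, pc: int) -> int:
        # Hypothetical direct-mapped indexing: low bits of the branch address.
        return pc % self.size

    def access(self, pc: int, actual_taken: bool, actual_target: int) -> int:
        """Look up one branch, update the buffer, return the stall cycles."""
        entry = self.entries.get(self._index(pc))
        hit = entry is not None and entry.tag == pc
        correct = (
            hit
            and entry.taken == actual_taken
            and (not actual_taken or entry.target == actual_target)
        )
        # Always record the latest outcome for this branch.
        self.entries[self._index(pc)] = BTBEntry(pc, actual_target, actual_taken)
        return 0 if correct else MISS_PENALTY_CYCLES


def total_stalls(btb: BranchTargetBuffer, trace) -> int:
    """Sum stall cycles over a trace of (pc, taken, target) tuples."""
    return sum(btb.access(pc, taken, target) for pc, taken, target in trace)


if __name__ == "__main__":
    # A loop branch at 0x40: taken four times, then falls through once.
    loop_trace = [(0x40, True, 0x10)] * 4 + [(0x40, False, 0x44)]
    print(total_stalls(BranchTargetBuffer(), loop_trace))
```

For the loop trace shown, the first access misses in the empty buffer and the final fall-through is mispredicted, so the sketch charges 4 stall cycles in total; the three accesses in between hit with a correct prediction and cost nothing. Two branches whose addresses map to the same slot evict each other on every access, which is one simple way a BTB of this shape loses its benefit.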

