纠正错误路径

IF 1.4 3区计算机科学 Q4 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

IEEE Computer Architecture Letters Pub Date : 2025-07-23 DOI:10.1109/LCA.2025.3542809

Bhargav Reddy Godala;Sankara Prasad Ramesh;Krishnam Tibrewala;Chrysanthos Pepi;Gino Chacon;Svilen Kanev;Gilles A. Pokam;Alberto Ros;Daniel A. Jiménez;Paul V. Gratz;David I. August

{"title":"纠正错误路径","authors":"Bhargav Reddy Godala;Sankara Prasad Ramesh;Krishnam Tibrewala;Chrysanthos Pepi;Gino Chacon;Svilen Kanev;Gilles A. Pokam;Alberto Ros;Daniel A. Jiménez;Paul V. Gratz;David I. August","doi":"10.1109/LCA.2025.3542809","DOIUrl":null,"url":null,"abstract":"Modern OOO CPUs have very deep pipelines with large branch misprediction recovery penalties. Speculatively executed instructions on the wrong path can significantly change cache state, depending on speculation levels. Architects often employ trace-driven simulation models in the design exploration stage, which sacrifice precision for speed. Trace-driven simulators are orders of magnitude faster than execution-driven models, reducing the often hundreds of thousands of simulation hours needed to explore new micro-architectural ideas. Despite the strong benefits of trace-driven simulation, it often fails to adequately model the consequences of wrong-path execution because obtaining such traces from real systems is nontrivial. Prior works exclusively consider either pollution or prefetching in the instruction stream/L1-I cache and often ignore the impact on the data stream. Here, we examine wrong path execution in simulation results and design a set of infrastructure for enabling wrong-path execution in a trace driven simulator. Our analysis shows the wrong path affects structures on both the instruction and data sides extensively, resulting in performance variations ranging from <inline-formula><tex-math>$-3.05$</tex-math></inline-formula>% to 20.9% versus ignoring wrong path. To benefit the research community and enhance the accuracy of simulators, we opened our traces and tracing utility in the hopes that industry can provide wrong-path traces generated by their internal simulators, enabling academic simulation without exposing industry IP.","PeriodicalId":51248,"journal":{"name":"IEEE Computer Architecture Letters","volume":"24 2","pages":"221-224"},"PeriodicalIF":1.4000,"publicationDate":"2025-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Correct Wrong Path\",\"authors\":\"Bhargav Reddy Godala;Sankara Prasad Ramesh;Krishnam Tibrewala;Chrysanthos Pepi;Gino Chacon;Svilen Kanev;Gilles A. Pokam;Alberto Ros;Daniel A. Jiménez;Paul V. Gratz;David I. August\",\"doi\":\"10.1109/LCA.2025.3542809\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Modern OOO CPUs have very deep pipelines with large branch misprediction recovery penalties. Speculatively executed instructions on the wrong path can significantly change cache state, depending on speculation levels. Architects often employ trace-driven simulation models in the design exploration stage, which sacrifice precision for speed. Trace-driven simulators are orders of magnitude faster than execution-driven models, reducing the often hundreds of thousands of simulation hours needed to explore new micro-architectural ideas. Despite the strong benefits of trace-driven simulation, it often fails to adequately model the consequences of wrong-path execution because obtaining such traces from real systems is nontrivial. Prior works exclusively consider either pollution or prefetching in the instruction stream/L1-I cache and often ignore the impact on the data stream. Here, we examine wrong path execution in simulation results and design a set of infrastructure for enabling wrong-path execution in a trace driven simulator. Our analysis shows the wrong path affects structures on both the instruction and data sides extensively, resulting in performance variations ranging from <inline-formula><tex-math>$-3.05$</tex-math></inline-formula>% to 20.9% versus ignoring wrong path. To benefit the research community and enhance the accuracy of simulators, we opened our traces and tracing utility in the hopes that industry can provide wrong-path traces generated by their internal simulators, enabling academic simulation without exposing industry IP.\",\"PeriodicalId\":51248,\"journal\":{\"name\":\"IEEE Computer Architecture Letters\",\"volume\":\"24 2\",\"pages\":\"221-224\"},\"PeriodicalIF\":1.4000,\"publicationDate\":\"2025-07-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Computer Architecture Letters\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/11095315/\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Computer Architecture Letters","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/11095315/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}

引用次数: 0

摘要

现代的OOO cpu具有非常深的管道，具有很大的分支错误预测恢复惩罚。根据推测级别，在错误路径上推测执行的指令可能会显著改变缓存状态。架构师经常在设计探索阶段使用跟踪驱动的仿真模型，这种模型为了速度而牺牲了精度。跟踪驱动的模拟器比执行驱动的模型要快几个数量级，从而减少了探索新的微架构思想所需的数十万小时的模拟时间。尽管跟踪驱动的模拟有很大的好处，但它经常不能充分地模拟错误路径执行的后果，因为从实际系统中获得这样的跟踪是非常重要的。以前的工作只考虑指令流/L1-I缓存中的污染或预取，而经常忽略对数据流的影响。在这里，我们将检查模拟结果中的错误路径执行，并设计一组基础设施，以便在跟踪驱动的模拟器中启用错误路径执行。我们的分析表明，错误路径对指令和数据端的结构都有广泛的影响，与忽略错误路径相比，导致性能变化从-3.05 %到20.9%不等。为了使研究界受益并提高模拟器的准确性，我们开放了我们的跟踪和跟踪实用程序，希望工业界可以提供由其内部模拟器生成的错误路径跟踪，从而在不暴露工业IP的情况下实现学术模拟。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Correct Wrong Path

Modern OOO CPUs have very deep pipelines with large branch misprediction recovery penalties. Speculatively executed instructions on the wrong path can significantly change cache state, depending on speculation levels. Architects often employ trace-driven simulation models in the design exploration stage, which sacrifice precision for speed. Trace-driven simulators are orders of magnitude faster than execution-driven models, reducing the often hundreds of thousands of simulation hours needed to explore new micro-architectural ideas. Despite the strong benefits of trace-driven simulation, it often fails to adequately model the consequences of wrong-path execution because obtaining such traces from real systems is nontrivial. Prior works exclusively consider either pollution or prefetching in the instruction stream/L1-I cache and often ignore the impact on the data stream. Here, we examine wrong path execution in simulation results and design a set of infrastructure for enabling wrong-path execution in a trace driven simulator. Our analysis shows the wrong path affects structures on both the instruction and data sides extensively, resulting in performance variations ranging from

$-3.05$

% to 20.9% versus ignoring wrong path. To benefit the research community and enhance the accuracy of simulators, we opened our traces and tracing utility in the hopes that industry can provide wrong-path traces generated by their internal simulators, enabling academic simulation without exposing industry IP.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

IEEE Computer Architecture Letters COMPUTER SCIENCE, HARDWARE & ARCHITECTURE-

CiteScore

4.60

自引率

4.30%

发文量

期刊介绍： IEEE Computer Architecture Letters is a rigorously peer-reviewed forum for publishing early, high-impact results in the areas of uni- and multiprocessor computer systems, computer architecture, microarchitecture, workload characterization, performance evaluation and simulation techniques, and power-aware computing. Submissions are welcomed on any topic in computer architecture, especially but not limited to: microprocessor and multiprocessor systems, microarchitecture and ILP processors, workload characterization, performance evaluation and simulation techniques, compiler-hardware and operating system-hardware interactions, interconnect architectures, memory and cache systems, power and thermal issues at the architecture level, I/O architectures and techniques, independent validation of previously published results, analysis of unsuccessful techniques, domain-specific processor architectures (e.g., embedded, graphics, network, etc.), real-time and high-availability architectures, reconfigurable systems.