Vidyasagar Nookala, Ying Chen, D. Lilja, S. Sapatnekar
Comparing simulation techniques for microarchitecture-aware floorplanning
2006 IEEE International Symposium on Performance Analysis of Systems and Software, 2006-03-19
DOI: 10.1109/ISPASS.2006.1620792 (https://doi.org/10.1109/ISPASS.2006.1620792)
Citations: 2
Abstract
Because simulating the reference input sets takes a long time, microarchitects resort to alternative techniques to speed up cycle-accurate simulation. However, the reduction in runtime comes with an associated loss of accuracy in replicating the characteristics of the reference sets. In addition, the effect of these inaccuracies on overall performance can vary across different microarchitecture optimizations or enhancements. In this work, we study and compare two such techniques, reduced input sets and statistical sampling, in the context of microarchitecture-aware floorplanning, a physical design stage in which the objective is to find an IPC-optimal global placement of the blocks of a microprocessor. The variation in IPC is due to the insertion of additional flip-flops on some across-chip wires of the processor that have multicycle delays in nanometer technology nodes. The objective of IPC-aware floorplanning is to minimize the amount of pipelining required by the system buses that are critical in determining system performance. Our results indicate that, although the two techniques exhibit contrasting behavior in quantifying the criticality of bus latencies, the ensuing floorplanning optimization results in almost identical performance improvements for both reduced input sets and sampling. The reason is that, for discrete optimization problems such as IPC-aware floorplanning, a reasonably accurate relative ordering of performance bottlenecks is sufficient; absolute accuracy is not necessary.
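
The closing point of the abstract, that relative ordering matters more than absolute accuracy, can be illustrated with a small sketch. This is not the authors' floorplanner: the bus names, the criticality numbers, and the simple "keep the top-k buses short" rule are all hypothetical, chosen only to show that two estimators with different absolute values but the same ranking lead to the same discrete decision.

```python
def rank_buses(criticality):
    """Return bus names ordered from most to least critical."""
    return sorted(criticality, key=criticality.get, reverse=True)


def choose_short_buses(criticality, budget):
    """Pick the `budget` most critical buses to keep free of extra flip-flops.

    Hypothetical stand-in for a floorplanning decision: only the ranking
    of the buses influences the result, not the absolute values.
    """
    return set(rank_buses(criticality)[:budget])


# Made-up IPC-loss estimates per extra cycle of bus latency.
# The two techniques disagree on magnitude but agree on ordering.
reduced_inputs = {
    "fetch-icache": 0.080,
    "alu-regfile": 0.120,
    "lsq-dcache": 0.050,
    "l2-dcache": 0.010,
}
sampling = {
    "fetch-icache": 0.031,
    "alu-regfile": 0.044,
    "lsq-dcache": 0.022,
    "l2-dcache": 0.004,
}

same_order = rank_buses(reduced_inputs) == rank_buses(sampling)
same_choice = choose_short_buses(reduced_inputs, 2) == choose_short_buses(sampling, 2)
print(same_order, same_choice)
```

In this toy setting the absolute estimates differ by more than a factor of two, yet both inputs select the same buses. A real floorplanner optimizes over block placements rather than picking buses directly, so this only conveys the intuition.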