The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings. Latest papers, page 2

Caches and hash trees for efficient memory integrity verification

The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings. Pub Date : 2003-02-08 DOI: 10.1109/HPCA.2003.1183547

B. Gassend, G. Suh, Dwaine E. Clarke, Marten van Dijk, S. Devadas

Citations: 282

Tradeoffs in buffering memory state for thread-level speculation in multiprocessors

The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings. Pub Date : 2003-02-08 DOI: 10.1109/HPCA.2003.1183537

M. Garzarán, Milos Prvulović, J. Llabería, V. Viñals, Lawrence Rauchwerger, J. Torrellas

Abstract: Thread-level speculation provides architectural support to aggressively run hard-to-analyze code in parallel. As speculative tasks run concurrently, they generate unsafe or speculative memory state that needs to be separately buffered and managed in the presence of distributed caches and buffers. Such state may contain multiple versions of the same variable. In this paper, we introduce a novel taxonomy of approaches to buffering and managing multi-version speculative memory state in multiprocessors. We also present a detailed complexity-benefit tradeoff analysis of the different approaches. Finally, we use numerical applications to evaluate the performance of the approaches under a single architectural framework. Our key insights are that support for buffering the state of multiple speculative tasks and versions per processor is more complexity-effective than support for merging the state of tasks with main memory lazily. Moreover, both supports can be gainfully combined and, in large machines, their effect is nearly fully additive. Finally, the more complex support for future state in main memory can boost performance when buffers are under pressure, but hurts performance when squashes are frequent.

Citations: 55

A statistically rigorous approach for improving simulation methodology

The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings. Pub Date : 2003-02-08 DOI: 10.1109/HPCA.2003.1183546

J. Yi, D. Lilja, D. Hawkins

Abstract: Due to cost, time, and flexibility constraints, simulators are often used to explore the design space when developing new processor architectures, as well as when evaluating the performance of new processor enhancements. However, despite this dependence on simulators, statistically rigorous simulation methodologies are not typically used in computer architecture research. A formal methodology can provide a sound basis for drawing conclusions gathered from simulation results by adding statistical rigor, and consequently, can increase confidence in the simulation results. This paper demonstrates the application of a rigorous statistical technique to the setup and analysis phases of the simulation process. Specifically, we apply a Plackett and Burman design to: (1) identify key processor parameters; (2) classify benchmarks based on how they affect the processor; and (3) analyze the effect of processor performance enhancements. Our technique expands on previous work by applying a statistical method to improve the simulation methodology instead of applying a statistical model to estimate the performance of the processor.
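As a rough illustration of the Plackett and Burman design named in the abstract above, the sketch below builds the standard 8-run design for up to 7 two-level factors and ranks the factors by the magnitude of their summed effect. The response function is synthetic and the names `pb_design_8`, `factor_effects`, and `synthetic_response` are hypothetical; this is not the authors' tooling or their parameter list.

```python
# Minimal sketch of an 8-run Plackett-Burman screening design (pure Python).
# Assumption: 7 two-level parameters, each set to a low (-1) or high (+1) value.

GENERATOR = [+1, +1, +1, -1, +1, -1, -1]  # standard first row for N = 8 runs


def pb_design_8():
    """Return the 8 x 7 design: 7 cyclic shifts of the generator plus an all-low row."""
    n = len(GENERATOR)
    rows = [GENERATOR[-k:] + GENERATOR[:-k] for k in range(n)]
    rows.append([-1] * n)
    return rows


def factor_effects(design, responses):
    """Effect of factor j: sum over runs of (level of j in the run) * (run's response)."""
    n_factors = len(design[0])
    return [
        sum(row[j] * y for row, y in zip(design, responses))
        for j in range(n_factors)
    ]


def synthetic_response(row):
    # Hypothetical stand-in for a simulator result: only factors 0 and 4 matter.
    return 10 + 3 * row[0] - 1 * row[4]


design = pb_design_8()
responses = [synthetic_response(row) for row in design]
effects = factor_effects(design, responses)

# Rank factors from most to least influential by absolute effect.
ranking = sorted(range(len(effects)), key=lambda j: abs(effects[j]), reverse=True)
```

Because the design columns are balanced and mutually orthogonal, eight runs are enough to separate the main effects of seven parameters, which is what makes this kind of design attractive when each run is an expensive simulation.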

Citations: 171

Just say no: benefits of early cache miss determination

The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings. Pub Date : 2003-02-08 DOI: 10.1109/HPCA.2003.1183548

G. Memik, Glenn D. Reinman, W. Mangione-Smith

Abstract: As the performance gap between the processor cores and the memory subsystem increases, designers are forced to develop new latency hiding techniques. Arguably, the most common technique is to utilize multi-level caches. Each new generation of processors is equipped with higher levels of memory hierarchy with increasing sizes at each level. In this paper, we propose 5 different techniques that will reduce the data access times and power consumption in processors with multi-level caches. Using the information about the blocks placed into and replaced from the caches, the techniques quickly determine whether an access at any cache level will be a miss. The accesses that are identified to miss are aborted. The structures used to recognize misses are much smaller than the cache structures. Consequently, the data access times and power consumption are reduced. Using the SimpleScalar simulator, we study the performance of these techniques for a processor with 5 cache levels. The best technique is able to abort 53.1% of the misses on average in SPEC2000 applications. Using these techniques, the execution time of the applications is reduced by up to 12.4% (5.4% on average), and the power consumption of the caches is reduced by as much as 11.6% (3.8% on average).
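One way to picture a structure that uses fill and replacement information to flag certain misses is a small table of counters indexed by a hash of the block address. The sketch below is an assumption made for illustration, not necessarily any of the five techniques in the paper; the class name `MissFilter`, the table size, and the 64-byte block size are all hypothetical.

```python
# Minimal sketch of a "definite miss" filter for one cache level.
# Assumption: a small counter table, updated whenever a block is placed
# into or replaced from the cache. A zero counter proves the block is absent;
# a non-zero counter only means "maybe present" because addresses can alias.


class MissFilter:
    def __init__(self, num_counters=64, block_bits=6):
        self.num_counters = num_counters
        self.block_bits = block_bits  # 2**6 = 64-byte blocks (hypothetical)
        self.counters = [0] * num_counters

    def _index(self, address):
        # Drop the block offset, then fold the block number into the table.
        return (address >> self.block_bits) % self.num_counters

    def on_fill(self, address):
        """Call when a block is placed into the cache."""
        self.counters[self._index(address)] += 1

    def on_evict(self, address):
        """Call when a block is replaced from the cache."""
        self.counters[self._index(address)] -= 1

    def is_definite_miss(self, address):
        """True only if no resident block maps to this counter."""
        return self.counters[self._index(address)] == 0


filt = MissFilter()
filt.on_fill(0x1000)
skip_lookup = filt.is_definite_miss(0x1040)  # different counter, nothing resident
```

The filter errs only on the safe side: it may fail to flag a miss when two addresses share a counter, but it never reports a miss for a block that is resident.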

Citations: 54

Dynamic data dependence tracking and its application to branch prediction

The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings. Pub Date : 2003-02-08 DOI: 10.1109/HPCA.2003.1183525

Lei Chen, S. Dropsho, D. Albonesi

Abstract: To continue to improve processor performance, microarchitects seek to increase the effective instruction level parallelism (ILP) that can be exploited in applications. A fundamental limit to improving ILP is data dependences among instructions. If data dependence information is available at run-time, there are many uses to improve ILP. Prior published examples include decoupled branch execution architectures and critical instruction detection. In this paper, we describe an efficient hardware mechanism to dynamically track the data dependence chains of the instructions in the pipeline. This information is available on a cycle-by-cycle basis to the microengine for optimizing its performance. We then use this design in a new value-based branch prediction design using available register value information (ARVI). From the use of data dependence information, the ARVI branch predictor has better prediction accuracy over a comparably sized hybrid branch predictor. With ARVI used as the second-level branch predictor, the improved prediction accuracy results in a 12.6% performance improvement on average across the SPEC95 integer benchmark suite.
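The idea of tracking the dependence chain of each in-flight instruction can be modeled in software as follows. This is a simplified sketch under stated assumptions, not the hardware mechanism described in the paper: the class `DependenceTracker`, its methods, and the register names are hypothetical, and sets stand in for whatever bit-level structure real hardware would use.

```python
# Minimal software model of dependence-chain tracking.
# Assumption: each register remembers the set of in-flight instruction ids
# that its current value transitively depends on.


class DependenceTracker:
    def __init__(self):
        self.chain = {}  # register name -> frozenset of in-flight instruction ids

    def issue(self, instr_id, dest, sources):
        """Record a new instruction; return the ids it transitively depends on."""
        deps = set()
        for src in sources:
            deps |= self.chain.get(src, frozenset())
        if dest is not None:
            # The destination now depends on this instruction and its whole chain.
            self.chain[dest] = frozenset(deps | {instr_id})
        return frozenset(deps)

    def retire(self, instr_id):
        """Remove a completed instruction from every chain."""
        for reg in list(self.chain):
            self.chain[reg] = self.chain[reg] - {instr_id}


tracker = DependenceTracker()
tracker.issue(0, "r1", [])                        # r1 = load ...
tracker.issue(1, "r2", ["r1"])                    # r2 = r1 + 1
tracker.issue(2, "r3", ["r2", "r1"])              # r3 = r2 * r1
branch_chain = tracker.issue(3, None, ["r3"])     # branch on r3
```

In this toy trace the branch depends on instructions 0, 1, and 2, which is the kind of information a value-based predictor could use to decide which register values are relevant to the branch outcome.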

Citations: 33

Performance enhancement techniques for InfiniBand™ Architecture

The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings. Pub Date : 2003-02-08 DOI: 10.1109/HPCA.2003.1183543

Eun Jung Kim, K. H. Yum, C. Das, Mazin S. Yousif, J. Duato

Abstract: The InfiniBand™ Architecture (IBA) is envisioned to be the default communication fabric for future system area networks (SAN). However, the released IBA specification outlines only higher level functionalities, leaving it open for exploring various design alternatives. In this paper we investigate four co-related techniques to provide high and predictable performance in IBA. These are: (i) using the shortest path first (SPF) algorithm for deterministic packet routing; (ii) developing a multipath routing mechanism for minimizing congestion; (iii) developing a selective packet dropping scheme to handle deadlock and congestion; and (iv) providing multicasting support for customized applications. These designs are evaluated using an integrated workload on a versatile IBA simulation testbed. Simulation results indicate that the SPF routing, multipath routing, packet dropping, and multicasting schemes are quite effective in delivering high and assured performance in clusters. One of the major contributions of this research is the IBA simulation testbed, which is an essential tool to evaluate various design tradeoffs.
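Shortest path first routing, item (i) in the abstract above, is commonly realized with Dijkstra's algorithm. The sketch below derives a deterministic forwarding table for one switch from a link-cost map. The function name `spf_forwarding_table`, the four-node topology, and the link costs are hypothetical, and nothing here models InfiniBand-specific details such as forwarding-table formats or virtual lanes.

```python
import heapq

# Minimal sketch of shortest-path-first routing using Dijkstra's algorithm.
# Assumption: links maps each node to {neighbor: positive link cost}.


def spf_forwarding_table(links, source):
    """Return (distance, next_hop) dicts describing the shortest paths from source."""
    dist = {source: 0}
    next_hop = {}
    done = set()
    heap = [(0, source, None)]  # (distance so far, node, first hop from source)
    while heap:
        d, node, hop = heapq.heappop(heap)
        if node in done:
            continue  # stale entry; a shorter path was already finalized
        done.add(node)
        if hop is not None:
            next_hop[node] = hop
        # Sorted iteration keeps tie-breaking deterministic.
        for nbr, cost in sorted(links.get(node, {}).items()):
            if nbr in done:
                continue
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                first = nbr if node == source else hop
                heapq.heappush(heap, (nd, nbr, first))
    return dist, next_hop


links = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 1, "D": 5},
    "C": {"A": 4, "B": 1, "D": 1},
    "D": {"B": 5, "C": 1},
}
dist, next_hop = spf_forwarding_table(links, "A")
```

Each destination maps to exactly one output neighbor, which is what makes the routing deterministic; a multipath scheme would instead keep several candidate next hops per destination.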

Citations: 26

Dynamic voltage scaling with links for power optimization of interconnection networks

The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings. Pub Date : 2003-02-08 DOI: 10.1109/HPCA.2003.1183527

L. Shang, L. Peh, N. Jha

Abstract: Originally developed to connect processors and memories in multicomputers, prior research and design of interconnection networks have focused largely on performance. As these networks get deployed in a wide range of new applications, where power is becoming a key design constraint, we need to seriously consider power efficiency in designing interconnection networks. As the demand for network bandwidth increases, communication links, already a significant consumer of power now, will take up an ever larger portion of total system power budget. In this paper we motivate the use of dynamic voltage scaling (DVS) for links, where the frequency and voltage of links are dynamically adjusted to minimize power consumption. We propose a history-based DVS policy that judiciously adjusts link frequencies and voltages based on past utilization. Our approach realizes up to 6.3× power savings (4.6× on average). This is accompanied by a moderate impact on performance (15.2% increase in average latency before network saturation and 2.5% reduction in throughput). To the best of our knowledge, this is the first study that targets dynamic power optimization of interconnection networks.
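A history-based policy of the kind the abstract describes can be sketched as a controller that smooths past link utilization and steps the link's frequency level up or down when the smoothed value crosses a threshold. Everything concrete below is an assumption for illustration: the class `HistoryDvsLink`, the four frequency levels, the smoothing weight, and the thresholds are hypothetical and are not the parameters or the exact policy from the paper.

```python
# Minimal sketch of a history-based DVS controller for one link.
# Assumption: voltage follows the chosen frequency level, so only the
# frequency level is modeled here.

LEVELS = [0.25, 0.5, 0.75, 1.0]  # hypothetical normalized link frequencies


class HistoryDvsLink:
    def __init__(self, weight=0.5, low=0.3, high=0.7):
        self.weight = weight            # weight given to the newest sample
        self.low = low                  # below this, step the link down
        self.high = high                # above this, step the link up
        self.level = len(LEVELS) - 1    # start at full speed
        self.avg_util = 1.0             # assume a busy link until history says otherwise

    def end_of_window(self, utilization):
        """Fold one window's utilization into the history and pick a level."""
        self.avg_util = (
            self.weight * utilization + (1 - self.weight) * self.avg_util
        )
        if self.avg_util > self.high and self.level < len(LEVELS) - 1:
            self.level += 1
        elif self.avg_util < self.low and self.level > 0:
            self.level -= 1
        return LEVELS[self.level]


link = HistoryDvsLink()
trace = [0.1, 0.1, 0.1, 0.1, 1.0, 1.0]  # idle period followed by a burst
chosen = [link.end_of_window(u) for u in trace]
```

Smoothing over history keeps the link from reacting to a single quiet or busy window, at the cost of responding a little late to a burst, which is consistent with the latency penalty the abstract reports for the real policy.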

Citations: 490

Exploring the VLSI scalability of stream processors

The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings. Pub Date : 2003-02-08 DOI: 10.1109/HPCA.2003.1183534

Brucek Khailany, W. Dally, S. Rixner, U. Kapasi, John Douglas Owens, Brian Towles

Citations: 74

Deterministic clock gating for microprocessor power reduction

The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings. Pub Date : 2003-02-08 DOI: 10.1109/HPCA.2003.1183529

Hai Helen Li, S. Bhunia, Yiran Chen, T. N. Vijaykumar, K. Roy

Citations: 110

Hierarchical backoff locks for nonuniform communication architectures

The Ninth International Symposium on High-Performance Computer Architecture, 2003. HPCA-9 2003. Proceedings. Pub Date : 2003-02-08 DOI: 10.1109/HPCA.2003.1183542

Z. Radovic, Erik Hagersten

Citations: 73