Proceedings of the 20th Annual International Symposium on Computer Architecture: Latest Publications

A Case For Two-way Skewed-associative Caches
Proceedings of the 20th Annual International Symposium on Computer Architecture Pub Date: 1993-05-01 DOI: 10.1109/ISCA.1993.698558
André Seznec
Abstract: We introduce a new organization for multi-bank caches: the skewed-associative cache. A two-way skewed-associative cache has the same hardware complexity as a two-way set-associative cache, yet simulations show that it typically exhibits the same hit ratio as a four-way set-associative cache of the same size; skewed-associative caches should therefore be preferred to set-associative caches. Until the last three years, external caches were used and their size could be relatively large. Previous studies have shown that, for cache sizes larger than 64 Kbytes, direct-mapped caches exhibit hit ratios nearly as good as set-associative caches at a lower hardware cost. Moreover, the cache hit time of a direct-mapped cache may be quite a bit smaller than that of a set-associative cache, because optimistic use of the data flowing out of the cache is quite natural. But now, microprocessors are designed with small on-chip caches, and the performance of low-end microprocessor systems depends heavily on cache behavior. Simulations show that using some associativity in on-chip caches boosts the performance of these low-end systems. When considering optimistic use of the data (or instructions) flowing out of the cache, the cache hit time of a two-way skewed-associative (or set-associative) cache is very close to that of a direct-mapped cache. Therefore two-way skewed-associative caches represent the best tradeoff for today's microprocessors with on-chip caches whose sizes are in the range of 4-8 Kbytes.
Citations: 278
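As an illustration of the organization described above, the following minimal Python sketch models a two-way skewed-associative lookup: each way indexes its frames with a different function of the block address, so two blocks that conflict in one way are unlikely to also conflict in the other. The cache geometry and the XOR-based index functions are assumptions made for the example, not the skewing functions defined in the paper.

```python
# Minimal sketch of a two-way skewed-associative cache (illustrative only).

NUM_SETS = 64          # frames per way (assumed geometry)
BLOCK_BITS = 5         # 32-byte blocks

def index_way0(block_addr):
    # conventional modulo indexing for way 0
    return block_addr % NUM_SETS

def index_way1(block_addr):
    # a different function of the same address bits for way 1, so blocks
    # that collide in way 0 usually map to distinct frames in way 1
    low = block_addr % NUM_SETS
    high = (block_addr // NUM_SETS) % NUM_SETS
    return low ^ high

class SkewedCache:
    def __init__(self):
        self.ways = [dict(), dict()]       # frame index -> block address, one dict per way

    def lookup(self, addr):
        block = addr >> BLOCK_BITS
        for w, index_fn in enumerate((index_way0, index_way1)):
            if self.ways[w].get(index_fn(block)) == block:
                return True                # hit in way w
        return False                       # miss: only two candidate frames were probed

    def insert(self, addr):
        block = addr >> BLOCK_BITS
        # place the block in whichever candidate frame is empty; a real
        # design would apply a replacement policy when both are occupied
        for w, index_fn in enumerate((index_way0, index_way1)):
            s = index_fn(block)
            if s not in self.ways[w]:
                self.ways[w][s] = block
                return
        self.ways[0][index_way0(block)] = block
```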
Register Connection: A New Approach To Adding Registers Into Instruction Set Architectures
Proceedings of the 20th Annual International Symposium on Computer Architecture Pub Date: 1993-05-01 DOI: 10.1109/ISCA.1993.698565
T. Kiyohara, S. Mahlke, William Y. Chen, Roger A. Bringmann, R. Hank, S. Anik, Wen-mei W. Hwu
Abstract: Code optimization and scheduling for superscalar and superpipelined processors often increase the register requirement of programs. For existing instruction sets with a small to moderate number of registers, this increased register requirement can be a factor that limits the effectiveness of the compiler. In this paper, we introduce a new architectural method for adding a set of extended registers into an architecture. Using a novel concept of connection, this method allows the data stored in the extended registers to be accessed by instructions that apparently reference core registers. Furthermore, we address the technical issues involved in applying the new method to an architecture: instruction set extension, procedure call convention, context switching considerations, upward compatibility, efficient implementation, compiler support, and performance. Experimental results based on a prototype compiler and execution-driven simulation show that the proposed method can significantly improve the performance of superscalar processors with a small or moderate number of registers.
Citations: 39
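The following toy model illustrates the connection concept described in the abstract: a connect operation redirects a core register name to an extended register, so instructions that apparently reference the core register actually access the extended one. The table-based model, register counts, and function names are assumptions for illustration, not the paper's ISA encoding.

```python
# Illustrative model of register connection (not the paper's actual ISA).

NUM_CORE = 32
NUM_EXT = 64

core = [0] * NUM_CORE
ext = [0] * NUM_EXT
# connection table: core register name -> ('core', index) or ('ext', index)
conn = [('core', i) for i in range(NUM_CORE)]

def connect(core_name, ext_index):
    conn[core_name] = ('ext', ext_index)

def disconnect(core_name):
    conn[core_name] = ('core', core_name)

def read(core_name):
    kind, idx = conn[core_name]
    return ext[idx] if kind == 'ext' else core[idx]

def write(core_name, value):
    kind, idx = conn[core_name]
    if kind == 'ext':
        ext[idx] = value
    else:
        core[idx] = value

connect(5, 40)        # r5 now names extended register 40
write(5, 123)         # an unmodified instruction writing r5 reaches ext[40]
assert ext[40] == 123 and core[5] == 0
```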
Architectural Support For Translation Table Management In Large Address Space Machines
Proceedings of the 20th Annual International Symposium on Computer Architecture Pub Date: 1993-05-01 DOI: 10.1109/ISCA.1993.698544
Jerome C. Huck, Jim Hays
Abstract: Virtual memory page translation tables provide mappings from virtual to physical addresses. When the hardware-controlled Translation Lookaside Buffers (TLBs) do not contain a translation, these tables provide it. Approaches to the structure and management of these tables vary from full hardware implementations to completely software-based algorithms. The size of the virtual address space used by processes is rapidly growing beyond 32 bits of address, and as the utilized address space increases, new problems and issues surface; traditional methods for managing the page translation tables are inappropriate for large address space architectures. The Hashed Page Table (HPT), described here, provides a very fast and space-efficient translation table that reduces overhead by splitting TLB management responsibilities between hardware and software. Measurements demonstrate its applicability to a diverse range of operating systems and workloads and, in particular, to large virtual address space machines. In simulations of over 4 billion instructions, improvements of 5 to 10% were observed.
Citations: 144
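To make the hashed-table idea concrete, here is a minimal sketch of an HPT-style lookup on a TLB miss: the virtual page number, combined with a space identifier, hashes to a bucket whose entries are searched for a match. The hash function, table size, and entry layout are illustrative assumptions, not the exact HPT format of the paper.

```python
# Illustrative hashed page table lookup (assumed layout, not the paper's format).

PAGE_SHIFT = 12
HPT_SIZE = 1 << 16                       # number of hash buckets (assumed)

hpt = [[] for _ in range(HPT_SIZE)]      # bucket: list of (space_id, vpn, frame)

def hpt_hash(vpn, space_id):
    # fold the space identifier into the hash so large sparse address
    # spaces spread evenly over the table
    return (vpn ^ space_id) % HPT_SIZE

def insert_mapping(vaddr, space_id, frame):
    vpn = vaddr >> PAGE_SHIFT
    hpt[hpt_hash(vpn, space_id)].append((space_id, vpn, frame))

def translate(vaddr, space_id):
    vpn = vaddr >> PAGE_SHIFT
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    for entry_space, entry_vpn, frame in hpt[hpt_hash(vpn, space_id)]:
        if entry_space == space_id and entry_vpn == vpn:
            return (frame << PAGE_SHIFT) | offset
    return None                          # translation fault: software supplies the mapping

insert_mapping(0x7fff2000, space_id=42, frame=0x1234)
assert translate(0x7fff2abc, 42) == (0x1234 << PAGE_SHIFT) | 0xabc
```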
Parity Logging Overcoming The Small Write Problem In Redundant Disk Arrays
Proceedings of the 20th Annual International Symposium on Computer Architecture Pub Date: 1993-05-01 DOI: 10.1145/165123.165143
Daniel Stodolsky, G. Gibson, M. Holland
Abstract: Parity-encoded redundant disk arrays provide highly reliable, cost-effective secondary storage with high performance for read accesses and large write accesses. Their performance on small writes, however, is much worse than that of mirrored disks, the traditional, highly reliable, but expensive organization for secondary storage. Unfortunately, small writes are a substantial portion of the I/O workload of many important, demanding applications such as on-line transaction processing. This paper presents parity logging, a novel solution to the small write problem for redundant disk arrays. Parity logging applies journalling techniques to substantially reduce the cost of small writes. We provide a detailed analysis of parity logging and competing schemes (mirroring, floating storage, and RAID level 5) and verify these models by simulation. Parity logging provides performance competitive with mirroring, the best of the alternative single-failure-tolerating disk array organizations, while its overhead cost is close to the minimum offered by RAID level 5. Finally, parity logging can exploit data caching much more effectively than all three alternative approaches.
Citations: 164
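The sketch below illustrates the parity-logging idea for a single stripe: a small write appends the XOR difference between old and new data to a log instead of updating parity in place, and the log is later applied to the parity block in one batch. Block contents are modeled as integers for brevity; a real array logs whole sectors and keeps the log on disk.

```python
# Illustrative sketch of parity logging for one parity stripe.

data_blocks = [0] * 4          # data blocks in the stripe
parity = 0                     # parity = XOR of all data blocks
parity_log = []                # journalled parity updates

def small_write(block_index, new_value):
    old_value = data_blocks[block_index]
    data_blocks[block_index] = new_value
    # record the XOR update image instead of read-modify-writing parity now
    parity_log.append(old_value ^ new_value)

def flush_parity_log():
    # apply all logged updates to the parity block in one efficient pass
    global parity
    for update in parity_log:
        parity ^= update
    parity_log.clear()

small_write(2, 0b1011)
small_write(0, 0b0110)
flush_parity_log()
assert parity == data_blocks[0] ^ data_blocks[1] ^ data_blocks[2] ^ data_blocks[3]
```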
Mechanisms For Cooperative Shared Memory
Proceedings of the 20th Annual International Symposium on Computer Architecture Pub Date: 1993-05-01 DOI: 10.1109/ISCA.1993.698554
D. Wood, S. Chandra, B. Falsafi, M. Hill, J. Larus, A. Lebeck, James C. Lewis, Shubhendu S. Mukherjee, Subbarao Palacharla, S. Reinhardt
Abstract: This paper explores the complexity of implementing directory protocols by examining their mechanisms: primitive operations on directories, caches, and network interfaces. We compare the following protocols: Dir1B, Dir4B, Dir4NB, DirnNB [2], Dir1SW [9], and an improved version of Dir1SW (Dir1SW+). The comparison shows that the mechanisms and mechanism sequencing of Dir1SW and Dir1SW+ are simpler than those of the other protocols. We also compare protocol performance by running eight benchmarks on 32-processor systems. Simulations show that Dir1SW+'s performance is comparable to more complex directory protocols. The significant disparity in hardware complexity and the small difference in performance argue that Dir1SW+ may be a more effective use of resources. The small performance difference is attributable to two factors: the low degree of sharing in the benchmarks and Check-In/Check-Out (CICO) directives [9].
Citations: 56
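For context on the mechanisms being compared, the sketch below shows a generic full-map directory entry (in the style of the DirnNB scheme the paper uses as a reference point), composed of the primitive operations a protocol sequences: presence-bit lookup, pointer update, and invalidation sends. It is an illustration of directory mechanisms in general, not the Dir1SW+ design.

```python
# Generic full-map directory entry sketch (illustrative, DirnNB-style).

NUM_PROCS = 32

class FullMapEntry:
    def __init__(self):
        self.present = [False] * NUM_PROCS   # which caches hold the block
        self.dirty = False                   # a single writable copy exists

    def handle_read(self, proc, send_msg):
        if self.dirty:
            owner = self.present.index(True)
            send_msg(owner, 'writeback')     # retrieve the modified copy first
            self.dirty = False
        self.present[proc] = True

    def handle_write(self, proc, send_msg):
        for p, has_copy in enumerate(self.present):
            if has_copy and p != proc:
                send_msg(p, 'invalidate')    # purge all other copies
                self.present[p] = False
        self.present[proc] = True
        self.dirty = True

msgs = []
entry = FullMapEntry()
entry.handle_read(3, lambda dst, m: msgs.append((dst, m)))
entry.handle_write(7, lambda dst, m: msgs.append((dst, m)))
assert msgs == [(3, 'invalidate')]
```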
A Comparison Of Adaptive Wormhole Routing Algorithms
Proceedings of the 20th Annual International Symposium on Computer Architecture Pub Date: 1993-05-01 DOI: 10.1109/ISCA.1993.698575
R. Boppana, S. Chalasani
Abstract: Improvement of message latency and network utilization in torus interconnection networks by increasing adaptivity in wormhole routing algorithms is studied. A recently proposed partially adaptive algorithm and four new fully adaptive routing algorithms are compared with the well-known e-cube algorithm for uniform, hotspot, and local traffic patterns. Our simulations indicate that the partially adaptive north-last algorithm, which causes unbalanced traffic in the network, performs worse than the nonadaptive e-cube routing algorithm for all three traffic patterns. Another result of our study is that performance does not necessarily improve with full adaptivity. In particular, a commonly discussed fully adaptive routing algorithm, which uses 2n virtual channels per physical channel of a k-ary n-cube, performs worse than e-cube for uniform and hotspot traffic patterns. The other three fully adaptive algorithms, which give priority to messages based on distance traveled, perform much better than the e-cube and partially adaptive algorithms for all three traffic patterns. One conclusion of this study is that adaptivity, full or partial, is not necessarily a benefit in wormhole routing.
Citations: 184
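As a baseline for the comparison above, the following sketch shows nonadaptive e-cube (dimension-ordered) routing on a k-ary 2-cube: a message corrects its offset fully in one dimension before moving to the next, which makes the algorithm simple and deadlock-free but inflexible. The radix and the minimal-direction choice on the wraparound links are assumptions of this sketch.

```python
# Illustrative e-cube (dimension-ordered) routing on a k-ary 2-cube.

K = 8   # radix of the torus in each dimension (assumed)

def next_hop(current, dest):
    """current, dest: (x, y) node coordinates; returns the next node on the route."""
    for dim in range(2):                        # fix dimension 0 first, then dimension 1
        if current[dim] != dest[dim]:
            dist = (dest[dim] - current[dim]) % K
            step = 1 if dist <= K // 2 else -1  # take the shorter way around the ring
            nxt = list(current)
            nxt[dim] = (current[dim] + step) % K
            return tuple(nxt)
    return current                              # already at the destination

assert next_hop((0, 0), (2, 3)) == (1, 0)       # x corrected before y
assert next_hop((2, 0), (2, 3)) == (2, 1)
```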
Limitations Of Cache Prefetching On A Bus-based Multiprocessor
Proceedings of the 20th Annual International Symposium on Computer Architecture Pub Date: 1993-05-01 DOI: 10.1145/165123.165163
D. Tullsen, S. Eggers
Abstract: Compiler-directed cache prefetching has the potential to hide much of the high memory latency seen by current and future high-performance processors. However, prefetching is not without costs, particularly on a multiprocessor. Prefetching can negatively affect bus utilization, overall cache miss rates, memory latencies, and data sharing. We simulated the effects of a particular compiler-directed prefetching algorithm running on a bus-based multiprocessor. We showed that, despite a high memory latency, this architecture is not very well suited for prefetching. For several variations on the architecture, speedups for five parallel programs were no greater than 39%, and degradations were as high as 7%, when prefetching was added to the workload. We examined the sources of cache misses in light of several different prefetching strategies and pinpointed the causes of the performance changes. Invalidation misses pose a particular problem for current compiler-directed prefetchers. We applied two techniques that reduced their impact: a special prefetching heuristic tailored to write-shared data, and restructuring shared data to reduce false sharing, thus allowing traditional prefetching algorithms to work well.
Citations: 77
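For readers unfamiliar with the mechanism being evaluated, the sketch below shows the effect of compiler-directed prefetching at the source level: the access in iteration i is preceded by a non-binding prefetch of the data needed a fixed number of iterations later, so the line arrives before it is used. The prefetch distance and the prefetch() stand-in are illustrative assumptions, not the specific algorithm simulated in the paper.

```python
# Illustrative loop with compiler-inserted prefetches (toy model).

def prefetch(array, index):
    # stand-in for a non-binding prefetch instruction: a real machine would
    # start moving array[index] toward the cache without blocking execution
    pass

def sum_array(a, distance=8):
    total = 0
    for i in range(len(a)):
        if i + distance < len(a):
            prefetch(a, i + distance)   # issued 'distance' iterations ahead of use
        total += a[i]
    return total

assert sum_array(list(range(100))) == sum(range(100))
```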
Cache Write Policies And Performance
Proceedings of the 20th Annual International Symposium on Computer Architecture Pub Date: 1993-05-01 DOI: 10.1109/ISCA.1993.698560
N. Jouppi
Abstract: This paper investigates issues involving writes and caches. First, tradeoffs on writes that miss in the cache are investigated: in particular, whether the missed cache block is fetched on a write miss, whether the missed cache block is allocated in the cache, and whether the cache line is written before hit or miss is known. Depending on the combination of these policies chosen, the overall cache miss rate can vary by a factor of two on some applications. The combination of no-fetch-on-write and write-allocate can provide better performance than cache line allocation instructions. Second, tradeoffs between write-through and write-back caching when writes hit in a cache are considered. A mixture of these two alternatives, called write caching, is proposed. Write caching places a small fully associative cache behind a write-through cache. A write cache can eliminate almost as much write traffic as a write-back cache.
Citations: 254
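The sketch below spells out the two write-miss policy axes the abstract refers to: whether the missing block is fetched (fetch-on-write) and whether a cache frame is allocated for it (write-allocate), shown configured for the no-fetch-on-write plus write-allocate combination the paper highlights. The toy cache structure and the callback names are assumptions made for illustration.

```python
# Illustrative write-miss handling under the two policy axes (toy model).

FETCH_ON_WRITE = False
WRITE_ALLOCATE = True          # the no-fetch-on-write / write-allocate combination

cache = {}                     # block address -> dict of valid words in the block

def handle_write_miss(block_addr, word_offset, value, read_block, write_word):
    """read_block and write_word stand in for the memory-system interface."""
    if WRITE_ALLOCATE:
        if FETCH_ON_WRITE:
            block = read_block(block_addr)      # bring the whole block in first
        else:
            # allocate the frame without fetching; only the written word is
            # valid so far (per-word valid bits are implied, not modeled)
            block = {}
        block[word_offset] = value
        cache[block_addr] = block               # later writes to this block now hit
    else:
        write_word(block_addr, word_offset, value)  # write around the cache

handle_write_miss(0x40, 2, 99,
                  read_block=lambda addr: {},
                  write_word=lambda addr, off, val: None)
assert cache[0x40][2] == 99
```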
Odd Memory Systems May Be Quite Interesting
Proceedings of the 20th Annual International Symposium on Computer Architecture Pub Date: 1993-05-01 DOI: 10.1109/ISCA.1993.698574
André Seznec, J. Lenfant
Abstract: Using a prime number N of memory banks on a vector processor allows conflict-free access for any slice of N consecutive elements of a vector stored with a stride that is not a multiple of N. To reject the use of a prime (or odd) number N of memory banks, it is generally argued that address computation for such a memory system would require a systematic Euclidean division by the number N. We first show that the well-known Chinese Remainder Theorem allows a very simple mapping of data onto the memory banks for which address computation does not require any Euclidean division. Massively parallel SIMD computers may have several thousand processors. When the memory on such a machine is globally shared, routing vectors from memory to the processors is a major difficulty; the control for the interconnection network generally cannot be computed at execution time. When the number of memory banks and processors is a product of prime numbers, the family of permutations needed for routing vectors from memory to the processors through the interconnection network has very specific properties. The Chinese Remainder Network presented in the paper is able to execute all these permutations in a single pass and may be self-routed.
Citations: 15
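The following worked sketch shows the Chinese Remainder Theorem mapping the abstract alludes to, under the assumption that the number of banks N is odd while the bank depth M is a power of two: address a goes to bank a mod N at local offset a mod M, and since gcd(N, M) = 1 the mapping is one-to-one over N*M addresses, while the local offset needs only a bit mask rather than a division by N.

```python
# Worked example of the CRT-based bank mapping (assumed parameters).

N = 7          # number of banks (odd / prime)
M = 16         # words per bank (power of two), so gcd(N, M) = 1

def map_address(a):
    bank = a % N
    offset = a & (M - 1)     # a mod M, computed by masking, no division by N
    return bank, offset

# the mapping is one-to-one over the whole address range [0, N*M)
assert len({map_address(a) for a in range(N * M)}) == N * M

# N consecutive elements accessed with a stride not a multiple of N
# fall into N distinct banks, hence the conflict-free access
stride = 3
banks = [(i * stride) % N for i in range(N)]
assert len(set(banks)) == N
```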
The Cedar System And An Initial Performance Study
Proceedings of the 20th Annual International Symposium on Computer Architecture DOI: 10.1145/285930.286005
D. Kuck, E. Davidson, D. Lawrie, A. Sameh, Chuanqi Zhu
Abstract: In this paper, we give an overview of the Cedar multiprocessor and present recent performance results. These include the performance of some computational kernels and the Perfect Benchmarks®. We also present a methodology for judging parallel system performance and apply this methodology to Cedar, Cray YMP-8, and Thinking Machines CM-5.
Citations: 49