Proceedings of the 20th Annual International Symposium on Computer Architecture: Latest Papers

Design Tradeoffs For Software-managed TLBs
Proceedings of the 20th Annual International Symposium on Computer Architecture Pub Date : 1994-08-01 DOI: 10.1109/ISCA.1993.698543
D. Nagle, R. Uhlig, Timothy J. Stanley, S. Sechrest, T. Mudge, Richard B. Brown
{"title":"Design Tradeoffs For Software-managed TLBs","authors":"D. Nagle, R. Uhlig, Timothy J. Stanley, S. Sechrest, T. Mudge, Richard B. Brown","doi":"10.1109/ISCA.1993.698543","DOIUrl":"https://doi.org/10.1109/ISCA.1993.698543","url":null,"abstract":"An increasing number of architectures provide virtual memory support through software-managed TLBs. However, software management can impose considerable penalties that are highly dependent on the operating system's structure and its use of virtual memory. This work explores software-managed TLB design tradeoffs and their interaction with a range of monolithic and microkernel operating systems. Through hardware monitoring and simulation, we explore TLB performance for benchmarks running on a MIPS R2000-based workstation running Ultrix, OSF/1, and three versions of Mach 3.0.","PeriodicalId":410022,"journal":{"name":"Proceedings of the 20th Annual International Symposium on Computer Architecture","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1994-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123656049","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 142
Hierarchical Performance Modeling With MACS: A Case Study Of The Convex C-240
Proceedings of the 20th Annual International Symposium on Computer Architecture Pub Date : 1993-05-01 DOI: 10.1109/ISCA.1993.698561
E. Boyd, E. Davidson
{"title":"Hierarchical Performance Modeling With MACS: A Case Study Of The Convex C-240","authors":"E. Boyd, E. Davidson","doi":"10.1109/ISCA.1993.698561","DOIUrl":"https://doi.org/10.1109/ISCA.1993.698561","url":null,"abstract":"The MACS performance model introduced here can be applied to a Machine and Application of interest, the Compiler-generated workload, and the Scheduling of the workload by the compiler. The Ma, MAC, and MACS bounds each fix the named subset of M, A, C, and S while freeing the bound from the constraints imposed by the others. A/X performance measurement is used to measure access-only and execute-only code performance. Such hierarchical performance modeling exposes the gaps between the various bounds, the A/X measurements, and the actual performance, thereby focusing performance optimization at the appropriate levels in a systematic and goal-directed manner. A simple, but detailed, case study of the Convex C-240 vector mini-supercomputer illustrates the method.","PeriodicalId":410022,"journal":{"name":"Proceedings of the 20th Annual International Symposium on Computer Architecture","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1993-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116769287","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 19
The Performance Of Cache-coherent Ring-based Multiprocessors
Proceedings of the 20th Annual International Symposium on Computer Architecture Pub Date : 1993-05-01 DOI: 10.1109/ISCA.1993.698567
L. Barroso, M. Dubois
{"title":"The Performance Of Cache-coherent Ring-based Multiprocessors","authors":"L. Barroso, M. Dubois","doi":"10.1109/ISCA.1993.698567","DOIUrl":"https://doi.org/10.1109/ISCA.1993.698567","url":null,"abstract":"Advances in circuit and integration technology are continuously boosting the speed of microprocessors. One of the main challenges presented by such developments is the effective use of powerful microprocessors in shared memory multiprocessor configurations. We believe that the interconnection problem is not solved even for small scale shared memory multiprocessors, since the speed of shared buses is unlikely to keep up with the bandwidth requirements of new microprocessors. In this paper we evaluate the performance of unidirectional slotted ring interconnection for small to medium scale shared memory systems, using a hybrid methodology of analytical models and trace-driven simulations. We evaluate both snooping and directory-based coherence protocols for the ring and compare it to high performance split transaction buses.","PeriodicalId":410022,"journal":{"name":"Proceedings of the 20th Annual International Symposium on Computer Architecture","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1993-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133957475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 69
Evaluation Of Release Consistent Software Distributed Shared Memory On Emerging Network Technology
S. Dwarkadas, P. Keleher, A. Cox, W. Zwaenepoel
{"title":"Evaluation Of Release Consistent Software Distributed Shared Memory On Emerging Network Technology","authors":"S. Dwarkadas, P. Keleher, A. Cox, W. Zwaenepoel","doi":"10.1145/165123.165150","DOIUrl":"https://doi.org/10.1145/165123.165150","url":null,"abstract":"We evaluate the effect of processor speed, network characteristics, and software overhead on the performance of release-consistent software distributed shared memory. We examine five different protocols for implementing release consistency: eager update, eager invalidate, lazy update, lazy invalidate, and a new protocol called lazy hybrid. This lazy hybrid protocol combines the benefits of both lazy update and lazy invalidate. Our simulations indicate that with the processors and networks that are becoming available, coarse-grained applications such as Jacobi and TSP perform well, more or less independent of the protocol used. Medium-grained applications, such as Water, can achieve good performance, but the choice of protocol is critical. For sixteen processors, the best protocol, lazy hybrid, performed more than three times better than the worst, the eager update. Fine-grained applications such as Cholesky achieve little speedup regardless of the protocol used because of the frequency of synchronization operations and the high latency involved. While the use of relaxed memory models, lazy implementations, and multiple-writer protocols has reduced the impact of false sharing, synchronization latency remains a serious problem for software distributed shared memory systems. These results suggest that the future work on software DSMs should concentrate on reducing the amount of synchronization or its effect.","PeriodicalId":410022,"journal":{"name":"Proceedings of the 20th Annual International Symposium on Computer Architecture","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1993-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127951813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 118
A Comparison Of Dynamic Branch Predictors That Use Two Levels Of Branch History
Proceedings of the 20th Annual International Symposium on Computer Architecture Pub Date : 1993-05-01 DOI: 10.1109/ISCA.1993.698566
Tse-Yu Yeh, Y. Patt
{"title":"A Comparison Of Dynamic Branch Predictors That Use Two Levels Of Branch History","authors":"Tse-Yu Yeh, Y. Patt","doi":"10.1109/ISCA.1993.698566","DOIUrl":"https://doi.org/10.1109/ISCA.1993.698566","url":null,"abstract":"Recent attention to speculative execution as a mechanism for increasing performance of single instruction streams has demanded substantially better branch prediction than what has been previously available. We [1,2] and Pan, So, and Rahmeh [4] have both proposed variations of the same aggressive dynamic branch predictor for handling those needs. We call the basic model Two-Level Adaptive Branch Prediction; Pan, So, and Rahmeh call it Correlation Branch Prediction. In this paper, we adopt the terminology of [2] and show that there are really nine variations of the same basic model. We compare the nine variations with respect to the amount of history information kept. We study the effects of different branch history lengths and pattern history table configurations. Finally, we evaluate the cost effectiveness of the nine variations.","PeriodicalId":410022,"journal":{"name":"Proceedings of the 20th Annual International Symposium on Computer Architecture","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1993-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126542211","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 419
Improving AP1000 Parallel Computer Performance With Message Communication
T. Horie, K. Hayashi, T. Shimizu, H. Ishihata
{"title":"Improving AP1000 Parallel Computer Performance With Message Communication","authors":"T. Horie, K. Hayashi, T. Shimizu, H. Ishihata","doi":"10.1145/165123.165168","DOIUrl":"https://doi.org/10.1145/165123.165168","url":null,"abstract":"The performance of message-passing applications depends on cpu speed, communication throughput and latency, and message handling overhead. In this paper we investigate the effect of varying these parameters and applying techniques to reduce message handling overhead on the execution efficiency of ten different applications. Using a message level simulator set up for the architecture of the AP1000, we showed that improving communication performance, especially message handling, improves total performance. If a cpu that is 32 times faster is provided, the total performance increases by less than ten times unless message handling overhead is reduced. Overlapping computation with message reception improves performance significantly. We also discuss how to improve the AP1000 architecture.","PeriodicalId":410022,"journal":{"name":"Proceedings of the 20th Annual International Symposium on Computer Architecture","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1993-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125293127","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 17
The Chinese Remainder Theorem And The Prime Memory System
Proceedings of the 20th Annual International Symposium on Computer Architecture Pub Date : 1993-05-01 DOI: 10.1109/ISCA.1993.698573
Qing-Qiang Gao
{"title":"The Chinese Remainder Theorem And The Prime Memory System","authors":"Qing-Qiang Gao","doi":"10.1109/ISCA.1993.698573","DOIUrl":"https://doi.org/10.1109/ISCA.1993.698573","url":null,"abstract":"As we know, the conflict problem is a very important problem in memory system of super computer, there are two kinds of conflict-free memory system approaches: skewing scheme approach and prime memory system approach. Previously published prime memory approaches are complex or wasting 1/p of the memory space for filling the “holes” [17], where p is the number of memory modules. In this paper, based on Chinese remainder theorem, we present a perfect prime memory system which only need to find the d Mod p without wasting any memory space and without computing the quotient.","PeriodicalId":410022,"journal":{"name":"Proceedings of the 20th Annual International Symposium on Computer Architecture","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1993-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131249350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 43
The Architecture Of A Fault-tolerant Cached RAID Controller
Proceedings of the 20th Annual International Symposium on Computer Architecture Pub Date : 1993-05-01 DOI: 10.1109/ISCA.1993.698547
J. Menon, Jim Cortney
{"title":"The Architecture Of A Fault-tolerant Cached RAID Controller","authors":"J. Menon, Jim Cortney","doi":"10.1109/ISCA.1993.698547","DOIUrl":"https://doi.org/10.1109/ISCA.1993.698547","url":null,"abstract":"RAID-5 arrays need 4 disk accesses to update a data block—2 to read old data and parity, and 2 to write new data and parity. Schemes previously proposed to improve the update performance of such arrays are the Log-Structured File System [10] and the Floating Parity Approach [6]. Here, we consider a third approach, called Fast Write, which eliminates disk time from the host response time to a write, by using a Non-Volatile Cache in the disk array controller. We examine three alternatives for handling Fast Writes and describe a hierarchy of destage algorithms with increasing robustness to failures. These destage algorithms are compared against those that would be used by a disk controller employing mirroring. We show that array controllers require considerably more (2 to 3 times more) bus bandwidth and memory bandwidth than do disk controllers that employ mirroring. So, array controllers that use parity are likely to be more expensive than controllers that do mirroring, though mirroring is more expensive when both controllers and disks are considered.","PeriodicalId":410022,"journal":{"name":"Proceedings of the 20th Annual International Symposium on Computer Architecture","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1993-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114008423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 96
Transactional Memory: Architectural Support For Lock-free Data Structures
Proceedings of the 20th Annual International Symposium on Computer Architecture Pub Date : 1993-05-01 DOI: 10.1109/ISCA.1993.698569
Maurice Herlihy, J. E. B. Moss
{"title":"Transactional Memory: Architectural Support For Lock-free Data Structures","authors":"Maurice Herlihy, J. E. B. Moss","doi":"10.1109/ISCA.1993.698569","DOIUrl":"https://doi.org/10.1109/ISCA.1993.698569","url":null,"abstract":"A shared data structure is lock-free if its operations do not require mutual exclusion. If one process is interrupted in the middle of an operation, other processes will not be prevented from operating on that object. In highly concurrent systems, lock-free data structures avoid common problems associated with conventional locking techniques, including priority inversion, convoying, and difficulty of avoiding deadlock. This paper introduces transactional memory, a new multiprocessor architecture intended to make lock-free synchronization as efficient (and easy to use) as conventional techniques based on mutual exclusion. Transactional memory allows programmers to define customized read-modify-write operations that apply to multiple, independently-chosen words of memory. It is implemented by straightforward extensions to any multiprocessor cache-coherence protocol. Simulation results show that transactional memory matches or outperforms the best known locking techniques for simple benchmarks, even in the absence of priority inversion, convoying, and deadlock.","PeriodicalId":410022,"journal":{"name":"Proceedings of the 20th Annual International Symposium on Computer Architecture","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1993-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123303955","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2560
Adaptive Cache Coherency For Detecting Migratory Shared Data
Proceedings of the 20th Annual International Symposium on Computer Architecture Pub Date : 1993-05-01 DOI: 10.1109/ISCA.1993.698549
A. Cox, R. Fowler
{"title":"Adaptive Cache Coherency For Detecting Migratory Shared Data","authors":"A. Cox, R. Fowler","doi":"10.1109/ISCA.1993.698549","DOIUrl":"https://doi.org/10.1109/ISCA.1993.698549","url":null,"abstract":"Parallel programs exhibit a small number of distinct data-sharing patterns. A common data-sharing pattern, migratory access, is characterized by exclusive read and write access by one processor at a time to a shared datum. We describe a family of adaptive cache coherency protocols that dynamically identify migratory shared data in order to reduce the cost of moving them. The protocols use a standard memory model and processor-cache interface. They do not require any compile-time or run-time software support. We describe implementations for bus-based multiprocessors and for shared-memory multiprocessors that use directory-based caches. These implementations are simple and would not significantly increase hardware cost. We use trace- and execution-driven simulation to compare the performance of the adaptive protocols to standard write-invalidate protocols. These simulations indicate that, compared to conventional protocols, the use of the adaptive protocol can almost halve the number of inter-node messages on some applications. Since cache coherency traffic represents a larger part of the total communication as cache size increases, the relative benefit of using the adaptive protocol also increases.","PeriodicalId":410022,"journal":{"name":"Proceedings of the 20th Annual International Symposium on Computer Architecture","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"1993-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126756251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 194