Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000: latest papers, page 4

Reducing ownership overhead for load-store sequences in cache-coherent multiprocessors

Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000 Pub Date : 2000-05-01 DOI: 10.1109/IPDPS.2000.846053

J. Nilsson, F. Dahlgren

Abstract: Parallel programs that modify shared data in a cache-coherent multiprocessor with a write-invalidate coherence protocol create ownership overhead in the form of ownership acquisitions at writes to shared data. This can have a significant impact on performance in a cache-coherent non-uniform memory architecture (NUMA) multiprocessor. By combining a read request and an ownership acquisition, the write latency and network traffic can potentially be reduced. In this paper we propose a new hardware-based approach for performing this optimization by targeting load-store sequences, which we show is a superset of migratory sharing. A load-store sequence consists of a global read request followed by a global write action to the same memory location from the same processor, without any intervening access to the same block from any other processor. We use detailed simulation with four benchmark programs, including one on-line transaction processing (OLTP) workload and operating system execution, to examine the effectiveness of the proposed technique. The results show that the technique is able to reduce write-related latency and network traffic more than previous hardware-based techniques, up to twice as much.
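
The load-store sequence defined in the abstract can be illustrated with a small trace scan. This is only a sketch under simplifying assumptions, not the hardware mechanism the paper proposes: every access in the trace is treated as global (no cache is modelled), addresses are already block addresses, and the trace format and the name `count_load_store_sequences` are hypothetical.

```python
from typing import Dict, List, Tuple

# One trace entry: (processor id, operation, block address).
# Operation is "R" for a read or "W" for a write. This format is an
# assumption made for illustration; it does not come from the paper.
Access = Tuple[int, str, int]


def count_load_store_sequences(trace: List[Access]) -> int:
    """Count reads that are followed by a write to the same block from the
    same processor, with no access to that block by another processor
    in between."""
    pending: Dict[int, int] = {}  # block -> processor with an uninterrupted read
    count = 0
    for proc, op, block in trace:
        if op == "R":
            # A read by a different processor is an intervening access, so it
            # replaces whatever read was pending on this block.
            pending[block] = proc
        elif op == "W":
            if pending.get(block) == proc:
                count += 1
            # Any write ends the pending read on this block.
            pending.pop(block, None)
    return count


trace = [
    (0, "R", 0x40), (0, "W", 0x40),  # load-store sequence on block 0x40
    (1, "R", 0x80), (0, "R", 0x80),  # processor 0 intervenes on block 0x80
    (1, "W", 0x80),                  # not a load-store sequence
]
print(count_load_store_sequences(trace))
```

In the first pair the read could in principle be combined with the ownership acquisition; in the second case the intervening read by processor 0 breaks the sequence.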

Citations: 8

Micro-architectures of high performance, multi-user system area network interface cards

Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000 Pub Date : 2000-05-01 DOI: 10.1109/IPDPS.2000.845959

B. S. Ang, Derek Chiou, L. Rudolph, Arvind

Citations: 2

Load balancing strategies for dense linear algebra kernels on heterogeneous two-dimensional grids

Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000 Pub Date : 2000-05-01 DOI: 10.1109/IPDPS.2000.846065

Olivier Beaumont, Vincent Boudet, F. Rastello, Y. Robert

Citations: 8

An optimal parallel algorithm for computing moments on arrays with reconfigurable optical buses

Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000 Pub Date : 2000-05-01 DOI: 10.1109/IPDPS.2000.846059

C. Wu, S. Horng, Jinn-Fu Lin, Horng-Ren Tsai, Tsrong-Lay Lin

Citations: 2

Exploration of the spatial locality on emerging applications and the consequences for cache performance

Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000 Pub Date : 2000-05-01 DOI: 10.1109/IPDPS.2000.845978

Martin Kämpe, F. Dahlgren

Citations: 13

Power-aware localized routing in wireless networks

Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000 Pub Date : 2000-05-01 DOI: 10.1109/IPDPS.2000.846008

I. Stojmenovic, Xu Lin

Citations: 934

De Bruijn isomorphisms and free space optical networks

Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000 Pub Date : 2000-05-01 DOI: 10.1109/IPDPS.2000.846063

D. Coudert, Afonso Ferreira, S. Pérennes

Citations: 6

A quantitative assessment of thread-level speculation techniques

Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000 Pub Date : 2000-05-01 DOI: 10.1109/IPDPS.2000.846040

P. Marcuello, Antonio González

Citations: 47

Image layer decomposition for distributed real-time rendering on clusters

Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000 Pub Date : 2000-05-01 DOI: 10.1109/IPDPS.2000.846015

Thu D. Nguyen, J. Zahorjan

Citations: 3

Monotonic counters: a new mechanism for thread synchronization

Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000 Pub Date : 2000-05-01 DOI: 10.1109/IPDPS.2000.846037

J. Thornley, K. Chandy

Abstract: Only a handful of fundamental mechanisms for synchronizing the access of concurrent threads to shared memory are widely implemented and used. These include locks, condition variables, semaphores, barriers, and monitors. In this paper, we introduce a new synchronization mechanism, monotonic counters, and make a case for its addition to this group. Unlike most other synchronization mechanisms, monotonic counters were designed primarily for multiprocessing, rather than for systems programming. Counters have a very simple definition: a counter object has a nonnegative value, an Increment operation, and a Check operation. Increment atomically increases the counter, and Check suspends until the counter reaches a specified level. We demonstrate that many practical thread synchronization patterns can be expressed more elegantly using counters than with other synchronization mechanisms. Of particular importance, the monotonicity of counters can be used to guarantee deterministic synchronization and the equivalence of multithreaded and sequential execution. In terms of implementation, counters are distinguished from traditional synchronization mechanisms in that they have a dynamically varying number of thread suspension queues. We give several examples of multithreaded programs that use counter synchronization, and give an implementation of counters on top of locks and condition variables.
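
The counter interface described in the abstract (a nonnegative value, an Increment operation, and a Check operation) can be sketched on top of a lock and a condition variable. This is a minimal illustration, not the authors' implementation: the class name `MonotonicCounter` and the `demo` function are hypothetical, and the sketch uses a single condition variable with `notify_all` rather than the dynamically varying set of suspension queues the abstract mentions.

```python
import threading


class MonotonicCounter:
    """Counter whose value only grows; check() blocks until a level is reached."""

    def __init__(self) -> None:
        self._value = 0
        self._cond = threading.Condition()  # condition variable with its own lock

    def increment(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a monotonic counter cannot decrease")
        with self._cond:
            self._value += amount
            # Wake every waiter; each one re-tests its own level.
            self._cond.notify_all()

    def check(self, level: int) -> None:
        with self._cond:
            # Returns immediately if the level has already been reached.
            self._cond.wait_for(lambda: self._value >= level)


def demo(n: int = 5) -> list:
    """The producer fills data[i]; the consumer reads data[i] only after the
    counter has reached i + 1, so the result matches sequential execution."""
    data = [None] * n
    out = []
    counter = MonotonicCounter()

    def producer() -> None:
        for i in range(n):
            data[i] = i * i
            counter.increment()

    def consumer() -> None:
        for i in range(n):
            counter.check(i + 1)
            out.append(data[i])

    threads = [threading.Thread(target=consumer), threading.Thread(target=producer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out


print(demo())
```

Because the value never decreases, a Check that has succeeded once stays satisfied, which is what makes the consumer's reads independent of thread scheduling in this example.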

Citations: 0