Efficient barrier synchronization techniques and their applications in large-scale shared memory multiprocessors

K. Ghose, D. Cheng
Journal of Microcomputer Applications, volume 9 3, pp. 197-221. Journal article, published 1994-04-01.
DOI: 10.1006/JMCA.1994.1012 (https://doi.org/10.1006/JMCA.1994.1012)
Citation count: 0

Abstract

Shared memory multiprocessors offer a relatively simple programming model and are suitable for a wide variety of parallel applications. Unfortunately, shared memory multiprocessors are not as scalable as distributed memory multiprocessors owing to memory and switch contentions that can result in the formation of hot spots. Spinning on synchronization variables appears to be the main culprit behind the formation of hot spots, affecting system scalability adversely. The purpose of this paper is to address the issue of performing barrier synchronization efficiently in large-scale shared memory multiprocessors. We propose a very simple design for a hardware barrier synchronizer that has the characteristics of what one would call an ideal barrier synchronizer. In particular, the proposed barrier synchronizer allows fast barrier synchronization without injecting spin traffic to create hot spots and can be reused as soon as it has completed a barrier synchronization. We also show that by augmenting this barrier synchronizer with a few gates, it can be used to perform dynamic barrier synchronization, where neither the number, nor the exact identity of processors participating in the barrier is known a priori. We will also show that a low-latency barrier synchronizer can be used not only for high-speed barrier synchronization but also, very profitably, for implementing software combining (allowing distributed hot spot accessing), for data and producer-consumer type synchronization and for the implementation of a variety of other useful applications. A high-speed barrier synchronizer can also be used to implement highly concurrent data structures and will also allow a MIMD (Multiple Instruction streams, Multiple Data streams) system to be effectively operated in a SIMD (Single Instruction stream, Multiple Data streams)-style mode, giving rise to a number of potential advantages. We use simulations to confirm that our proposed synchronizers and their applications outperform the existing barrier synchronization schemes.
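For context, the spin traffic the abstract refers to can be illustrated with a conventional software barrier. The sketch below is a centralized sense-reversing barrier, a well-known software scheme and not the hardware synchronizer the authors propose. The names `SenseReversingBarrier` and `run_demo` are hypothetical, and a lock stands in for the atomic fetch-and-decrement a real multiprocessor would use. Every waiting thread polls the same shared flag, which on a large machine is the kind of access pattern that can form a hot spot.

```python
import threading
import time


class SenseReversingBarrier:
    """Centralized software barrier; every waiter polls one shared flag."""

    def __init__(self, parties):
        self.parties = parties
        self.count = parties            # arrivals still outstanding in this episode
        self.sense = False              # shared flag that all waiters spin on
        self.lock = threading.Lock()    # stands in for an atomic fetch-and-decrement
        self.local = threading.local()  # holds each thread's private sense

    def wait(self):
        # Each thread flips its private sense once per barrier episode.
        my_sense = not getattr(self.local, "sense", False)
        self.local.sense = my_sense
        with self.lock:
            self.count -= 1
            last = self.count == 0
        if last:
            # The last arriver resets the counter, then releases the others.
            self.count = self.parties
            self.sense = my_sense
        else:
            # Repeated reads of one shared location: this is the spin traffic.
            while self.sense != my_sense:
                time.sleep(0)  # yield so other Python threads can make progress


def run_demo(num_threads=4, rounds=3):
    """Run several barrier episodes and return the order of logged round numbers."""
    barrier = SenseReversingBarrier(num_threads)
    log = []
    log_lock = threading.Lock()

    def worker():
        for r in range(rounds):
            with log_lock:
                log.append(r)
            barrier.wait()  # nobody starts round r + 1 until all have logged round r

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

Because the sense flag flips on every episode, the barrier can be reused immediately, which is one of the properties the abstract asks of an ideal synchronizer. The polling loop is the part a hardware synchronizer of the kind described would make unnecessary.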