Efficient and scalable barrier synchronization for many-core CMPs

Proceedings of the 7th ACM international conference on Computing frontiers
Pub Date: 2010-05-17
DOI: 10.1145/1787275.1787289

José L. Abellán, Juan Fernández, M. Acacio

Citations: 11

Abstract

We present a novel hardware-based barrier mechanism for synchronization on many-core CMPs. Specifically, we leverage global interconnection lines (G-lines) and the S-CSMA technique, both previously used to overcome some limitations of a flow control mechanism (EVC) in Networks-on-Chip, to build a simple G-line-based network that operates independently of the main data network and carries out barrier synchronizations. We evaluate our approach by running several applications on the Sim-PowerCMP performance simulator. Once all cores or threads have arrived at the barrier, our method takes only 4 cycles to complete the synchronization. As a result, it performs much better than software-based barrier implementations in terms of scalability and efficiency.
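
For contrast with the hardware mechanism, the following is a minimal sketch of a centralized sense-reversing software barrier, a common textbook design. The abstract does not say which software barriers served as baselines, so this choice is an assumption, and the names `SenseReversingBarrier` and `run_phases` are hypothetical. On a real CMP, the shared counter and sense flag in such a design would be accessed through the memory system and the main data network, which is the kind of traffic a dedicated barrier network would presumably avoid.

```python
import threading


class SenseReversingBarrier:
    """Centralized software barrier: one shared counter plus a sense flag.

    Illustrative only; not the mechanism or the baseline from the paper.
    """

    def __init__(self, n_threads):
        self.n_threads = n_threads
        self.count = 0          # threads that have arrived in this episode
        self.sense = False      # global sense, flipped by the last arriver
        self.cond = threading.Condition()

    def wait(self):
        with self.cond:
            # Value the global sense will take once this episode is released.
            local_sense = not self.sense
            self.count += 1
            if self.count == self.n_threads:
                # Last arriver: reset the counter and release everyone.
                self.count = 0
                self.sense = local_sense
                self.cond.notify_all()
            else:
                # Earlier arrivers block until the sense flips.
                while self.sense != local_sense:
                    self.cond.wait()


def run_phases(n_threads=4, n_phases=3):
    """Each thread logs (phase, thread_id), then waits at the barrier."""
    barrier = SenseReversingBarrier(n_threads)
    log = []
    log_lock = threading.Lock()

    def worker(tid):
        for phase in range(n_phases):
            with log_lock:
                log.append((phase, tid))
            barrier.wait()

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

Calling `run_phases(4, 3)` returns a log in which every phase-0 entry precedes every phase-1 entry, and so on, because no thread can start a new phase until all threads have reached the barrier. Every arrival here serializes on one shared lock and counter, which is one reason software barriers of this kind tend to scale poorly as the core count grows.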
