Locality-aware memory association for multi-target worksharing in OpenMP

2014 23rd International Conference on Parallel Architecture and Compilation (PACT). Pub Date: 2014-08-24. DOI: 10.1145/2628071.2671428

T. Scogland, W. Feng


Citations: 0

Abstract


Locality-aware memory association for multi-target worksharing in OpenMP

Heterogeneity is an ever-growing challenge in computing. The clearest example is the increasing popularity of GPUs, and purpose-designed coprocessors such as Intel Xeon Phi. Even disregarding coprocessors, heterogeneity continues to increase with the rise in CPU core counts, adaptive per-core frequencies, and increasingly hierarchical and complex memory systems. Take a system with four memory nodes, associated with four cores each, and four GPUs, each with a distinct address space and tens to hundreds of cores programmed like a bulk-synchronous parallel cluster. In this case, we are effectively programming clusters of miniature constellations in every node.
