Performance Analysis and Benchmarking of the Intel SCC

P. Gschwandtner, T. Fahringer, R. Prodan
2011 IEEE International Conference on Cluster Computing, published 2011-09-26. DOI: 10.1109/CLUSTER.2011.24 (https://doi.org/10.1109/CLUSTER.2011.24)
Citations: 44

Abstract

Over the past years, CPU design has shifted steadily towards both many-core processors and power-aware hardware architectures. These two trends are combined in the Intel Single-chip Cloud Computer (SCC), an experimental prototype with 48 Pentium cores created by Intel Labs. The SCC is a highly configurable many-core chip that provides unique opportunities to optimize the run time, communication, and memory access as well as the power and energy consumption of parallel programs. The aim of this paper is to characterize, through benchmarking, the performance behavior of the chip under various power settings, mappings of processes and cores to memory controllers, and similar configuration options. Analytical models are used to verify and interpret the results. Our benchmark outcomes lead to three conclusions. First, data exchange based on message passing is faster than data exchange through shared memory. Second, contrary to popular belief, the fastest execution time does not yield the lowest energy consumption. Third, to improve memory access behavior, one should increase the clock frequency of both the mesh network and the memory controllers. In general, the results of our investigations can be used to analyze the effect of power settings and architecture properties on the performance and energy consumption of parallel programs, and to assist in choosing appropriate settings for specific workloads.
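The claim about energy and execution time can be illustrated with a toy model. The sketch below is not taken from the paper: it assumes a constant static power, a dynamic power proportional to V²·f, a linear voltage/frequency relation, and a purely compute-bound workload. All function names and numeric parameters are hypothetical and are not measured SCC values.

```python
# Toy energy model: E = P * t, with P = P_static + c * V^2 * f and t = work / f.
# Every constant here is an illustrative assumption, not SCC measurement data.

def voltage(freq_mhz):
    # Hypothetical linear voltage/frequency relation.
    return 0.7 + 0.6 * (freq_mhz - 100.0) / 700.0

def power_watts(freq_mhz, static_w=20.0, dyn_coeff=0.1):
    # Static power plus dynamic power proportional to V^2 * f.
    v = voltage(freq_mhz)
    return static_w + dyn_coeff * v * v * freq_mhz

def runtime_s(freq_mhz, work_mcycles=80000.0):
    # Compute-bound workload: run time scales inversely with frequency.
    return work_mcycles / freq_mhz

def energy_j(freq_mhz):
    # Energy is average power multiplied by run time.
    return power_watts(freq_mhz) * runtime_s(freq_mhz)

FREQS_MHZ = [100, 200, 300, 400, 500, 600, 700, 800]

fastest = min(FREQS_MHZ, key=runtime_s)
most_efficient = min(FREQS_MHZ, key=energy_j)

if __name__ == "__main__":
    for f in FREQS_MHZ:
        print(f"{f:4d} MHz  time={runtime_s(f):7.1f} s  energy={energy_j(f):9.1f} J")
    print("fastest setting:", fastest, "MHz")
    print("lowest-energy setting:", most_efficient, "MHz")
```

Under these assumed parameters the energy minimum falls at an intermediate frequency rather than at the fastest setting: at low frequencies the static power is paid over a long run time, while at high frequencies the rising voltage drives up the dynamic energy per cycle. Where the minimum actually lies on real hardware depends on the measured power characteristics and on how memory-bound the workload is.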