Pthreads Performance Characteristics on Shared Cache CMP, Private Cache CMP and SMP

I. Tan, I. Chai, Poo Kuan Hoong
2010 Second International Conference on Computer Engineering and Applications. Published 2010-03-19. DOI: 10.1109/ICCEA.2010.44 (https://doi.org/10.1109/ICCEA.2010.44)
Citations: 5

Abstract

With the wide availability of chip multi-processing (CMP), software developers now face the task of effectively parallelizing their code. Once they have identified the areas to parallelize, they need to know the level of code granularity required for parallel execution to be profitable. The problem is compounded by the variety of hardware available. In this paper, we present a novel approach for fairly comparing hardware configurations by configuring a pair of quad-core processors to simulate them. The simulated configurations represent shared cache CMP, private cache CMP and symmetric multiprocessor (SMP) environments. We then present a modified lmbench micro-benchmark suite to measure the cost of threading on these different hardware configurations. In our empirical studies, we observe that shared cache CMP exhibits better performance when the operating system's load balancer is highly active. However, the measurements also indicate that thread size is an important consideration, because cache thrashing can occur when a cache is shared between processing cores. Private cache CMP and SMP do not exhibit a significant difference in our measurements. The techniques presented can be incorporated into integrated development environments, compilers and potentially even other run-time environments.
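To make "the cost of threading" concrete, the following is a minimal sketch of one way such a cost could be measured with Pthreads. It is not the modified lmbench suite described in the abstract: the function names `noop` and `thread_cost_ns` are hypothetical, and the sketch assumes a POSIX system that provides `clock_gettime` with `CLOCK_MONOTONIC`. It times repeated create/join pairs around an empty thread body and reports the mean.

```c
/* Sketch only: mean cost of creating and joining one POSIX thread.
   Not the paper's benchmark; all names here are illustrative. */
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <time.h>

/* Empty thread body, so the timing is dominated by create/join overhead. */
static void *noop(void *arg) {
    return arg;
}

/* Returns mean nanoseconds per create+join pair, or -1.0 on failure
   (non-positive iteration count, clock error, or pthread error). */
double thread_cost_ns(int iterations) {
    struct timespec start, end;

    if (iterations <= 0) return -1.0;
    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0) return -1.0;

    for (int i = 0; i < iterations; i++) {
        pthread_t tid;
        if (pthread_create(&tid, NULL, noop, NULL) != 0) return -1.0;
        if (pthread_join(tid, NULL) != 0) return -1.0;
    }

    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0) return -1.0;

    /* Convert the elapsed timespec difference to nanoseconds. */
    double elapsed = (double)(end.tv_sec - start.tv_sec) * 1e9
                   + (double)(end.tv_nsec - start.tv_nsec);
    return elapsed / iterations;
}
```

Calling `thread_cost_ns(10000)` from a `main` function and printing the result gives a rough per-thread overhead for the machine at hand. Comparing hardware configurations as the abstract describes would additionally require controlling which cores the threads run on, which this sketch does not attempt.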