CSSMT: Compiler Based Software Simultaneous Multithreading (SMT)

Yuanfang Chen, Qingchuan Shi, Xiaoming Li
Published in: 2018 26th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP)
Publication date: 2018-03-21
DOI: 10.1109/PDP2018.2018.00017 (https://doi.org/10.1109/PDP2018.2018.00017)

Abstract

Simultaneous multithreading (SMT) is a computer architecture feature that increases pipeline utilization and, in turn, instruction throughput. It does so by using instructions from other threads to fill both the vertical and horizontal superscalar pipeline slots that a native thread leaves empty. So far, SMT has been implemented in hardware, and hardware SMT has its limitations. First, it is complex and expensive to implement: only higher-end processors are equipped with it, even though lower-end processors have the same pipeline design as the higher-end variants and might also benefit from it. SMT also introduces large power, energy, and area overheads. Moreover, being a hardware feature, SMT is limited by the range and depth of instruction analysis that it can afford at execution time; it is therefore unlikely to benefit from high-level software knowledge about the instruction mix and may miss many improvement opportunities. In this paper, we address these limitations of hardware-based SMT and introduce CSSMT: Compiler Based Software Simultaneous Multithreading. The main contribution of CSSMT is that it exploits high-level program profiles to purposefully "re-mix" instructions from multiple programs so that they better fill vertical and horizontal superscalar pipeline slots, improving overall throughput. Furthermore, CSSMT is a software transformation technique that enables SMT at the software level at compile time. It can therefore help overcome the limitations of hardware-based SMT implementations and is more portable. We test CSSMT with programs from the SPEC2006 and NAS benchmarks and achieve up to a 12% speedup in execution time (a 30.7% improvement in multi-program throughput).
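To make the idea of filling empty pipeline slots concrete, the following is a minimal toy sketch and not the CSSMT algorithm itself. It assumes a hypothetical in-order machine that issues at most `width` instructions per cycle, and it assumes each instruction depends only on the previous instruction of its own program. The names `Instr`, `run_in_order`, and `remix` are invented for this illustration; the round-robin interleaving stands in for the profile-guided re-mixing described above, and the cycle counts say nothing about the paper's measured results.

```python
from dataclasses import dataclass
from itertools import zip_longest


@dataclass(frozen=True)
class Instr:
    prog: str     # program the instruction came from (hypothetical label)
    latency: int  # cycles until its result is available


def run_in_order(stream, width=2):
    """Return the cycle count for `stream` on a toy in-order machine.

    Assumptions (illustrative only): at most `width` instructions issue
    per cycle, and an instruction waits for the previous instruction of
    the same program to complete.
    """
    ready_at = {}  # program -> earliest cycle its next instruction may issue
    cycle = 0
    issued = 0     # instructions already issued in the current cycle
    finish = 0
    for ins in stream:
        if issued == width:           # horizontal slots exhausted
            cycle += 1
            issued = 0
        earliest = ready_at.get(ins.prog, 0)
        if earliest > cycle:          # stall: cycles with no issue at all
            cycle = earliest
            issued = 0
        issued += 1
        ready_at[ins.prog] = cycle + ins.latency
        finish = max(finish, cycle + ins.latency)
    return finish


def remix(a, b):
    """Interleave two instruction streams round-robin (a stand-in for
    profile-guided mixing, chosen only for simplicity)."""
    out = []
    for x, y in zip_longest(a, b):
        if x is not None:
            out.append(x)
        if y is not None:
            out.append(y)
    return out


# Two dependent chains of four 2-cycle instructions each.
prog_a = [Instr("A", 2) for _ in range(4)]
prog_b = [Instr("B", 2) for _ in range(4)]

back_to_back = run_in_order(prog_a) + run_in_order(prog_b)
mixed = run_in_order(remix(prog_a, prog_b))
print(back_to_back, mixed)
```

In this toy model, each chain run alone leaves most issue slots empty while it waits on its own results, so the mixed stream can finish both programs in fewer total cycles than running them back to back. A real compiler-based scheme would also have to handle register and memory conflicts between the programs, which this sketch ignores.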