When less is more (LIMO):controlled parallelism forimproved efficiency

Gaurav Chadha, S. Mahlke, S. Narayanasamy
{"title":"When less is more (LIMO):controlled parallelism forimproved efficiency","authors":"Gaurav Chadha, S. Mahlke, S. Narayanasamy","doi":"10.1145/2380403.2380431","DOIUrl":null,"url":null,"abstract":"While developing shared-memory programs, programmers often contend with the problem of how many threads to create for best efficiency. Creating as many threads as the number of available processor cores, or more, may not be the most efficient configuration. Too many threads can result in excessive contention for shared resources, wasting energy, which is of primary concern for embedded devices. Furthermore, thermal and power constraints prevent us from operating all the processor cores at the highest possible frequency, favoring fewer threads. The best number of threads to run depends on the application, user input and hardware resources available. It can also change at runtime making it infeasible for the programmer to determine this number.\n To address this problem, we propose LIMO, a runtime system that dynamically manages the number of running threads of an application for maximizing peformance and energy-efficiency. LIMO monitors threads' progress along with the usage of shared hardware resources to determine the best number of threads to run and the voltage and frequency level. With dynamic adaptation, LIMO provides an average of 21% performance improvement and a 2x improvement in energy-efficiency on a 32-core system over the default configuration of 32 threads for a set of concurrent applications from the PARSEC suite, the Apache web server, and the Sphinx speech recognition system.","PeriodicalId":136293,"journal":{"name":"International Conference on Compilers, Architecture, and Synthesis for Embedded Systems","volume":"103 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"30","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Compilers, Architecture, and Synthesis for Embedded Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2380403.2380431","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 30

Abstract

While developing shared-memory programs, programmers often contend with the problem of how many threads to create for best efficiency. Creating as many threads as the number of available processor cores, or more, may not be the most efficient configuration. Too many threads can result in excessive contention for shared resources, wasting energy, which is of primary concern for embedded devices. Furthermore, thermal and power constraints prevent us from operating all the processor cores at the highest possible frequency, favoring fewer threads. The best number of threads to run depends on the application, user input and hardware resources available. It can also change at runtime making it infeasible for the programmer to determine this number. To address this problem, we propose LIMO, a runtime system that dynamically manages the number of running threads of an application for maximizing peformance and energy-efficiency. LIMO monitors threads' progress along with the usage of shared hardware resources to determine the best number of threads to run and the voltage and frequency level. With dynamic adaptation, LIMO provides an average of 21% performance improvement and a 2x improvement in energy-efficiency on a 32-core system over the default configuration of 32 threads for a set of concurrent applications from the PARSEC suite, the Apache web server, and the Sphinx speech recognition system.
当少即是多(LIMO):控制并行以提高效率
在开发共享内存程序时,程序员经常要解决创建多少线程才能获得最佳效率的问题。创建与可用处理器内核数量相同或更多的线程可能不是最有效的配置。太多的线程会导致对共享资源的过度争用,浪费能源,这是嵌入式设备主要关心的问题。此外,热和功率限制使我们无法以尽可能高的频率运行所有处理器内核,从而支持更少的线程。运行的最佳线程数取决于应用程序、用户输入和可用的硬件资源。它也可能在运行时发生变化,使得程序员无法确定这个数字。为了解决这个问题,我们提出了LIMO,一个动态管理应用程序运行线程数量的运行时系统,以最大化性能和能源效率。LIMO监视线程的进程以及共享硬件资源的使用情况,以确定要运行的最佳线程数以及电压和频率级别。对于来自PARSEC套件、Apache web服务器和Sphinx语音识别系统的一组并发应用程序,与默认配置的32个线程相比,LIMO在32核系统上提供了平均21%的性能改进和2倍的能效改进。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信