Shuffling: A framework for lock contention aware thread scheduling for multicore multiprocessor systems

K. Pusukuri, Rajiv Gupta, L. Bhuyan
DOI: 10.1145/2628071.2628074
URL: https://doi.org/10.1145/2628071.2628074
Venue: 2014 23rd International Conference on Parallel Architecture and Compilation (PACT)
Publication date: 2014-08-24
Citations: 16

Abstract

On a cache-coherent multicore multiprocessor system, the performance of a multithreaded application with high lock contention is very sensitive to the distribution of application threads across multiple processors (or Sockets). This is because the distribution of threads affects the frequency of lock transfers between Sockets, which in turn affects the frequency of last-level cache (LLC) misses that lie on the critical path of execution. Since the latency of an LLC miss is high, an increase in LLC misses on the critical path increases both lock acquisition latency and critical section processing time. However, thread schedulers in operating systems such as Solaris and Linux are oblivious to the lock contention among the threads of an application and therefore fail to deliver high performance for multithreaded applications. To alleviate this problem, we propose a scheduling framework called Shuffling, which migrates threads of a multithreaded program across Sockets so that threads seeking locks are more likely to find the locks on the same Socket. Shuffling reduces the time threads spend acquiring locks and speeds up shared data accesses in the critical section, ultimately reducing the execution time of the application. We have implemented Shuffling on a 64-core Supermicro server running Oracle Solaris 11™ and evaluated it using a diverse set of 20 multithreaded programs with high lock contention. Our experiments show that Shuffling achieves up to a 54% reduction in execution time and an average reduction of 13%. Moreover, it does not require any changes to the application source code or the OS kernel.
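The abstract says that Shuffling migrates threads across Sockets so that lock-seeking threads tend to share a Socket, but it does not spell out the grouping policy. The following is a minimal sketch of one plausible policy, not the authors' implementation: it assumes that per-thread lock times have already been measured, ranks threads by that value, and cuts the ranking into contiguous groups, one per Socket. The function names, the sample numbers, and the policy itself are illustrative assumptions.

```python
from collections import defaultdict


def plan_socket_assignment(lock_time_by_thread, num_sockets):
    """Map each thread id to a socket index.

    Assumed policy (not taken from the abstract): rank threads by observed
    lock time, then cut the ranking into contiguous, near-equal groups,
    one group per socket, so the heaviest lock users share a socket.
    """
    if num_sockets < 1:
        raise ValueError("num_sockets must be at least 1")
    # Heaviest lock users first; ties broken by thread id for determinism.
    ranked = sorted(lock_time_by_thread,
                    key=lambda t: (-lock_time_by_thread[t], t))
    assignment = {}
    for rank, thread_id in enumerate(ranked):
        # Integer scaling maps a rank to a contiguous block index
        # in the range [0, num_sockets).
        assignment[thread_id] = rank * num_sockets // len(ranked)
    return assignment


def threads_per_socket(assignment):
    """Invert the plan: socket index -> sorted list of thread ids."""
    groups = defaultdict(list)
    for thread_id, socket in sorted(assignment.items()):
        groups[socket].append(thread_id)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical samples: percent of elapsed time spent waiting on locks.
    sample = {1: 42.0, 2: 3.5, 3: 40.1, 4: 5.0,
              5: 38.7, 6: 1.2, 7: 44.9, 8: 2.8}
    plan = plan_socket_assignment(sample, num_sockets=2)
    print(threads_per_socket(plan))
```

The sketch only computes a placement plan. Applying it would need an OS-specific binding mechanism and a way to sample lock times periodically, both of which are left out here because the abstract gives no details about them.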