BLQ: Light-Weight Locality-Aware Runtime for Blocking-Less Queuing

Qinzhe Wu, Ruihao Li, Jonathan Beard, L. K. John
Venue: International Conference on Compiler Construction, vol. 192, pp. 100-112
Published: 2024-02-17
DOI: https://doi.org/10.1145/3640537.3641568

Abstract

Message queues are widely used in parallel processing systems for worker thread synchronization. When the throughput of upstream and downstream tasks is mismatched, the message queue buffer is often either empty or full. Polling on an empty or full queue hurts the performance of the upstream or downstream threads, since those polling cycles could have been spent on other computation. A non-blocking queue is an alternative that allows polling cycles to be spent on other tasks of the application's choosing. However, application programmers should not have to bear this burden, because a good decision about what to do upon blocking has to take much runtime environment information into consideration. This paper proposes the Blocking-Less Queuing Runtime (BLQ), a systematic solution capable of finding the proper strategies at (or before) blocking while lightening the programmers' burden. BLQ combines a set of solutions, including yielding, advanced dynamic queue buffer resizing, and resource-aware task scheduling. The evaluation on high-end servers shows that BLQ reduces blocking and lowers cache misses across a diverse set of parallel queuing workloads. BLQ outperforms the baseline runtime considerably (with up to 3.8× peak speedup).
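The difference between polling and non-blocking queue operations can be illustrated with a small sketch. It is not BLQ's interface or implementation: the names `BoundedQueue`, `try_push`, `try_pop`, and `run_pipeline` are hypothetical, and the single-threaded loop only simulates a fast producer feeding a slower consumer. It shows the general idea that a failed push returns immediately, so the cycle can go to other work instead of being spent polling a full buffer.

```python
from collections import deque


class BoundedQueue:
    """Hypothetical fixed-capacity FIFO whose operations never block."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()

    def try_push(self, item):
        # Report failure on a full buffer instead of spinning until space appears.
        if len(self.items) >= self.capacity:
            return False
        self.items.append(item)
        return True

    def try_pop(self):
        # Return (ok, item); ok is False when the buffer is empty.
        if not self.items:
            return False, None
        return True, self.items.popleft()


def run_pipeline(n_items, capacity, consumer_period):
    """Simulate a producer that offers one item per step and a consumer
    that drains one item every `consumer_period` steps."""
    q = BoundedQueue(capacity)
    produced = consumed = other_work = 0
    step = 0
    while consumed < n_items:
        step += 1
        if produced < n_items:
            if q.try_push(produced):
                produced += 1
            else:
                # Queue full: spend this cycle on other work rather than polling.
                other_work += 1
        if step % consumer_period == 0:
            ok, _ = q.try_pop()
            if ok:
                consumed += 1
    return produced, consumed, other_work
```

Calling `run_pipeline(10, 2, 3)` moves all ten items through a two-slot buffer while counting how many producer cycles were redirected to other work. Deciding what that other work should be is the part the abstract argues a runtime, not the application programmer, should handle.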