GSpecPal: Speculation-Centric Finite State Machine Parallelization on GPUs

Yuguang Wang, Robbie Watling, Junqiao Qiu, Zhenlin Wang
{"title":"GSpecPal: Speculation-Centric Finite State Machine Parallelization on GPUs","authors":"Yuguang Wang, Robbie Watling, Junqiao Qiu, Zhenlin Wang","doi":"10.1109/ipdps53621.2022.00053","DOIUrl":null,"url":null,"abstract":"Finite State Machine (FSM) plays a critical role in many real-world applications, ranging from pattern matching to network security. In recent years, significant research efforts have been made to accelerate FSM computations on different parallel platforms, including multicores, GPUs, and DRAM-based accelerators. A popular direction is the speculation-centric parallelization. Despite their abundance and promising results, the benefits of speculation-centric FSM parallelization on GPUs heavily depend on high speculation accuracy and are greatly limited by the inefficient sequential recovery. Inspired by speculative data forwarding used in Thread Level Speculation (TLS), this work addresses the existing bottlenecks by introducing speculative recovery with two heuristics for thread scheduling, which can effectively remove redundant computations and increase the GPU thread utilization. To maximize the performance of running FSMs on GPUs, this work integrates different speculative parallelization schemes into a latency-sensitive framework, GSpecPal, along with a scheme selector which aims to automatically configure the optimal GPU-based parallelization for a given FSM. Evaluation on a set of real-world FSMs with diverse characteristics confirms the effectiveness of GSpecPal. Experimental results show that GSpecPal can obtain 7.2× speedup on average (up to 20×) over the state-of-the-art on an Nvidia GeForce RTX 3090 GPU.","PeriodicalId":321801,"journal":{"name":"2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ipdps53621.2022.00053","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

Finite State Machines (FSMs) play a critical role in many real-world applications, ranging from pattern matching to network security. In recent years, significant research efforts have been made to accelerate FSM computations on different parallel platforms, including multicores, GPUs, and DRAM-based accelerators. A popular direction is speculation-centric parallelization. Despite abundant work and promising results, the benefits of speculation-centric FSM parallelization on GPUs depend heavily on high speculation accuracy and are greatly limited by inefficient sequential recovery. Inspired by the speculative data forwarding used in Thread-Level Speculation (TLS), this work addresses these bottlenecks by introducing speculative recovery together with two heuristics for thread scheduling, which effectively remove redundant computation and increase GPU thread utilization. To maximize the performance of running FSMs on GPUs, this work integrates the different speculative parallelization schemes into a latency-sensitive framework, GSpecPal, along with a scheme selector that automatically configures the optimal GPU-based parallelization for a given FSM. Evaluation on a set of real-world FSMs with diverse characteristics confirms the effectiveness of GSpecPal. Experimental results show that GSpecPal obtains a 7.2× speedup on average (up to 20×) over the state of the art on an Nvidia GeForce RTX 3090 GPU.
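For readers unfamiliar with the approach, the sketch below illustrates the baseline speculation-plus-sequential-recovery scheme that the abstract identifies as the limiting factor, not GSpecPal itself: the input is partitioned into chunks, every GPU thread executes its chunk from a guessed start state, and a serial host pass validates the guesses and re-executes only the mispredicted chunks. The toy parity automaton, chunk sizes, kernel name, and all other identifiers are illustrative assumptions rather than details taken from the paper.

```cuda
// A minimal sketch (not the paper's implementation) of baseline speculation-centric
// FSM parallelization with sequential recovery, i.e., the scheme whose recovery cost
// GSpecPal targets. The toy 2-state FSM and all names are illustrative assumptions.
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

#define NUM_STATES 2
#define ALPHABET   2

// Toy transition table: delta[state][symbol] -> next state (parity automaton).
__constant__ int d_delta[NUM_STATES][ALPHABET];

__host__ __device__ inline int run_chunk(const int* input, int len, int state,
                                         const int delta[NUM_STATES][ALPHABET]) {
    for (int i = 0; i < len; ++i)
        state = delta[state][input[i]];
    return state;
}

// Each thread speculatively processes one chunk from a guessed start state.
__global__ void speculate_kernel(const int* input, int chunk_len, int num_chunks,
                                 const int* guess, int* end_state) {
    int c = blockIdx.x * blockDim.x + threadIdx.x;
    if (c >= num_chunks) return;
    end_state[c] = run_chunk(input + c * chunk_len, chunk_len, guess[c], d_delta);
}

int main() {
    const int h_delta[NUM_STATES][ALPHABET] = {{0, 1}, {1, 0}};
    const int chunk_len = 1 << 10, num_chunks = 256, n = chunk_len * num_chunks;

    std::vector<int> h_in(n);
    for (int i = 0; i < n; ++i) h_in[i] = i % ALPHABET;      // synthetic input stream
    std::vector<int> h_guess(num_chunks, 0), h_end(num_chunks);

    int *d_in, *d_guess, *d_end;
    cudaMalloc(&d_in, n * sizeof(int));
    cudaMalloc(&d_guess, num_chunks * sizeof(int));
    cudaMalloc(&d_end, num_chunks * sizeof(int));
    cudaMemcpyToSymbol(d_delta, h_delta, sizeof(h_delta));
    cudaMemcpy(d_in, h_in.data(), n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(d_guess, h_guess.data(), num_chunks * sizeof(int), cudaMemcpyHostToDevice);

    speculate_kernel<<<(num_chunks + 127) / 128, 128>>>(d_in, chunk_len, num_chunks,
                                                        d_guess, d_end);
    cudaMemcpy(h_end.data(), d_end, num_chunks * sizeof(int), cudaMemcpyDeviceToHost);

    // Sequential recovery pass on the host: walk the chunks in order; where the true
    // start state matches the guess, reuse the speculative result, otherwise re-run
    // the chunk from the correct state. This serial re-execution is the bottleneck
    // that GSpecPal's speculative recovery and scheduling heuristics aim to remove.
    int state = 0;  // the FSM's actual initial state
    for (int c = 0; c < num_chunks; ++c) {
        if (state == h_guess[c])
            state = h_end[c];                                             // hit
        else
            state = run_chunk(h_in.data() + c * chunk_len, chunk_len,
                              state, h_delta);                            // miss
    }
    printf("final state: %d\n", state);

    cudaFree(d_in); cudaFree(d_guess); cudaFree(d_end);
    return 0;
}
```

Per the abstract, GSpecPal replaces the serial recovery loop above with a speculative recovery mechanism inspired by TLS-style speculative data forwarding, adds two thread-scheduling heuristics, and uses a scheme selector to pick the parallelization scheme per FSM; none of those mechanisms are reproduced in this sketch.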