Challenging Sequential Bitstream Processing via Principled Bitwise Speculation

Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems. Pub Date: 2020-03-09. DOI: 10.1145/3373376.3378461

Junqiao Qiu, Lin Jiang, Zhijia Zhao


Citations: 7

Abstract


Challenging Sequential Bitstream Processing via Principled Bitwise Speculation

Many performance-critical applications, such as multimedia processing and bitmap indexing, traverse bitstreams with bitwise computations for better performance or higher space efficiency. However, when these bitwise computations carry dependences, the entire bitstream traversal becomes serial, fundamentally limiting scalability. In this work, we show that bitstream-carried dependences are actually "breakable" in many cases by adopting a systematic treatment: principled bitwise speculation (PBS). The core idea of PBS stems from an analogy between bitstream programs and sequential circuits, both of which transform binary sequences. From this perspective, it becomes natural to model the dependences in bitstream programs with finite-state machines (FSMs), a basic model for sequential circuits. To achieve this, PBS features a set of static analyses that reason about bitstream programs down to the bit level to identify the bits causing dependences; it then treats the value combinations of the dependent bits as states to construct FSMs. This modeling, for the first time, enables the use of FSM speculation techniques to parallelize bitstream programs. By leveraging the state convergence of FSMs, the values of dependent bits can be predicted with much higher accuracy. When a prediction fails, PBS tries to directly "rectify" the wrong outputs based on bitwise logic, minimizing mis-speculation costs. In addition, the FSM is in some cases even more efficient to execute than the original program, making it an optimized version that accelerates serial bitstream processing. We prototyped PBS using LLVM. Evaluation with real-world bitstream programs confirms the effectiveness of PBS, showing up to near-linear speedup on multicore/manycore machines.
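
The idea of treating a dependent bit as an FSM state and predicting it through state convergence can be illustrated with a small sketch. This is not the PBS implementation: the toy dependence (`out = bit AND NOT carry`, with `carry` updated to `out`), the function names `step`, `run_serial`, `predict_start_state`, and `run_speculative`, and the `chunk_size` and `lookback` parameters are all assumptions made for illustration. On a mis-speculation the sketch simply re-executes the chunk, whereas the abstract describes rectifying the wrong outputs with bitwise logic.

```python
def step(state, bit):
    """One FSM transition for the toy dependence.

    The single dependent bit (the carry) is the FSM state.
    Returns (next_state, output_bit).
    """
    out = bit & (1 - state)   # emit 1 only if the carry is clear
    return out, out           # the output becomes the next carry


def run_serial(bits, state=0):
    """Reference serial traversal; returns (outputs, final_state)."""
    outs = []
    for b in bits:
        state, out = step(state, b)
        outs.append(out)
    return outs, state


def predict_start_state(bits, start, lookback=8):
    """Guess the state at position `start` using a short lookback window.

    The window is run from every possible state. If all runs end in the
    same state (state convergence), the guess is certain to be correct.
    """
    lo = max(0, start - lookback)
    window = bits[lo:start]
    if lo == 0:
        # The window reaches the stream start, so the state is known exactly.
        return run_serial(window, 0)[1]
    end_states = {run_serial(window, s)[1] for s in (0, 1)}
    # Converged: a single candidate. Otherwise fall back to guessing 0.
    return end_states.pop() if len(end_states) == 1 else 0


def run_speculative(bits, chunk_size=16, lookback=8):
    """Chunked speculative traversal; returns (outputs, mis_speculations)."""
    starts = list(range(0, len(bits), chunk_size))

    # Phase 1: iterations are independent, so they could run in parallel.
    # They are executed in a plain loop here to keep the sketch simple.
    results = []
    for s in starts:
        guess = predict_start_state(bits, s, lookback)
        outs, end = run_serial(bits[s:s + chunk_size], guess)
        results.append((guess, outs, end))

    # Phase 2: serial validation of the guesses against the true states.
    all_outs, state, misses = [], 0, 0
    for s, (guess, outs, end) in zip(starts, results):
        if guess != state:
            # Mis-speculation: this sketch re-executes the whole chunk.
            misses += 1
            outs, end = run_serial(bits[s:s + chunk_size], state)
        all_outs.extend(outs)
        state = end
    return all_outs, misses
```

In this toy FSM any input bit of 0 forces the state to 0 regardless of the previous state, so a lookback window containing a 0 always converges and the prediction is exact. A stream of all ones never converges, which is the case where the fallback guess can be wrong and validation triggers re-execution.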
