Streaming Sorting Networks

M. Zuluaga, Peter Milder, Markus Püschel
ACM Trans. Design Autom. Electr. Syst., vol. 22, no. 1, pp. 55:1-55:30. Published 2016-09-22. Journal Article. DOI: 10.1145/2854150 (https://doi.org/10.1145/2854150)
Citations: 34

Abstract

Sorting is a fundamental problem in computer science and has been studied extensively. Thus, a large variety of sorting methods exist for both software and hardware implementations. For the latter, there is a trade-off between the throughput achieved and the cost (i.e., the logic and storage invested to sort $n$ elements). Two popular solutions are bitonic sorting networks with $O(n \log^2 n)$ logic and storage, which sort $n$ elements per cycle, and linear sorters with $O(n)$ logic and storage, which sort $n$ elements per $n$ cycles. In this article, we present new hardware structures that we call streaming sorting networks, which we derive through a mathematical formalism that we introduce, and an accompanying domain-specific hardware generator that translates our formal mathematical description into synthesizable RTL Verilog. With the new networks, we achieve novel and improved cost-performance trade-offs. For example, assuming that $n$ is a power of two and $w$ is any divisor of $n$, one class of these networks can sort in $n/w$ cycles with $O(w \log^2 n)$ logic and $O(n \log^2 n)$ storage; the other class that we present sorts in $n \log^2 n / w$ cycles with $O(w)$ logic and $O(n)$ storage. We carefully analyze the performance of these networks and their cost at three levels of abstraction: (1) asymptotically, (2) exactly in terms of the number of basic elements needed, and (3) in terms of the resources required by the actual circuit when mapped to a field-programmable gate array. The accompanying hardware generator allows us to explore the entire design space, identify the Pareto-optimal solutions, and show superior cost-performance trade-offs compared to prior work.
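To make the trade-off in the abstract concrete, the following Python sketch builds a textbook bitonic sorting network in software and tabulates the asymptotic figures quoted above. It is only an illustration under stated assumptions: it is not the authors' formalism or their hardware generator, the function names (`bitonic_stages`, `run_network`, `comparator_count`, `cost_estimates`) are hypothetical, and `cost_estimates` simply drops all constant factors from the big-O expressions, so its numbers are relative growth rates rather than real resource counts.

```python
def bitonic_stages(n):
    """Return the stages of a standard bitonic sorting network for n = 2^k inputs.

    Each stage is a list of (i, j, ascending) triples with i < j; all
    compare-exchange operations within one stage touch disjoint indices,
    so a fully parallel circuit could evaluate a stage at once.
    """
    if n < 1 or n & (n - 1) != 0:
        raise ValueError("n must be a power of two")
    stages = []
    k = 2
    while k <= n:              # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:          # distance between compared elements
            stage = []
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    stage.append((i, partner, ascending))
            stages.append(stage)
            j //= 2
        k *= 2
    return stages


def run_network(values):
    """Sort a list whose length is a power of two by applying every stage."""
    data = list(values)
    for stage in bitonic_stages(len(data)):
        for i, j, ascending in stage:
            out_of_order = data[i] > data[j] if ascending else data[i] < data[j]
            if out_of_order:
                data[i], data[j] = data[j], data[i]
    return data


def comparator_count(n):
    """Total number of compare-exchange elements in the network."""
    return sum(len(stage) for stage in bitonic_stages(n))


def cost_estimates(n, w):
    """Asymptotic figures from the abstract with constants dropped (assumption).

    n must be a power of two and w a divisor of n.
    """
    if n < 1 or n & (n - 1) != 0:
        raise ValueError("n must be a power of two")
    if w < 1 or n % w != 0:
        raise ValueError("w must be a divisor of n")
    lg2 = (n.bit_length() - 1) ** 2   # log^2 n for n = 2^k
    return {
        "class 1": {"cycles": n // w, "logic": w * lg2, "storage": n * lg2},
        "class 2": {"cycles": n * lg2 // w, "logic": w, "storage": n},
    }


if __name__ == "__main__":
    print(run_network([7, 3, 6, 0, 5, 1, 4, 2]))
    print(comparator_count(8))
    print(cost_estimates(8, 2))
```

In this sketch each stage holds $n/2$ comparators and there are $\log n (\log n + 1)/2$ stages, which is where the $O(n \log^2 n)$ logic of the fully parallel network comes from. Varying `w` in `cost_estimates` shows how the two streaming classes trade cycles against logic and storage.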