OnSort: An O(n) Comparison-Free Sorter for Large-Scale Dataset With Parallel Prefetching and Sparse-Aware Mechanism
Muxuan Gao; Juntao Jiang; Shuangming Lei; Huifeng Wu; Jun Chen; Yong Liu
IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 72, no. 7, pp. 933-937. Publication date: 2025-03-16.
DOI: 10.1109/TCSII.2025.3570797
https://ieeexplore.ieee.org/document/11006127/
Journal impact factor: 4.9; JCR: Q2 (Engineering, Electrical & Electronic)
Abstract
This brief proposes OnSort, a parallel comparison-free sorting architecture with $\mathcal{O}(n)$ time complexity that uses an SRAM structure to support large-scale datasets efficiently. The performance of existing comparison-free sorters is limited by uneven value distributions and by variable element counts. To address these issues, we introduce a parallel prefetching strategy to accelerate the indexing process and a sparse-aware mechanism to narrow the indexing search range. Furthermore, OnSort implements streaming execution through a pipelined design, thereby reducing the previously overlooked latency of the counting phase. Experimental results show that, when sorting 65,536 16-bit data elements, OnSort achieves a $1.97\times$ speedup and a throughput-to-area ratio $22.6\times$ that of the existing design. The source code is available at https://github.com/gmx-hub/OnSort.
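For readers unfamiliar with comparison-free sorting, the following is a minimal software sketch of the general idea: a counting phase followed by an indexing phase. It is not the authors' hardware architecture. The function name `counting_sort_sparse`, the `block` parameter, and the per-block occupancy flags are hypothetical, chosen only to illustrate how recording which value ranges are populated can narrow the indexing search; parallel prefetching and pipelining are not modeled.

```python
from typing import List


def counting_sort_sparse(values: List[int], bits: int = 16, block: int = 64) -> List[int]:
    """Comparison-free sort of unsigned `bits`-bit integers.

    Software illustration only. `block` and the occupancy flags are
    hypothetical stand-ins for a sparse-aware search-range reduction.
    """
    size = 1 << bits
    counts = [0] * size  # one counter per possible value
    occupied = [False] * ((size + block - 1) // block)  # coarse per-block flags

    # Counting phase: a single pass over the input, with no
    # element-to-element comparisons.
    for v in values:
        if not 0 <= v < size:
            raise ValueError(f"value {v} does not fit in {bits} bits")
        counts[v] += 1
        occupied[v // block] = True

    # Indexing phase: walk the value range in ascending order,
    # skipping blocks that received no elements.
    out: List[int] = []
    for b, used in enumerate(occupied):
        if not used:
            continue
        start = b * block
        for v in range(start, min(start + block, size)):
            if counts[v]:
                out.extend([v] * counts[v])
    return out
```

For example, `counting_sort_sparse([5, 3, 65535, 3, 0])` returns `[0, 3, 3, 5, 65535]`. In this sketch the work is proportional to the number of elements plus the size of the value range; for 65,536 16-bit elements the two are equal, since 2^16 = 65,536.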
About the journal:
TCAS II publishes brief papers in the field specified by the theory, analysis, design, and practical implementations of circuits, and the application of circuit techniques to systems and to signal processing. Included is the whole spectrum from basic scientific theory to industrial applications. The field of interest covered includes:
Circuits: Analog, Digital and Mixed Signal Circuits and Systems
Nonlinear Circuits and Systems, Integrated Sensors, MEMS and Systems on Chip, Nanoscale Circuits and Systems, Optoelectronic Circuits and Systems, Power Electronics and Systems
Software for Analog-and-Logic Circuits and Systems
Control aspects of Circuits and Systems.