Hardware–Algorithm Codesigned Low-Latency and Resource-Efficient OMP Accelerator for DOA Estimation on FPGA

IF 2.8, Zone 2 (Engineering and Technology), JCR Q2, Computer Science, Hardware & Architecture

IEEE Transactions on Very Large Scale Integration (VLSI) Systems Pub Date : 2024-09-26 DOI:10.1109/TVLSI.2024.3462467

Ruichang Jiang;Wenbin Ye

Vol. 33, No. 2, pp. 421-434. https://ieeexplore.ieee.org/document/10694732/

Citations: 0

Abstract

This article introduces an algorithm-hardware codesign optimized for low-latency and resource-efficient direction-of-arrival (DOA) estimation, employing a refined orthogonal matching pursuit (OMP) algorithm adept at handling the complexities of multisource detection, particularly in scenarios with closely spaced signal sources. At the algorithmic level, this approach incorporates a secondary correction mechanism (SCM) into the traditional OMP algorithm, significantly improving estimation accuracy and robustness. On the hardware front, a bespoke OMP accelerator has been developed, featuring a reconfigurable generic processing element (PE) array that supports various computational modes and leverages multilevel spectral peak search strategy and pipelining techniques to enhance computational efficiency. Experimental evaluations reveal that the proposed system achieves a root mean square error (RMSE) for DOA estimation of less than 0.3° in multisource conditions with a signal-to-noise ratio (SNR) of 20 dB. In addition, the deployment of the OMP accelerator on a Zynq XC7Z020 development board utilizes modest logic resources: 5.49k LUTs, 3.28k FFs, 11.5 BRAMs, and 32 DSPs. Furthermore, the design achieves a computational latency of $2.83~\mu\text{s}$ for single-source estimation with eight antennas. This achievement reflects a reduction of approximately 17.8% in LUTs, 56.3% in FFs, and 5.7% in DSPs compared to current leading-edge technologies after normalization, all while maintaining competitive estimation accuracy and favorable estimation rates.
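As a rough illustration of the baseline the abstract builds on, the sketch below runs textbook OMP for DOA estimation on a single noise-free snapshot. It is not the paper's refined algorithm: the secondary correction mechanism, the multilevel spectral peak search, and the PE-array datapath are not modeled. The uniform linear array with half-wavelength spacing, the 1° angle grid, and the function names (`steering_vector`, `inner`, `omp_doa`) are all assumptions made for this example.

```python
import cmath
import math


def steering_vector(theta_deg, n_antennas, spacing=0.5):
    """Steering vector of a uniform linear array (spacing in wavelengths)."""
    phase = 2.0 * math.pi * spacing * math.sin(math.radians(theta_deg))
    return [cmath.exp(1j * phase * m) for m in range(n_antennas)]


def inner(a, b):
    """Complex inner product <a, b> = sum(conj(a_i) * b_i)."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def omp_doa(snapshot, grid_deg, n_sources):
    """Textbook OMP: pick n_sources grid angles that best explain the snapshot.

    The grid should not contain both -90 and 90, whose steering vectors
    coincide at half-wavelength spacing.
    """
    n = len(snapshot)
    atoms = [steering_vector(t, n) for t in grid_deg]  # dictionary of candidates
    residual = list(snapshot)
    basis = []   # orthonormal basis spanning the atoms selected so far
    picked = []  # grid indices of the selected atoms

    for _ in range(n_sources):
        # Matching step: correlate every atom with the current residual.
        scores = [abs(inner(a, residual)) for a in atoms]
        for idx in picked:
            scores[idx] = -1.0  # never select the same atom twice
        best = max(range(len(atoms)), key=scores.__getitem__)
        picked.append(best)

        # Orthogonalize the new atom against the basis (Gram-Schmidt).
        q = list(atoms[best])
        for b in basis:
            c = inner(b, q)
            q = [qi - c * bi for qi, bi in zip(q, b)]
        norm = math.sqrt(inner(q, q).real)
        q = [qi / norm for qi in q]
        basis.append(q)

        # Projection step: remove the new direction from the residual.
        c = inner(q, residual)
        residual = [r - c * qi for r, qi in zip(residual, q)]

    return sorted(grid_deg[i] for i in picked)


if __name__ == "__main__":
    grid = list(range(-89, 90))            # 1-degree search grid (assumed)
    snapshot = steering_vector(30, 8)      # one source at 30 degrees, 8 antennas
    print(omp_doa(snapshot, grid, 1))      # [30]
```

With several sources in one snapshot, the greedy picks of plain OMP can be pulled away from the true angles by interference between the sources. This is presumably the kind of error a correction stage is meant to address, though the sketch makes no attempt to reproduce the paper's mechanism.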


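Source journal

IEEE Transactions on Very Large Scale Integration (VLSI) Systems (Engineering and Technology: Electrical and Electronic Engineering)

CiteScore: 6.40
Self-citation rate: 7.10%
Articles published: 187
Review time: 3.6 months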

Journal description: The IEEE Transactions on VLSI Systems is published as a monthly journal under the co-sponsorship of the IEEE Circuits and Systems Society, the IEEE Computer Society, and the IEEE Solid-State Circuits Society. Design and realization of microelectronic systems using VLSI/ULSI technologies require close collaboration among scientists and engineers in the fields of systems architecture, logic and circuit design, chips and wafer fabrication, packaging, testing and systems applications. Generation of specifications, design and verification must be performed at all abstraction levels, including the system, register-transfer, logic, circuit, transistor and process levels. To address this critical area through a common forum, the IEEE Transactions on VLSI Systems have been founded. The editorial board, consisting of international experts, invites original papers which emphasize and merit the novel systems integration aspects of microelectronic systems including interactions among systems design and partitioning, logic and memory design, digital and analog circuit design, layout synthesis, CAD tools, chips and wafer fabrication, testing and packaging, and systems level qualification. Thus, the coverage of these Transactions will focus on VLSI/ULSI microelectronic systems integration.