Hardware–Algorithm Codesigned Low-Latency and Resource-Efficient OMP Accelerator for DOA Estimation on FPGA
Authors: Ruichang Jiang; Wenbin Ye
Journal: IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 33, no. 2, pp. 421-434
Publication date: 2024-09-26
DOI: 10.1109/TVLSI.2024.3462467
URL: https://ieeexplore.ieee.org/document/10694732/
Citations: 0
Abstract
This article introduces an algorithm-hardware codesign optimized for low-latency and resource-efficient direction-of-arrival (DOA) estimation, employing a refined orthogonal matching pursuit (OMP) algorithm adept at handling the complexities of multisource detection, particularly in scenarios with closely spaced signal sources. At the algorithmic level, this approach incorporates a secondary correction mechanism (SCM) into the traditional OMP algorithm, significantly improving estimation accuracy and robustness. On the hardware front, a bespoke OMP accelerator has been developed, featuring a reconfigurable generic processing element (PE) array that supports various computational modes and leverages a multilevel spectral peak search strategy and pipelining techniques to enhance computational efficiency. Experimental evaluations reveal that the proposed system achieves a root mean square error (RMSE) for DOA estimation of less than 0.3° in multisource conditions with a signal-to-noise ratio (SNR) of 20 dB. In addition, the deployment of the OMP accelerator on a Zynq XC7Z020 development board utilizes modest logic resources: 5.49k LUTs, 3.28k FFs, 11.5 BRAMs, and 32 DSPs. Furthermore, the design achieves a computational latency of $2.83~\mu\text{s}$ for single-source estimation with eight antennas. These results reflect a reduction of approximately 17.8% in LUTs, 56.3% in FFs, and 5.7% in DSPs compared to current leading-edge technologies after normalization, all while maintaining competitive estimation accuracy and favorable estimation rates.
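For orientation, the following is a minimal NumPy sketch of the baseline (uncorrected) OMP step for grid-based DOA estimation on a uniform linear array. It is not the authors' method: the abstract does not specify the SCM, the PE array, or the multilevel peak search, so none of them is implemented here. The half-wavelength element spacing, the 1° angle grid, the single noiseless snapshot, and the function names `steering_matrix` and `omp_doa` are all assumptions made for illustration.

```python
import numpy as np


def steering_matrix(angles_deg, n_antennas, spacing=0.5):
    """ULA steering vectors, one column per angle (spacing in wavelengths, assumed 0.5)."""
    angles = np.deg2rad(np.asarray(angles_deg, dtype=float))
    n = np.arange(n_antennas)[:, None]  # antenna index as a column vector
    return np.exp(-2j * np.pi * spacing * n * np.sin(angles)[None, :])


def omp_doa(y, grid_deg, n_sources):
    """Baseline OMP on one snapshot y; returns sorted angle estimates in degrees."""
    grid_deg = np.asarray(grid_deg, dtype=float)
    A = steering_matrix(grid_deg, y.shape[0])  # dictionary of candidate directions
    residual = y.astype(complex)
    support = []
    for _ in range(n_sources):
        # Spectral peak search: correlate the residual with every grid atom.
        corr = np.abs(A.conj().T @ residual)
        if support:
            corr[support] = 0.0  # never pick the same grid angle twice
        support.append(int(np.argmax(corr)))
        # Least-squares re-fit on the selected atoms, then update the residual.
        A_s = A[:, support]
        coeffs, *_ = np.linalg.lstsq(A_s, y, rcond=None)
        residual = y - A_s @ coeffs
    return sorted(float(grid_deg[i]) for i in support)


# Hypothetical usage: one noiseless source at 10 degrees, eight antennas.
grid = np.arange(-90.0, 91.0, 1.0)
y = steering_matrix([10.0], n_antennas=8)[:, 0]
print(omp_doa(y, grid, n_sources=1))
```

With closely spaced sources, the first correlation peak in a plain OMP loop like this can land between or beside the true angles. That is the kind of error the abstract says the SCM is meant to address.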
About the journal:
The IEEE Transactions on VLSI Systems is published as a monthly journal under the co-sponsorship of the IEEE Circuits and Systems Society, the IEEE Computer Society, and the IEEE Solid-State Circuits Society.
Design and realization of microelectronic systems using VLSI/ULSI technologies require close collaboration among scientists and engineers in the fields of systems architecture, logic and circuit design, chips and wafer fabrication, packaging, testing and systems applications. Generation of specifications, design and verification must be performed at all abstraction levels, including the system, register-transfer, logic, circuit, transistor and process levels.
To address this critical area through a common forum, the IEEE Transactions on VLSI Systems was founded. The editorial board, consisting of international experts, invites original papers that emphasize the novel systems integration aspects of microelectronic systems, including interactions among systems design and partitioning, logic and memory design, digital and analog circuit design, layout synthesis, CAD tools, chips and wafer fabrication, testing and packaging, and systems-level qualification. Thus, the coverage of these Transactions will focus on VLSI/ULSI microelectronic systems integration.