Rouzbeh Molaei Imenabadi;Gregory R. Thoreson;Katherine G. Brown;Dinesh Bhatia
FPGA-Accelerated CNN Reconstruction for Low-Power Sparse-Array Ultrasound Imaging
IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, vol. 72, no. 12, pp. 1618-1636. Published 2025-11-07. DOI: 10.1109/TUFFC.2025.3630483
https://ieeexplore.ieee.org/document/11234914/
Impact factor: 3.7. JCR: Q1 (Acoustics).
Citations: 0
Abstract
Imaging of targeted organs, such as the urinary bladder, could be transformative for preventive healthcare and early disease diagnosis when used to assess their real-time function. However, wearable and portable ultrasound (US) imaging systems often face constraints related to power consumption, form factor, cost, and signal resolution, particularly for deep tissues like the bladder. High-accuracy platforms with large channel counts can generate data streams of up to 10 GB/s, posing significant challenges in reducing computational complexity, achieving power efficiency, and maintaining wireless connectivity. Recent advancements in wearable US sensors have demonstrated potential for low-power, unobtrusive solutions but often fail to meet the accuracy and efficiency needed in clinical settings. This work presents an algorithm-centric proof of concept that reconstructs missing US channels through field-programmable gate array (FPGA)-accelerated deep learning, effectively doubling the imaging aperture while halving analog front-end requirements. We developed a lightweight U-Net convolutional neural network (L-UNET) with 222,609 parameters, specifically optimized for sparse-array RF data reconstruction. The network is deployed on a deep learning processing unit (DPU) using mixed quantization-aware training (Mixed-QAT) that selectively applies 8-bit integer precision while preserving two critical layers at 16-bit floating point (FP), achieving a mean-squared error (MSE) of 1.48 $\times$ 10 compared to 1.22 $\times$ 10 for 32-bit FP. The FPGA implementation leverages a single-core accelerator, executing inference in 221 ms/frame with deterministic latency suitable for real-time reconstruction. By processing only odd-indexed physical channels and inferring even-indexed channels through the convolutional neural network (CNN), our approach maintains B-mode image quality (peak signal-to-noise ratio (PSNR) > 18 dB and structural similarity index (SSIM) > 0.5) while reducing data acquisition complexity. The system achieves 0.918-W average power consumption in a 32-channel configuration, demonstrating that CNN-based sparse-array reconstruction on embedded FPGAs offers a viable path toward fully integrated US monitoring systems.
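The odd/even channel scheme described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the `(n_samples, n_channels)` array layout, and the neighbour-averaging step used in place of the L-UNET are all assumptions made for illustration. The sketch only shows how acquired and inferred channels interleave and how MSE and PSNR could be computed against a fully sampled reference.

```python
import numpy as np


def split_channels(rf):
    """Split RF data of shape (n_samples, n_channels) into acquired and missing parts.

    Channels are counted from 1, so odd-indexed channels (1, 3, 5, ...)
    are array columns 0, 2, 4, ... (an assumed convention).
    """
    acquired = rf[:, 0::2]  # odd-indexed physical channels
    missing = rf[:, 1::2]   # even-indexed channels to be inferred
    return acquired, missing


def interpolate_missing(acquired):
    """Hypothetical stand-in for the CNN: average the two neighbouring acquired channels.

    The last missing channel has only one acquired neighbour, so it copies that one.
    """
    right = np.concatenate([acquired[:, 1:], acquired[:, -1:]], axis=1)
    return 0.5 * (acquired + right)


def interleave(acquired, estimated):
    """Rebuild the full aperture from acquired and estimated channels."""
    n_samples, n_half = acquired.shape
    full = np.empty((n_samples, 2 * n_half), dtype=acquired.dtype)
    full[:, 0::2] = acquired
    full[:, 1::2] = estimated
    return full


def mse(reference, estimate):
    """Mean-squared error between two arrays of equal shape."""
    return float(np.mean((reference - estimate) ** 2))


def psnr_db(reference, estimate):
    """PSNR in dB, using the peak absolute value of the reference as the peak."""
    err = mse(reference, estimate)
    if err == 0.0:
        return float("inf")
    peak = float(np.max(np.abs(reference)))
    return 10.0 * np.log10(peak ** 2 / err)
```

Given a fully sampled reference array `rf`, calling `split_channels(rf)`, passing the acquired half to `interpolate_missing`, and recombining with `interleave` yields a reconstruction that can be scored with `mse` and `psnr_db`. In the paper's system the interpolation step is replaced by the quantized network running on the DPU.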
About the journal:
IEEE Transactions on Ultrasonics, Ferroelectrics and Frequency Control includes the theory, technology, materials, and applications relating to: (1) the generation, transmission, and detection of ultrasonic waves and related phenomena; (2) medical ultrasound, including hyperthermia, bioeffects, tissue characterization and imaging; (3) ferroelectric, piezoelectric, and piezomagnetic materials, including crystals, polycrystalline solids, films, polymers, and composites; (4) frequency control, timing and time distribution, including crystal oscillators and other means of classical frequency control, and atomic, molecular and laser frequency control standards. Areas of interest range from fundamental studies to the design and/or applications of devices and systems.