Automated Microbubble Discrimination in Ultrasound Localization Microscopy by Vision Transformer

IF 3.7 | Zone 2 (Engineering Technology) | JCR Q1 (Acoustics)

IEEE transactions on ultrasonics, ferroelectrics, and frequency control Pub Date : 2025-03-15 DOI:10.1109/TUFFC.2025.3570496

Renxian Wang;Wei-Ning Lee

IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, vol. 72, no. 8, pp. 1134-1146. Source: https://ieeexplore.ieee.org/document/11005510/

Citations: 0

Abstract

Ultrasound localization microscopy (ULM) has revolutionized microvascular imaging by breaking the acoustic diffraction limit. However, different ULM workflows depend heavily on distinct prior knowledge, such as the impulse response and empirically selected parameters (e.g., the number of microbubbles (MBs) per frame, M), or, in deep learning (DL)-based studies, on consistency between the training and test datasets. We propose a general ULM pipeline that reduces these priors. Our approach uses a DL model that simultaneously distills MB signals and reduces speckle in every frame without estimating the impulse response or M. The method features an efficient channel-attention Vision Transformer (ViT) and a progressive learning strategy, which lets the model learn global information by training on progressively larger patch sizes. Ample synthetic data were generated with the k-Wave toolbox to simulate various MB patterns, overcoming the shortage of labeled data. The ViT output was further processed by a standard radial symmetry (RS) method for subpixel localization. Our method performed well on public datasets unseen by the model: one in silico dataset with ground truth (GT) and four in vivo datasets of mouse tumor, rat brain, rat brain bolus, and rat kidney. Our pipeline outperformed conventional ULM, achieving higher positive predictive values (termed precision in DL; 0.88–0.41 versus 0.83–0.16) and improved accuracy (root-mean-square errors (RMSEs): 0.25–$0.14~\lambda$ versus 0.31–$0.13~\lambda$) across signal-to-noise ratios (SNRs) from 60 to 10 dB. Our model detected more vessels in diverse in vivo datasets while achieving resolution comparable to that of the standard method. The proposed ViT-based model, seamlessly integrated with state-of-the-art downstream ULM steps, improved overall ULM performance with no priors.
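The progressive learning strategy mentioned in the abstract, training on progressively larger patch sizes, can be illustrated with a small scheduling sketch. The abstract gives no schedule, so the stage lengths and patch sizes below are hypothetical, as are the helper names `patch_size_at` and `random_crop`. The sketch covers only patch selection, not the authors' model or training loop.

```python
import random

# Hypothetical schedule (not from the paper): (iterations in stage, patch size in pixels).
SCHEDULE = [(1000, 32), (1000, 64), (1000, 128)]


def patch_size_at(iteration, schedule=SCHEDULE):
    """Return the patch size to use at a 0-based training iteration."""
    boundary = 0
    for n_iters, size in schedule:
        boundary += n_iters
        if iteration < boundary:
            return size
    # Past the end of the schedule: keep using the largest (last) patch size.
    return schedule[-1][1]


def random_crop(frame, size, rng):
    """Crop a size x size patch at a random position from a 2D list-of-lists frame."""
    rows, cols = len(frame), len(frame[0])
    if size > rows or size > cols:
        raise ValueError("patch size exceeds frame dimensions")
    top = rng.randrange(rows - size + 1)
    left = rng.randrange(cols - size + 1)
    return [row[left:left + size] for row in frame[top:top + size]]


if __name__ == "__main__":
    rng = random.Random(0)
    # Dummy 150 x 200 frame standing in for one beamformed ultrasound image.
    frame = [[0.0] * 200 for _ in range(150)]
    for it in (0, 1500, 2500):
        size = patch_size_at(it)
        patch = random_crop(frame, size, rng)
        print(it, size, len(patch), len(patch[0]))
```

Small patches early in training are cheap and expose local MB structure. Larger patches later give the attention layers wider spatial context, which is the motivation the abstract gives for the strategy.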

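The two figures of merit quoted in the abstract, positive predictive value (precision) and localization RMSE in wavelengths, can be computed as in the following minimal sketch. It assumes detections and ground-truth MB positions are 2D coordinates already expressed in units of $\lambda$, and it pairs them greedily by distance within a tolerance. The pairing rule, the tolerance value, and the function names are illustrative assumptions, not the evaluation protocol of the paper.

```python
import math


def match_detections(detections, ground_truth, tolerance):
    """Greedily pair detections with ground-truth points, closest pairs first.

    Each detection and each ground-truth point is used at most once, and a pair
    is allowed only if its distance is within `tolerance` (an assumed parameter).
    """
    candidates = []
    for i, (dx, dz) in enumerate(detections):
        for j, (gx, gz) in enumerate(ground_truth):
            dist = math.hypot(dx - gx, dz - gz)
            if dist <= tolerance:
                candidates.append((dist, i, j))
    candidates.sort()

    used_det, used_gt, pairs = set(), set(), []
    for dist, i, j in candidates:
        if i not in used_det and j not in used_gt:
            used_det.add(i)
            used_gt.add(j)
            pairs.append((i, j, dist))
    return pairs


def precision_and_rmse(detections, ground_truth, tolerance):
    """Return (precision, RMSE over matched pairs); RMSE is NaN with no matches."""
    pairs = match_detections(detections, ground_truth, tolerance)
    true_positives = len(pairs)
    precision = true_positives / len(detections) if detections else 0.0
    if true_positives == 0:
        return precision, float("nan")
    rmse = math.sqrt(sum(d * d for _, _, d in pairs) / true_positives)
    return precision, rmse


if __name__ == "__main__":
    # Toy positions in wavelengths: two detections near true MBs, one false positive.
    detections = [(0.0, 0.1), (1.0, 1.0), (5.0, 5.0)]
    ground_truth = [(0.0, 0.0), (1.0, 1.2)]
    print(precision_and_rmse(detections, ground_truth, tolerance=0.25))
```

In this toy case two of three detections are matched, so precision is 2/3, and the RMSE is taken over the two matched distances only. A stricter evaluation could use optimal (Hungarian) assignment instead of greedy pairing.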



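Source journal

IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control (Engineering Technology: Engineering, Electrical and Electronic)

CiteScore: 7.70

Self-citation rate: 16.70%

Articles published: 583

Review time: 4.5 months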

Journal description: IEEE Transactions on Ultrasonics, Ferroelectrics and Frequency Control includes the theory, technology, materials, and applications relating to: (1) the generation, transmission, and detection of ultrasonic waves and related phenomena; (2) medical ultrasound, including hyperthermia, bioeffects, tissue characterization and imaging; (3) ferroelectric, piezoelectric, and piezomagnetic materials, including crystals, polycrystalline solids, films, polymers, and composites; (4) frequency control, timing and time distribution, including crystal oscillators and other means of classical frequency control, and atomic, molecular and laser frequency control standards. Areas of interest range from fundamental studies to the design and/or applications of devices and systems.