AbVLM-Q: intelligent quality assessment for abdominal ultrasound standard planes via vision-language modeling.

Impact factor 3.2; Zone 3 (Medicine); JCR Q2, Radiology, Nuclear Medicine & Medical Imaging

BMC Medical Imaging. Pub Date: 2025-08-23. DOI: 10.1186/s12880-025-01885-w

Baohua Wang, Yaqian Wang, Yanhua Chu, Ke Zhang, Lei Liu, Kexin Zhang, Bowen Zhu, Dong Wang, Tianan Jiang

BMC Medical Imaging, vol. 25(1), 344. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12374393/pdf/

Citations: 0

Abstract

Background: Abdominal ultrasound is non-invasive and efficient, yet acquiring standard planes remains challenging due to operator dependency and procedural complexity. We propose AbVLM-Q, a vision-language framework for automated quality assessment of abdominal ultrasound standard planes.

Methods: In this study, we assembled a multi-center dataset comprising 7,766 abdominal ultrasound scans, which were randomly divided into training (70%), validation (15%), and testing (15%) subsets. The proposed method, AbVLM-Q, was developed using a three-step approach: (1) hierarchical prompting that incorporates spatially aware querying and sequential reasoning; (2) a quantifiable scoring mechanism based on multi-level clinical penalty criteria; and (3) LoRA (Low-Rank Adaptation)-based fine-tuning of a pretrained vision-language model. Performance was evaluated using mean recall, precision, label accuracy, subset accuracy, and confusion matrix analysis.
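The random 70/15/15 division described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_indices`, the fixed seed, and the rule of rounding the training and validation sizes down (with the remainder going to the test set) are all assumptions, so the exact subset sizes in the paper may differ by a few scans.

```python
import random


def split_indices(n_items, fractions=(0.70, 0.15, 0.15), seed=0):
    """Shuffle indices 0..n_items-1 and cut them into train/val/test lists.

    Hypothetical helper: train and validation sizes are rounded down,
    and the test subset receives whatever remains.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility

    n_train = int(n_items * fractions[0])
    n_val = int(n_items * fractions[1])

    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


# 7,766 is the dataset size reported in the abstract.
train, val, test = split_indices(7766)
print(len(train), len(val), len(test))
```

A purely random split like this one ignores which center or patient a scan came from; whether the authors stratified by center or patient is not stated in the abstract.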

Results: The system achieved key structure detection with 88.90% mean recall and 98.10% precision, that is, higher precision than Faster R-CNN with comparable recall (Faster R-CNN: 89.77% recall, 88.64% precision at a 0.5 confidence threshold). Plane classification yielded 98.96% label accuracy and 96.28% subset accuracy, surpassing the best CNN (97.84% and 94.29%, respectively; P < 0.05). Image scoring accuracy for the clinically critical "Excellent" grade (scores 8-10) reached 85.11% with the best-performing backbone. Confusion matrix analysis confirmed consistent performance across different backbones, with discrepancies primarily observed at grade boundaries.
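The gap between label accuracy and subset accuracy is easier to see with a small computation. The sketch below assumes the usual multi-label definitions, which the abstract does not spell out: label accuracy is the fraction of individual labels predicted correctly (one minus the Hamming loss), and subset accuracy is the fraction of images whose entire label vector matches exactly. The function names and the toy label vectors are illustrative only and are not data from the study.

```python
def label_accuracy(y_true, y_pred):
    """Fraction of individual labels that match (1 - Hamming loss)."""
    total = sum(len(row) for row in y_true)
    correct = sum(
        t == p
        for row_true, row_pred in zip(y_true, y_pred)
        for t, p in zip(row_true, row_pred)
    )
    return correct / total


def subset_accuracy(y_true, y_pred):
    """Fraction of samples whose full label vector matches exactly."""
    exact = sum(
        list(row_true) == list(row_pred)
        for row_true, row_pred in zip(y_true, y_pred)
    )
    return exact / len(y_true)


# Toy example: 4 images, 3 candidate labels each (made-up values).
y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [0, 1, 1], [1, 0, 0]]

print(label_accuracy(y_true, y_pred))   # 11 of 12 labels correct
print(subset_accuracy(y_true, y_pred))  # 3 of 4 images fully correct
```

Under these definitions a single wrong label costs one image its exact match but only one label out of many, which is why subset accuracy is always less than or equal to label accuracy.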

Conclusions: AbVLM-Q provides a novel method for automated ultrasound quality assessment, functioning as both an evaluation tool and a training platform for standardized scanning. It bridges AI-driven imaging analysis with clinical workflows, enhancing quality control in ultrasound diagnostics.


Source journal: BMC Medical Imaging (Radiology, Nuclear Medicine & Medical Imaging)

CiteScore: 4.60
Self-citation rate: 3.70%
Articles published: 198
Review time: 27 weeks

Journal description: BMC Medical Imaging is an open access journal publishing original peer-reviewed research articles in the development, evaluation, and use of imaging techniques and image processing tools to diagnose and manage disease.