Baohua Wang, Yaqian Wang, Yanhua Chu, Ke Zhang, Lei Liu, Kexin Zhang, Bowen Zhu, Dong Wang, Tianan Jiang
AbVLM-Q: intelligent quality assessment for abdominal ultrasound standard planes via vision-language modeling
BMC Medical Imaging 25(1):344. Published 2025-08-23. DOI: 10.1186/s12880-025-01885-w (https://doi.org/10.1186/s12880-025-01885-w). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12374393/pdf/
Abstract
Background: Abdominal ultrasound is non-invasive and efficient, yet acquiring standard planes remains challenging due to operator dependency and procedural complexity. We propose AbVLM-Q, a vision-language framework for automated quality assessment of abdominal ultrasound standard planes.
Methods: In this study, we assembled a multi-center dataset comprising 7,766 abdominal ultrasound scans, which were randomly divided into training (70%), validation (15%), and testing (15%) subsets. The proposed method, AbVLM-Q, was developed using a three-step approach: (1) hierarchical prompting that incorporates spatially aware querying and sequential reasoning; (2) a quantifiable scoring mechanism based on multi-level clinical penalty criteria; and (3) LoRA (Low-Rank Adaptation)-based fine-tuning of a pretrained vision-language model. Performance was evaluated using mean recall, precision, label accuracy, subset accuracy, and confusion matrix analysis.
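As a rough illustration of the 70%/15%/15% random division described in the Methods, the sketch below shuffles item indices and cuts them into three lists. The function name `split_indices`, the seed, and the rounding rule are assumptions made for this example; the abstract does not state how the authors performed the split or whether it was done per scan, per patient, or per center.

```python
import random


def split_indices(n_items, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle indices 0..n_items-1 and cut them into train/val/test lists.

    Hypothetical helper: the rounding rule and the seed are illustrative
    choices, not details taken from the paper.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seeded, reproducible shuffle

    n_train = round(train_frac * n_items)
    n_val = round(val_frac * n_items)

    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder goes to the test subset
    return train, val, test


train, val, test = split_indices(7766)
print(len(train), len(val), len(test))  # 5436 1165 1165 under this rounding rule
```

The subset sizes printed here follow only from the rounding used in this sketch and should not be read as the counts reported in the paper. In a multi-center setting, splitting by patient or by center rather than by image is a common way to limit leakage between subsets, but the abstract does not say which was used.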
Results: The system detected key structures with 88.90% mean recall and 98.10% precision, giving higher precision than Faster R-CNN at comparable recall (89.77% recall, 88.64% precision at a 0.5 confidence threshold). Plane classification yielded 98.96% label accuracy and 96.28% subset accuracy, surpassing the best CNN (97.84% and 94.29%, respectively; P < 0.05). Image scoring accuracy for the clinically critical "Excellent" grade (scores 8-10) reached 85.11% with the best-performing backbone. Confusion matrix analysis confirmed consistent performance across different backbones, with discrepancies primarily observed at grade boundaries.
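The gap between label accuracy and subset accuracy can be illustrated with a small sketch. It assumes the usual multi-label definitions: label accuracy counts every individual label decision, while subset accuracy counts a sample only when all of its labels match. The abstract does not spell out its exact formulas, and the toy label matrices below are invented for illustration.

```python
def label_accuracy(y_true, y_pred):
    """Fraction of individual label decisions that are correct."""
    total = sum(len(row) for row in y_true)
    correct = sum(
        t == p
        for row_true, row_pred in zip(y_true, y_pred)
        for t, p in zip(row_true, row_pred)
    )
    return correct / total


def subset_accuracy(y_true, y_pred):
    """Fraction of samples whose full label vector matches exactly."""
    exact = sum(
        list(row_true) == list(row_pred)
        for row_true, row_pred in zip(y_true, y_pred)
    )
    return exact / len(y_true)


# Toy data (invented): 4 images, 3 candidate plane labels each.
y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [0, 1, 1], [1, 0, 0]]

print(label_accuracy(y_true, y_pred))   # 11 of 12 label decisions correct
print(subset_accuracy(y_true, y_pred))  # 3 of 4 samples match exactly: 0.75
```

Under these definitions a single wrong label lowers label accuracy only slightly but makes the whole sample count as a miss for subset accuracy, so subset accuracy can never exceed label accuracy. That ordering matches the reported figures (98.96% versus 96.28%).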
Conclusions: AbVLM-Q provides a novel method for automated ultrasound quality assessment, functioning as both an evaluation tool and a training platform for standardized scanning. It bridges AI-driven imaging analysis with clinical workflows, enhancing quality control in ultrasound diagnostics.
About the journal:
BMC Medical Imaging is an open access journal publishing original peer-reviewed research articles in the development, evaluation, and use of imaging techniques and image processing tools to diagnose and manage disease.