Breast tumor diagnosis via multimodal deep learning using ultrasound B-mode and Nakagami images.

IF 1.7 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING

Journal of Medical Imaging. Pub Date: 2025-11-01; Epub Date: 2025-05-14; DOI: 10.1117/1.JMI.12.S2.S22009

Sabiq Muhtadi, Caterina M Gallippi

Journal of Medical Imaging, volume 12, Suppl 2, article S22009. Publication type: Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12077846/pdf/

Citations: 0

Abstract

Breast tumor diagnosis via multimodal deep learning using ultrasound B-mode and Nakagami images.

Purpose: We propose and evaluate multimodal deep learning (DL) approaches that combine ultrasound (US) B-mode and Nakagami parametric images for breast tumor classification. It is hypothesized that integrating tissue brightness information from B-mode images with scattering properties from Nakagami images will enhance diagnostic performance compared with single-input approaches.
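The abstract does not say how the Nakagami parametric images were computed. As a rough illustration only, the sketch below uses the standard moment-based estimator of the Nakagami shape parameter, $m = E[R^2]^2 / \mathrm{Var}(R^2)$, with scale $\Omega = E[R^2]$, applied to envelope amplitudes $R$ inside a sliding window. The function names, the square window, and the window size are assumptions for illustration, not the authors' pipeline.

```python
from statistics import fmean


def nakagami_params(envelope):
    """Moment-based estimate of the Nakagami shape m and scale omega.

    envelope: iterable of non-negative envelope amplitudes R.
    m = E[R^2]^2 / Var(R^2), omega = E[R^2].
    """
    power = [r * r for r in envelope]
    omega = fmean(power)
    variance = fmean([(p - omega) ** 2 for p in power])
    if variance == 0.0:
        raise ValueError("constant envelope: m is undefined (zero variance of R^2)")
    return omega * omega / variance, omega


def nakagami_map(envelope_2d, win=3):
    """Estimate m in every win x win window of a 2D envelope (valid region only).

    The window shape and size are hypothetical choices for this sketch.
    """
    rows, cols = len(envelope_2d), len(envelope_2d[0])
    out = []
    for i in range(rows - win + 1):
        row = []
        for j in range(cols - win + 1):
            patch = [envelope_2d[i + di][j + dj]
                     for di in range(win) for dj in range(win)]
            row.append(nakagami_params(patch)[0])
        out.append(row)
    return out
```

For a Rayleigh-distributed envelope the estimator gives $m \approx 1$; values below and above 1 correspond to pre-Rayleigh and post-Rayleigh scattering statistics, which is the kind of scattering information the Nakagami image contributes alongside B-mode brightness.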

Approach: An EfficientNetV2B0 network was used to develop multimodal DL frameworks that took as input (i) numerical two-dimensional (2D) maps or (ii) rendered red-green-blue (RGB) representations of both B-mode and Nakagami data. The diagnostic performance of these frameworks was compared with single-input counterparts using 831 US acquisitions from 264 patients. In addition, gradient-weighted class activation mapping was applied to evaluate diagnostically relevant information utilized by the different networks.
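The two input types named above, numerical 2D maps and rendered RGB representations, can be contrasted with a small sketch. It assumes min-max normalization for the numeric input and a simple blue-to-red ramp for the rendered input. Both choices and both function names are hypothetical; the abstract does not specify the normalization or the colormap used.

```python
def normalize_map(values_2d):
    """Min-max scale a 2D numeric map to [0, 1]; a constant map becomes all zeros."""
    flat = [v for row in values_2d for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in values_2d]
    return [[(v - lo) / span for v in row] for row in values_2d]


def render_rgb(values_2d):
    """Render a 2D map as 8-bit (R, G, B) tuples using an assumed blue-to-red ramp."""
    rgb = []
    for row in normalize_map(values_2d):
        rgb.append([(round(255 * t), 0, round(255 * (1 - t))) for t in row])
    return rgb
```

The numeric map keeps the parameter values on a continuous scale, while the rendered version quantizes them to 8 bits per channel and spreads them across color channels, which is closer to the natural-image inputs that pretrained convolutional networks usually expect.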

Results: The multimodal architectures demonstrated significantly higher area under the receiver operating characteristic curve (AUC) values ($p < 0.05$) than their monomodal counterparts, achieving an average improvement of 10.75%. In addition, the multimodal networks incorporated, on average, 15.70% more diagnostically relevant tissue information. Among the multimodal models, those using RGB representations as input outperformed those that utilized 2D numerical data maps ($p < 0.05$). The top-performing multimodal architecture achieved a mean AUC of 0.896 [95% confidence interval (CI): 0.813 to 0.959] when performance was assessed at the image level and 0.848 (95% CI: 0.755 to 0.903) when assessed at the lesion level.
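Image-level and lesion-level AUC can differ because several acquisitions may come from the same lesion. The sketch below computes AUC from the Mann-Whitney statistic and, as an assumed aggregation rule, averages the image scores of each lesion before scoring at the lesion level. The abstract does not state how the authors aggregated images per lesion, and the lesion identifiers and scores here are made up for illustration.

```python
from collections import defaultdict


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def lesion_level(lesion_ids, labels, scores):
    """Average image scores per lesion (assumed rule); returns per-lesion labels and scores."""
    grouped = defaultdict(list)
    label_of = {}
    for lid, y, s in zip(lesion_ids, labels, scores):
        grouped[lid].append(s)
        label_of[lid] = y  # all images of one lesion share its label
    ids = sorted(grouped)
    return ([label_of[i] for i in ids],
            [sum(grouped[i]) / len(grouped[i]) for i in ids])


# Toy data: five images from three lesions (1 = malignant, 0 = benign).
lesion_ids = ["a", "a", "b", "b", "c"]
labels = [1, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.5, 0.3, 0.8]

image_auc = auc(labels, scores)
lesion_auc = auc(*lesion_level(lesion_ids, labels, scores))
```

In this toy example the two levels give different values, since averaging changes which positive-negative pairs are compared. The direction of the difference depends on the data.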

Conclusions: Incorporating B-mode and Nakagami information together in a multimodal DL framework improved classification outcomes and increased the amount of diagnostically relevant information accessed by networks, highlighting the potential for automating and standardizing US breast cancer diagnostics to enhance clinical outcomes.

Source journal

Journal of Medical Imaging (RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING)

CiteScore

4.10

Self-citation rate

4.20%

About the journal: JMI covers fundamental and translational research, as well as applications, focused on medical imaging, which continue to yield physical and biomedical advancements in the early detection, diagnostics, and therapy of disease as well as in the understanding of normal. The scope of JMI includes: Imaging physics, Tomographic reconstruction algorithms (such as those in CT and MRI), Image processing and deep learning, Computer-aided diagnosis and quantitative image analysis, Visualization and modeling, Picture archiving and communications systems (PACS), Image perception and observer performance, Technology assessment, Ultrasonic imaging, Image-guided procedures, Digital pathology, Biomedical applications of biomedical imaging. JMI allows for the peer-reviewed communication and archiving of scientific developments, translational and clinical applications, reviews, and recommendations for the field.