LGF-Net: A multi-scale feature fusion network for thyroid nodule ultrasound image classification

IF 2.2 | Zone 4 (Medicine) | JCR Q3: RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING

Journal of Applied Clinical Medical Physics, Vol. 26, No. 8 | Pub Date: 2025-07-25 | DOI: 10.1002/acm2.70149

Yao Xiao, Yan Zhuang, Wenwu Ling, Shouyu Jiang, Ke Chen, Guoliang Liao, Yuhua Xie, Yao Hou, Lin Han, Zhan Hua, Yan Luo, Jiangli Lin

Citations: 0

Abstract

Background

Thyroid cancer is one of the most common cancers in clinical practice, and accurate classification of thyroid nodule ultrasound images is crucial for computer-aided diagnosis. Models based on either a convolutional neural network (CNN) or a Transformer alone struggle to integrate local and global features, which limits recognition accuracy.

Purpose

Our method is designed to capture, simultaneously, the key local fine-grained features and the global spatial features essential for thyroid nodule diagnosis. It adapts to the irregular morphology of thyroid nodules and dynamically focuses on their key pixel-level regions, thereby improving the model's recognition accuracy and generalization ability.

Methods

The proposed multi-scale fusion model, the local and global feature fusion network (LGF-Net), is inspired by the dual-path mechanism of human visual diagnosis and consists of two branches: a CNN branch and a Transformer branch. The CNN branch integrates the wavelet transform and deformable convolution module (WTDCM) to enhance the model's ability to capture discriminative local features and recognize fine-grained textures. The Transformer branch introduces the aggregated attention (AA) mechanism, which mimics biological vision, to capture spatial features effectively. The adaptive feature fusion module (FFM) then integrates the multi-scale features of thyroid nodules, further improving classification performance.
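The abstract does not describe the internals of the FFM. As a rough illustration only of what "adaptive" fusion of a local and a global feature vector can look like, the sketch below blends the two branches with softmax gate weights. The function names, the gating rule, and the toy feature values are all hypothetical and should not be read as the authors' design.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fuse(local_feat, global_feat, gate_logits):
    """Blend two equal-length feature vectors with softmax gate weights.

    gate_logits holds two scores (local, global). In a trained network these
    would be predicted from the features; here they are passed in by hand
    (an assumption made to keep the sketch self-contained).
    """
    if len(local_feat) != len(global_feat):
        raise ValueError("feature vectors must have the same length")
    w_local, w_global = softmax(gate_logits)
    return [w_local * l + w_global * g for l, g in zip(local_feat, global_feat)]


# Hypothetical features: CNN branch (local) and Transformer branch (global).
local_feat = [1.0, 0.0, 2.0]
global_feat = [0.0, 1.0, 2.0]

# Equal logits give equal weights, so the result is the element-wise average.
fused = adaptive_fuse(local_feat, global_feat, gate_logits=[0.0, 0.0])
```

Raising the first logit relative to the second shifts the fused vector toward the local (CNN) features, which is the sense in which such a gate is "adaptive".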

Results

We evaluated our model on the public thyroid nodule classification dataset (TNCD) and a private clinical dataset using accuracy, recall, precision, and F1-score. On TNCD, the model achieved 81.50%, 79.51%, 79.92%, and 79.70%, respectively. On the private dataset, it reached 91.24%, 88.90%, 90.73%, and 89.73%, respectively. The model outperformed state-of-the-art methods on both datasets. We also conducted ablation studies and visualization analysis to validate the model's components and interpretability.
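For readers unfamiliar with the four reported metrics, the following minimal sketch computes accuracy and macro-averaged recall, precision, and F1-score from a confusion matrix. The abstract does not state which averaging scheme was used, so macro averaging is an assumption here, and the confusion matrix is made-up illustrative data, not the paper's results.

```python
def classification_metrics(confusion):
    """Compute accuracy and macro-averaged recall, precision, and F1.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    accuracy = sum(confusion[k][k] for k in range(n)) / total

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)

    return {
        "accuracy": accuracy,
        "recall": sum(recalls) / n,
        "precision": sum(precisions) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical two-class (benign vs. malignant) confusion matrix.
confusion = [
    [80, 20],  # true class 0: 80 correct, 20 misclassified
    [10, 90],  # true class 1: 10 misclassified, 90 correct
]
metrics = classification_metrics(confusion)
```

Under macro averaging, the F1-score is the mean of the per-class F1 values, so it need not equal the harmonic mean of the averaged precision and recall.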

Conclusions

The experiments demonstrate that our method improves the accuracy of thyroid nodule recognition, shows strong generalization ability and potential for clinical application, and provides interpretability for clinicians' diagnoses.

Source journal

Journal of Applied Clinical Medical Physics (Medicine: Nuclear Medicine)

CiteScore: 3.60

Self-citation rate: 19.00%

Articles published: 331

Review time: 3 months

About the journal: Journal of Applied Clinical Medical Physics is an international Open Access publication dedicated to clinical medical physics. JACMP welcomes original contributions dealing with all aspects of medical physics from scientists working in clinical medical physics around the world. JACMP accepts only online submissions. JACMP will publish:
- Original Contributions: Peer-reviewed investigations that represent new and significant contributions to the field. Recommended word count: up to 7500.
- Review Articles: Reviews of major areas or sub-areas in the field of clinical medical physics. These articles may be of any length and are peer reviewed.
- Technical Notes: These should be no longer than 3000 words, including key references.
- Letters to the Editor: Comments on papers published in JACMP or on any other matters of interest to clinical medical physics. These should not exceed 1250 words (including references), and their publication rests solely on the decision of the editor, who occasionally consults experts on the merit of the contents.
- Book Reviews: The editorial office solicits Book Reviews.
- Announcements of Forthcoming Meetings: The Editor may provide notice of forthcoming meetings, course offerings, and other events relevant to clinical medical physics.
- Parallel Opposed Editorial: We welcome topics relevant to clinical practice and the medical physics profession. The contents can be a controversial debate or opposing aspects of an issue. One author argues for the position and the other against. Each side of the debate contains an opening statement of up to 800 words, followed by a rebuttal of up to 500 words. Readers interested in participating in this series should contact the moderator with a proposed title and a short description of the topic.