Yao Xiao, Yan Zhuang, Wenwu Ling, Shouyu Jiang, Ke Chen, Guoliang Liao, Yuhua Xie, Yao Hou, Lin Han, Zhan Hua, Yan Luo, Jiangli Lin
Journal of Applied Clinical Medical Physics, 26(8). Published 2025-07-25. DOI: 10.1002/acm2.70149. Full text: https://onlinelibrary.wiley.com/doi/10.1002/acm2.70149
LGF-Net: A multi-scale feature fusion network for thyroid nodule ultrasound image classification
Background
Thyroid cancer is one of the most common cancers in clinical practice, and accurate classification of thyroid nodule ultrasound images is crucial for computer-aided diagnosis. Models based on a convolutional neural network (CNN) or a Transformer struggle to integrate local and global features, which limits recognition accuracy.
Purpose
Our method is designed to capture both the key local fine-grained features and the global spatial features essential for thyroid nodule diagnosis simultaneously. It adapts to the irregular morphology of thyroid nodules, dynamically focuses on the key pixel-level regions of thyroid nodules, and thereby improves the model's recognition accuracy and generalization ability.
Methods
The proposed multi-scale fusion model, the local and global feature fusion network (LGF-Net), inspired by the dual-path mechanism of human visual diagnosis, consists of two branches: a CNN branch and a Transformer branch. The CNN branch integrates the wavelet transform and deformable convolution module (WTDCM) to enhance the model's ability to capture discriminative local features and recognize fine-grained textures. By introducing the aggregated attention (AA) mechanism, which mimics biological vision, into the Transformer branch, spatial features are effectively captured. The adaptive feature fusion module (FFM) is then utilized to integrate the multi-scale features of thyroid nodules, further improving classification performance.
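To make the wavelet step in the CNN branch more concrete, the following is a minimal sketch of a single-level 2D wavelet decomposition. It is not the authors' WTDCM: the abstract does not say which wavelet is used or how it is combined with deformable convolution, so the Haar wavelet, the function name `haar_dwt2`, and the toy 4x4 input are all assumptions chosen for illustration. The sketch only shows how an image splits into one low-frequency approximation and three detail sub-bands that respond to edges and texture.

```python
def haar_dwt2(image):
    """Single-level 2D orthonormal Haar transform (illustrative assumption).

    `image` is a list of equal-length rows with even height and width.
    Returns four half-resolution sub-bands: (approx, col_diff, row_diff, diag).
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must be even")

    approx, col_diff, row_diff, diag = [], [], [], []
    for r in range(0, rows, 2):
        a_row, c_row, r_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            # The four pixels of one non-overlapping 2x2 block.
            tl, tr = image[r][c], image[r][c + 1]
            bl, br = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((tl + tr + bl + br) / 2)  # local average (low frequency)
            c_row.append((tl - tr + bl - br) / 2)  # change between left and right columns
            r_row.append((tl + tr - bl - br) / 2)  # change between top and bottom rows
            d_row.append((tl - tr - bl + br) / 2)  # diagonal change
        approx.append(a_row)
        col_diff.append(c_row)
        row_diff.append(r_row)
        diag.append(d_row)
    return approx, col_diff, row_diff, diag


if __name__ == "__main__":
    # Hypothetical 4x4 patch: a bright square on a dark background.
    patch = [
        [0, 0, 0, 0],
        [0, 8, 8, 0],
        [0, 8, 8, 0],
        [0, 0, 0, 0],
    ]
    for name, band in zip(("approx", "col_diff", "row_diff", "diag"), haar_dwt2(patch)):
        print(name, band)
```

On a perfectly uniform patch all three detail sub-bands are zero, which is why such sub-bands are a natural place to look for fine-grained texture and boundary cues.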
Results
We evaluated our model on the public thyroid nodule classification dataset (TNCD) and a private clinical dataset using accuracy, recall, precision, and F1-score. On TNCD, the model achieved 81.50%, 79.51%, 79.92%, and 79.70%, respectively. On the private dataset, it reached 91.24%, 88.90%, 90.73%, and 89.73%, respectively. These results outperformed state-of-the-art methods. We also conducted ablation studies and visualization analysis to validate the model's components and interpretability.
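As a reading aid for the four reported metrics, here is a small sketch of how accuracy, precision, recall, and F1-score can be computed from a confusion matrix. The abstract does not state the averaging scheme, so macro-averaging over classes is an assumption, and the function name `macro_metrics` and the example counts are hypothetical rather than taken from the paper.

```python
def macro_metrics(confusion):
    """Accuracy plus macro-averaged precision, recall, and F1.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j. Macro-averaging is an assumption;
    the abstract does not specify the scheme.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    accuracy = sum(confusion[k][k] for k in range(n)) / total

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Hypothetical two-class counts (rows: true class, columns: predicted class).
    example = [[40, 10],
               [5, 45]]
    for name, value in macro_metrics(example).items():
        print(f"{name}: {value:.4f}")
```

Under macro-averaging, F1 is the mean of the per-class F1 values, so it need not equal the harmonic mean of the averaged precision and recall. That may explain why the reported F1-scores differ slightly from what the reported precision and recall alone would give.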
Conclusions
The experiments demonstrate that our method improves the accuracy of thyroid nodule recognition, shows strong generalization ability and potential for clinical application, and provides interpretability to support clinicians' diagnoses.
About the journal:
Journal of Applied Clinical Medical Physics is an international Open Access publication dedicated to clinical medical physics. JACMP welcomes original contributions dealing with all aspects of medical physics from scientists working in clinical medical physics around the world. JACMP accepts only online submissions.
JACMP will publish:
-Original Contributions: Peer-reviewed investigations that represent new and significant contributions to the field. Recommended word count: up to 7500.
-Review Articles: Reviews of major areas or sub-areas in the field of clinical medical physics. These articles may be of any length and are peer reviewed.
-Technical Notes: These should be no longer than 3000 words, including key references.
-Letters to the Editor: Comments on papers published in JACMP or on any other matters of interest to clinical medical physics. These should not exceed 1250 words (including references), and their publication is solely at the discretion of the editor, who occasionally consults experts on the merit of the contents.
-Book Reviews: The editorial office solicits Book Reviews.
-Announcements of Forthcoming Meetings: The Editor may provide notice of forthcoming meetings, course offerings, and other events relevant to clinical medical physics.
-Parallel Opposed Editorial: We welcome topics relevant to clinical practice and the medical physics profession. The content can be a controversial debate or opposing aspects of an issue. One author argues for the position and the other against. Each side of the debate contains an opening statement of up to 800 words, followed by a rebuttal of up to 500 words. Readers interested in participating in this series should contact the moderator with a proposed title and a short description of the topic.