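Thyro-LMD: A Benchmark Dataset and Sample-Driven Data Loading, Attention, and Regularization for Long-Tailed Multi-Label Thyroid Ultrasound Diagnosis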

Jiansong Zhang, Shunlan Liu, Xiaoling Luo, Guorong Lyu, Linlin Shen
Journal: IEEE Transactions on Medical Imaging
Publication date: 2026-05-04
Publication type: Journal Article
DOI: 10.1109/TMI.2026.3690144 (https://doi.org/10.1109/TMI.2026.3690144)
Citations: 0

Abstract


Developing robust and effective computer-aided diagnostic (CAD) methods for thyroid ultrasound (TUS) remains a key challenge in medical imaging. Prior work has largely focused on binary or multi-class lesion classification, whereas real-world diagnosis follows standardized guidelines based on combinations of lexicon-level descriptors. These combinations naturally exhibit long-tailed distributions due to epidemiological patterns, limiting the robustness and generalizability of existing methods. Motivated by this, we introduce Thyro-LMD, the first long-tailed multi-label dataset for TUS. Using histopathology as the reference, Thyro-LMD provides retrospective, fine-grained annotations aligned with ACR TI-RADS lexicons and reveals a highly imbalanced label distribution. We benchmark representative methods, including end-to-end models, general-purpose multimodal large models (e.g., GPT-4o), and pretrained foundation models. While some methods show reasonable head-class performance, they struggle with body and tail classes. We therefore propose SynTUS-Net, a purpose-built baseline comprising collaborative modules addressing long-tailed multi-label challenges across data loading, feature encoding, and prediction regularization. SynTUS-Net achieves leading performance on Thyro-LMD, outperforming conventional SOTA models by 5.3 Micro-F1 and 11.83 Macro-F1, and exceeding GPT-4o by 42.76 on Tail-F1. Extensive ablation studies confirm the contribution of each module. We believe Thyro-LMD and SynTUS-Net establish a clinically grounded benchmark and a new paradigm for interpretable and generalizable AI in ultrasound. Code and data will be released here.
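
The abstract reports Micro-F1, Macro-F1, and Tail-F1 without defining them. The sketch below shows the standard micro and macro definitions for multi-label predictions on toy data, and why a model can look strong on Micro-F1 while failing on rare labels. The toy matrices, the function names, and the `max_support` cutoff used to pick "tail" labels are all illustrative assumptions; the abstract does not state how Thyro-LMD splits head, body, and tail classes or how Tail-F1 is computed.

```python
from typing import List, Optional, Sequence, Tuple

Matrix = Sequence[Sequence[int]]  # rows = samples, columns = binary labels


def label_counts(y_true: Matrix, y_pred: Matrix) -> List[Tuple[int, int, int]]:
    """Return (tp, fp, fn) for each label column."""
    counts = []
    for j in range(len(y_true[0])):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        counts.append((tp, fp, fn))
    return counts


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*tp / (2*tp + fp + fn); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_f1(counts: List[Tuple[int, int, int]]) -> float:
    """Pool tp/fp/fn over all labels, then compute one F1 (frequent labels dominate)."""
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    return f1(tp, fp, fn)


def macro_f1(counts: List[Tuple[int, int, int]],
             labels: Optional[Sequence[int]] = None) -> float:
    """Average per-label F1 with equal weight, optionally over a subset of labels."""
    idx = range(len(counts)) if labels is None else labels
    scores = [f1(*counts[j]) for j in idx]
    return sum(scores) / len(scores)


def tail_labels(y_true: Matrix, max_support: int) -> List[int]:
    """Hypothetical split: a label is 'tail' if it has at most max_support positives."""
    n_labels = len(y_true[0])
    return [j for j in range(n_labels)
            if sum(row[j] for row in y_true) <= max_support]


# Toy data (made up): label 0 is frequent, label 1 is rarer, label 2 is rarest.
Y_TRUE = [[1, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 0], [1, 0, 1], [1, 0, 0]]
Y_PRED = [[1, 0, 0], [1, 0, 0], [1, 1, 0], [1, 0, 0], [1, 0, 0], [1, 0, 0]]

counts = label_counts(Y_TRUE, Y_PRED)
tail = tail_labels(Y_TRUE, max_support=1)
print("micro-F1:", round(micro_f1(counts), 4))
print("macro-F1:", round(macro_f1(counts), 4))
print("tail labels:", tail, "tail macro-F1:", round(macro_f1(counts, tail), 4))
```

On this toy data the pooled Micro-F1 is 0.875 while Macro-F1 drops to about 0.556, because the missed rare labels count as much as the frequent one once every label is weighted equally. Restricting the macro average to the rarest label gives 0.0, which is the kind of gap a tail-focused metric is meant to expose.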
