In Silico Digital Breast Tomosynthesis Dataset for the Comparative Analysis of Deep Learning Models in Tumor Segmentation.

Cristina Alfaro Vergara, Nicolás Araya Caro, Domingo Mery Quiroz, Claudia Prieto Vasquez
{"title":"In Silico Digital Breast Tomosynthesis Dataset for the Comparative Analysis of Deep Learning Models in Tumor Segmentation.","authors":"Cristina Alfaro Vergara, Nicolás Araya Caro, Domingo Mery Quiroz, Claudia Prieto Vasquez","doi":"10.1007/s10278-025-01626-z","DOIUrl":null,"url":null,"abstract":"<p><p>The scarcity of publicly available digital breast tomosynthesis (DBT) datasets significantly limits the development of robust deep learning (DL) models for breast tumor segmentation. In this exploratory proof-of-concept study, we assess the viability of in silico-generated DBT data as a training source for tumor segmentation. A dataset of 230 two-dimensional (2D) regions of interest (ROIs) derived from FDA-cleared software and encompassing a spectrum of breast densities and tumor complexities, was used to train 13 DL models, including U-Net, FCN, DeepLabv3, and DeepLabv3 + architectures. Each model was trained either from scratch or fine-tuned using COCO-pretrained weights (ResNet50/101 backbones). Performance was evaluated using F1-score, intersection over union (IoU), precision, and recall. Among all models, U-Net trained from scratch and DeepLabv3 + fine-tuned with ResNet50 achieved the highest and most consistent results (F1-scores of 82.52% and 84.98%, and per-image IoUs of 78.49% and 83.77%, respectively). No statistically significant differences were found using the Wilcoxon signed-rank test and post hoc Bonferroni correction (α > 0.0042). To evaluate generalization across domains, the baseline U-Net model was retrained from scratch on a hybrid dataset combining in silico and real-world DBT ROIs, yielding promising results (F1-score of 79%). Despite the domain shift, these findings support the utility of in silico DBT as a complementary resource for training and benchmarking DL models, particularly in data-limited environments. This study provides foundational experimental evidence for integrating computationally generated in silico data into AI-based DBT tumor segmentation research workflows.</p>","PeriodicalId":516858,"journal":{"name":"Journal of imaging informatics in medicine","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2025-08-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of imaging informatics in medicine","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s10278-025-01626-z","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

The scarcity of publicly available digital breast tomosynthesis (DBT) datasets significantly limits the development of robust deep learning (DL) models for breast tumor segmentation. In this exploratory proof-of-concept study, we assess the viability of in silico-generated DBT data as a training source for tumor segmentation. A dataset of 230 two-dimensional (2D) regions of interest (ROIs) derived from FDA-cleared software and encompassing a spectrum of breast densities and tumor complexities, was used to train 13 DL models, including U-Net, FCN, DeepLabv3, and DeepLabv3 + architectures. Each model was trained either from scratch or fine-tuned using COCO-pretrained weights (ResNet50/101 backbones). Performance was evaluated using F1-score, intersection over union (IoU), precision, and recall. Among all models, U-Net trained from scratch and DeepLabv3 + fine-tuned with ResNet50 achieved the highest and most consistent results (F1-scores of 82.52% and 84.98%, and per-image IoUs of 78.49% and 83.77%, respectively). No statistically significant differences were found using the Wilcoxon signed-rank test and post hoc Bonferroni correction (α > 0.0042). To evaluate generalization across domains, the baseline U-Net model was retrained from scratch on a hybrid dataset combining in silico and real-world DBT ROIs, yielding promising results (F1-score of 79%). Despite the domain shift, these findings support the utility of in silico DBT as a complementary resource for training and benchmarking DL models, particularly in data-limited environments. This study provides foundational experimental evidence for integrating computationally generated in silico data into AI-based DBT tumor segmentation research workflows.

基于数字乳腺断层合成数据集的肿瘤分割深度学习模型对比分析。
公开可用的数字乳腺断层合成(DBT)数据集的稀缺性极大地限制了用于乳腺肿瘤分割的鲁棒深度学习(DL)模型的发展。在这个探索性的概念验证研究中,我们评估了在硅中生成的DBT数据作为肿瘤分割的训练源的可行性。由230个二维感兴趣区域(roi)组成的数据集来自fda批准的软件,包含乳腺密度和肿瘤复杂性的谱,用于训练13个深度学习模型,包括U-Net、FCN、DeepLabv3和DeepLabv3 +架构。每个模型要么从零开始训练,要么使用coco预训练的权重(ResNet50/101主干)进行微调。性能评估使用f1评分,交叉优于联合(IoU),精度和召回率。在所有模型中,从头开始训练的U-Net和使用ResNet50进行微调的DeepLabv3 +获得了最高和最一致的结果(f1得分分别为82.52%和84.98%,每张图像IoUs分别为78.49%和83.77%)。经Wilcoxon符号秩检验和事后Bonferroni校正(α > 0.0042),两组间差异无统计学意义。为了评估跨域的泛化,基线U-Net模型在结合了计算机和现实DBT roi的混合数据集上从零开始重新训练,产生了有希望的结果(f1得分为79%)。尽管领域发生了变化,但这些发现支持了DBT作为DL模型训练和基准测试的补充资源的实用性,特别是在数据有限的环境中。本研究为将计算机生成的数据集成到基于人工智能的DBT肿瘤分割研究工作流程中提供了基础实验证据。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信