Main challenges on the curation of large scale datasets for pancreas segmentation using deep learning in multi-phase CT scans: Focus on cardinality, manual refinement, and annotation quality

IF 5.4 2区医学 Q1 ENGINEERING, BIOMEDICAL

Computerized Medical Imaging and Graphics Pub Date : 2024-09-13 DOI:10.1016/j.compmedimag.2024.102434

Matteo Cavicchioli , Andrea Moglia , Ludovica Pierelli , Giacomo Pugliese , Pietro Cerveri

{"title":"Main challenges on the curation of large scale datasets for pancreas segmentation using deep learning in multi-phase CT scans: Focus on cardinality, manual refinement, and annotation quality","authors":"Matteo Cavicchioli , Andrea Moglia , Ludovica Pierelli , Giacomo Pugliese , Pietro Cerveri","doi":"10.1016/j.compmedimag.2024.102434","DOIUrl":null,"url":null,"abstract":"<div><p>Accurate segmentation of the pancreas in computed tomography (CT) holds paramount importance in diagnostics, surgical planning, and interventions. Recent studies have proposed supervised deep-learning models for segmentation, but their efficacy relies on the quality and quantity of the training data. Most of such works employed small-scale public datasets, without proving the efficacy of generalization to external datasets. This study explored the optimization of pancreas segmentation accuracy by pinpointing the ideal dataset size, understanding resource implications, examining manual refinement impact, and assessing the influence of anatomical subregions. We present the AIMS-1300 dataset encompassing 1,300 CT scans. Its manual annotation by medical experts required 938 h. A 2.5D UNet was implemented to assess the impact of training sample size on segmentation accuracy by partitioning the original AIMS-1300 dataset into 11 smaller subsets of progressively increasing numerosity. The findings revealed that training sets exceeding 440 CTs did not lead to better segmentation performance. In contrast, nnU-Net and UNet with Attention Gate reached a plateau for 585 CTs. Tests on generalization on the publicly available AMOS-CT dataset confirmed this outcome. As the size of the partition of the AIMS-1300 training set increases, the number of error slices decreases, reaching a minimum with 730 and 440 CTs, for AIMS-1300 and AMOS-CT datasets, respectively. Segmentation metrics on the AIMS-1300 and AMOS-CT datasets improved more on the head than the body and tail of the pancreas as the dataset size increased. By carefully considering the task and the characteristics of the available data, researchers can develop deep learning models without sacrificing performance even with limited data. This could accelerate developing and deploying artificial intelligence tools for pancreas surgery and other surgical data science applications.</p></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"117 ","pages":"Article 102434"},"PeriodicalIF":5.4000,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0895611124001113/pdfft?md5=45c57efca48ce73af49c585837730bf7&pid=1-s2.0-S0895611124001113-main.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computerized Medical Imaging and Graphics","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0895611124001113","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, BIOMEDICAL","Score":null,"Total":0}

引用次数: 0

Abstract

Accurate segmentation of the pancreas in computed tomography (CT) holds paramount importance in diagnostics, surgical planning, and interventions. Recent studies have proposed supervised deep-learning models for segmentation, but their efficacy relies on the quality and quantity of the training data. Most of such works employed small-scale public datasets, without proving the efficacy of generalization to external datasets. This study explored the optimization of pancreas segmentation accuracy by pinpointing the ideal dataset size, understanding resource implications, examining manual refinement impact, and assessing the influence of anatomical subregions. We present the AIMS-1300 dataset encompassing 1,300 CT scans. Its manual annotation by medical experts required 938 h. A 2.5D UNet was implemented to assess the impact of training sample size on segmentation accuracy by partitioning the original AIMS-1300 dataset into 11 smaller subsets of progressively increasing numerosity. The findings revealed that training sets exceeding 440 CTs did not lead to better segmentation performance. In contrast, nnU-Net and UNet with Attention Gate reached a plateau for 585 CTs. Tests on generalization on the publicly available AMOS-CT dataset confirmed this outcome. As the size of the partition of the AIMS-1300 training set increases, the number of error slices decreases, reaching a minimum with 730 and 440 CTs, for AIMS-1300 and AMOS-CT datasets, respectively. Segmentation metrics on the AIMS-1300 and AMOS-CT datasets improved more on the head than the body and tail of the pancreas as the dataset size increased. By carefully considering the task and the characteristics of the available data, researchers can develop deep learning models without sacrificing performance even with limited data. This could accelerate developing and deploying artificial intelligence tools for pancreas surgery and other surgical data science applications.

查看原文本刊更多论文

利用深度学习在多期 CT 扫描中进行胰腺分割的大型数据集整理工作面临的主要挑战：重点关注万有引力、人工完善和注释质量

计算机断层扫描（CT）中胰腺的精确分割在诊断、手术规划和干预中至关重要。最近的研究提出了用于分割的有监督深度学习模型，但其有效性取决于训练数据的质量和数量。大多数此类研究都采用了小规模的公共数据集，但并未证明其对外部数据集的泛化效果。本研究通过确定理想的数据集大小、了解资源影响、检查人工细化的影响以及评估解剖亚区的影响，来探索胰腺分割准确性的优化。我们展示的 AIMS-1300 数据集包含 1,300 张 CT 扫描图像。通过将原始 AIMS-1300 数据集划分为 11 个数量逐渐增加的较小子集，我们采用了 2.5D UNet 来评估训练样本数量对分割准确性的影响。结果显示，训练集超过 440 个 CT 并不能带来更好的分割性能。相比之下，nnU-Net 和 UNet with Attention Gate 在 585 CTs 时达到了最高点。在公开的 AMOS-CT 数据集上进行的泛化测试证实了这一结果。随着 AIMS-1300 训练集分区大小的增加，错误片段的数量也在减少，AIMS-1300 和 AMOS-CT 数据集分别在 730 和 440 CT 时达到最小值。随着数据集大小的增加，AIMS-1300 和 AMOS-CT 数据集上胰腺头部的分割指标比胰腺体部和尾部的改进更大。通过仔细考虑任务和可用数据的特征，研究人员可以开发出深度学习模型，即使数据有限，也不会牺牲性能。这将加速开发和部署用于胰腺手术和其他手术数据科学应用的人工智能工具。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Computerized Medical Imaging and Graphics 医学-核医学

CiteScore

10.70

自引率

3.50%

发文量

审稿时长

26 days

期刊介绍： The purpose of the journal Computerized Medical Imaging and Graphics is to act as a source for the exchange of research results concerning algorithmic advances, development, and application of digital imaging in disease detection, diagnosis, intervention, prevention, precision medicine, and population health. Included in the journal will be articles on novel computerized imaging or visualization techniques, including artificial intelligence and machine learning, augmented reality for surgical planning and guidance, big biomedical data visualization, computer-aided diagnosis, computerized-robotic surgery, image-guided therapy, imaging scanning and reconstruction, mobile and tele-imaging, radiomics, and imaging integration and modeling with other information relevant to digital health. The types of biomedical imaging include: magnetic resonance, computed tomography, ultrasound, nuclear medicine, X-ray, microwave, optical and multi-photon microscopy, video and sensory imaging, and the convergence of biomedical images with other non-imaging datasets.