How much data do you need? An analysis of pelvic multi-organ segmentation in a limited data context.
Febrio Lunardo, Laura Baker, Alex Tan, John Baines, Timothy Squire, Jason A Dowling, Mostafa Rahimi Azghadi, Ashley G Gillman
Physical and Engineering Sciences in Medicine (Journal Article), published 2025-03-11. Impact factor 2.4; JCR Q3, Engineering, Biomedical.
DOI: https://doi.org/10.1007/s13246-024-01514-w
Abstract
Training deep learning models generally requires large, costly datasets, which can limit their application to in-house segmentation tasks. This study investigates the effect of dataset size in pelvic multi-organ MR segmentation by evaluating nnU-Net, a well-known segmentation model, under conditions of limited domain and limited data availability. Twelve participants undergoing treatment on an Elekta Unity were recruited, yielding 58 MR images; 4 participants (12 images) were withheld for testing. The prostate, seminal vesicles (SV), bladder and rectum were contoured in each image by a radiation oncologist. Seven models were trained on progressively smaller subsets of the training dataset to simulate a limited-data setting. To investigate the efficacy of data augmentation, an otherwise identical set of models was trained without augmentation. Network performance was evaluated with the Dice Similarity Coefficient, mean surface distance, and 95% Hausdorff distance. When trained on the entire training dataset (46 images), the model achieved mean Dice coefficients of 0.903 (prostate), 0.851 (SV), 0.884 (rectum) and 0.967 (bladder). Segmentation performance remained stable for training subsets of more than 12 images from 4 participants, but dropped rapidly for smaller subsets. Data augmentation improved performance across all dataset sizes, and especially for very small datasets. This study demonstrates nnU-Net's proficiency in male pelvic multi-organ segmentation under a limited domain, a single scanner, and limited data constraints. Performance degradation was often modest until a threshold of 12 images was reached, below which performance dropped significantly. We conclude that nnU-Net's low data requirement can be advantageous for in-house cases with a consistent protocol and scarce data availability.
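The Dice Similarity Coefficient reported above measures the overlap between a predicted and a reference contour: for voxel sets A and B, Dice = 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of a per-organ computation, not the authors' evaluation code. The integer label convention in `ORGAN_LABELS`, the flattened-list input, and the choice to score an organ absent from both masks as 1.0 are all assumptions made for illustration.

```python
from typing import Dict, Sequence

# Hypothetical label convention, chosen only for this sketch (0 = background).
ORGAN_LABELS: Dict[str, int] = {
    "prostate": 1,
    "seminal_vesicles": 2,
    "bladder": 3,
    "rectum": 4,
}


def dice_coefficient(pred: Sequence[int], truth: Sequence[int], label: int) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for one label over flattened voxel labels."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must contain the same number of voxels")
    pred_count = sum(1 for p in pred if p == label)
    truth_count = sum(1 for t in truth if t == label)
    overlap = sum(1 for p, t in zip(pred, truth) if p == label and t == label)
    denominator = pred_count + truth_count
    if denominator == 0:
        # Assumed convention: organ absent from both masks counts as full agreement.
        return 1.0
    return 2.0 * overlap / denominator


def per_organ_dice(
    pred: Sequence[int],
    truth: Sequence[int],
    labels: Dict[str, int] = ORGAN_LABELS,
) -> Dict[str, float]:
    """Compute the Dice coefficient separately for each organ label."""
    return {name: dice_coefficient(pred, truth, value) for name, value in labels.items()}


if __name__ == "__main__":
    # Toy 8-voxel example; real label maps would be flattened 3D volumes.
    predicted = [0, 1, 1, 2, 3, 3, 4, 4]
    reference = [0, 1, 1, 2, 3, 0, 4, 0]
    for organ, score in per_organ_dice(predicted, reference).items():
        print(f"{organ}: {score:.3f}")
```

Averaging these per-image scores over a test set would give mean Dice values per organ of the kind quoted in the abstract. The surface-based metrics (mean surface distance and 95% Hausdorff distance) need voxel spacing and boundary extraction, so they are not covered by this sketch.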