MedLesSynth-LD: Lesion synthesis using physics-based noise models for robust lesion segmentation in low-data medical imaging regimes
Authors: Ramanujam Narayanan, Vaanathi Sundaresan
Journal: Pattern Recognition Letters, volume 188, pages 155-163
Publication date: 2025-02-01
DOI: 10.1016/j.patrec.2024.12.011
Article page: https://www.sciencedirect.com/science/article/pii/S0167865524003672
Journal metrics: impact factor 3.9; JCR Q2 (Computer Science, Artificial Intelligence)
Citation count: 0
Abstract
Training models for robust lesion segmentation in medical imaging relies on the availability of sufficiently large pathological datasets and high-quality manual annotations. Hence, training such models is challenging in low-data regimes, even for localised lesions with defined boundaries, because variations in contrast, texture and size are poorly represented. In this work, we proposed a lesion simulation method, MedLesSynth-LD, to overcome the lack of diversity in localised lesion characteristics for training robust segmentation models. In MedLesSynth-LD, we used noise models based on the physics involved in the acquisition of each modality to generate sufficiently realistic lesion textures by perturbing healthy tissues. We then localised these perturbations within masks defined by composites of ellipsoids (thus forming random shapes) and blended them with the input image with varying contrast. The lesion simulation step does not require training and can be tailored to generate defined, localised lesions that introduce sufficient variability (in size, shape, texture and contrast) into the training data pool. We evaluated the performance of a downstream lesion segmentation task using simulated lesions for multiple publicly available datasets across imaging modalities and organs: brain MRI for tumour and white matter hyperintensity segmentation, liver CT for tumour segmentation, breast ultrasound for tumour segmentation, and retinal fundus imaging for exudate segmentation. Using only 75% of labelled real-world data, the proposed method significantly improved lesion segmentation compared to real data-based fully supervised training, with a 16% mean increase in the Dice score (DSC) and a 33% mean decrease in the normalised 95th percentile of the Hausdorff distance (HD95 (norm)). The proposed method also performed better than state-of-the-art lesion segmentation methods in low-data regimes, with a 10% higher mean DSC and a 19% mean decrease in HD95 (norm).
The source code is available at https://github.com/Ramanujam-N/MedLesSynth-LD [commit SHA cc2b15b].
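The three steps the abstract describes (perturb healthy tissue with a noise model, confine the perturbation to a composite-of-ellipsoids mask, blend with adjustable contrast) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, parameter values, and the choice of Rician-style noise (a common model for MRI magnitude images) are all assumptions made for illustration, since the abstract does not name the specific noise models; the linked repository is the reference for the actual method.

```python
import numpy as np


def ellipsoid_composite_mask(shape, n_ellipsoids=3, max_radius=6, rng=None):
    """Boolean mask formed by the union of random axis-aligned ellipsoids.

    Hypothetical helper: works for 2-D or 3-D shapes whose sides exceed
    2 * max_radius.
    """
    rng = np.random.default_rng() if rng is None else rng
    grids = np.meshgrid(*[np.arange(s) for s in shape], indexing="ij")
    # One anchor point; each ellipsoid is jittered around it so that the
    # union stays localised and forms an irregular shape.
    anchor = np.array([rng.integers(max_radius, s - max_radius) for s in shape])
    half = max_radius // 2
    mask = np.zeros(shape, dtype=bool)
    for _ in range(n_ellipsoids):
        centre = anchor + rng.integers(-half, half + 1, size=len(shape))
        radii = rng.uniform(2.0, max_radius, size=len(shape))
        dist = sum(((g - c) / r) ** 2 for g, c, r in zip(grids, centre, radii))
        mask |= dist <= 1.0
    return mask


def rician_perturb(image, sigma, rng):
    """Rician-style perturbation: magnitude of a signal with Gaussian noise
    added to its real and imaginary channels (assumed model for MRI)."""
    real = image + rng.normal(0.0, sigma, image.shape)
    imag = rng.normal(0.0, sigma, image.shape)
    return np.sqrt(real ** 2 + imag ** 2)


def synthesise_lesion(image, contrast=0.5, sigma=0.2, rng=None):
    """Return (augmented image, lesion mask). No training is involved."""
    rng = np.random.default_rng() if rng is None else rng
    mask = ellipsoid_composite_mask(image.shape, rng=rng)
    texture = rician_perturb(image, sigma, rng)
    # Blend the perturbed texture into the image inside the mask only;
    # `contrast` in [0, 1] controls how strongly the lesion stands out.
    blended = (1.0 - contrast) * image + contrast * texture
    return np.where(mask, blended, image), mask


# Usage on a synthetic "healthy" 2-D slice (placeholder data, not a real scan).
rng = np.random.default_rng(0)
healthy = rng.uniform(0.3, 0.7, size=(64, 64))
augmented, label = synthesise_lesion(healthy, contrast=0.6, rng=rng)
```

Because the mask is returned alongside the augmented image, each synthetic sample comes with a pixel-level label at no annotation cost, which is what lets simulated lesions supplement a small pool of labelled real data.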
About the journal:
Pattern Recognition Letters aims at rapid publication of concise articles of a broad interest in pattern recognition.
Subject areas include all the current fields of interest represented by the Technical Committees of the International Association of Pattern Recognition, and other developing themes involving learning and recognition.