Deep learning methods are promising for automating the segmentation of organs at risk (OARs) in radiotherapy. However, the lack of a geometric indicator of dosimetric accuracy remains a problem. This issue is particularly pronounced in radiotherapy treatments where only the proximity of structures to the radiotherapy target affects dose planning. In cervical cancer high dose-rate (HDR) brachytherapy, treatment planning is motivated by limiting the dose to the hottest 2 cubic centimeters (D2cm3) of the OARs. Similarly, the Ethos online adaptive radiotherapy system prioritizes only the structures closest to the target for adaptive plan generation.
We propose a novel geometrically focused deep learning training method and evaluation metric, using cervical brachytherapy as a case study. A distance-penalized (DP) loss function was developed to focus attention on the OAR regions near the target. We also introduced and evaluated a novel geometric metric, the weighted Dice similarity coefficient (wDSC), which correlates with the D2cm3 of the OARs.
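As a rough illustration of the distance-penalized idea, the sketch below scales each voxel's binary cross-entropy term by a weight that grows as the voxel gets closer to the target. This is not the loss used in this work: the function names, the penalty form `1 + alpha * exp(-d / sigma_mm)`, the parameters `alpha` and `sigma_mm`, and the binary (single-OAR) setting are all assumptions made for the example.

```python
import math

def distance_penalty(dist_mm, alpha=1.0, sigma_mm=10.0):
    """Hypothetical penalty weight: close to 1 + alpha at the target, decaying to 1 far away."""
    return 1.0 + alpha * math.exp(-dist_mm / sigma_mm)

def dp_cross_entropy(probs, labels, dists_mm, alpha=1.0, sigma_mm=10.0, eps=1e-7):
    """Mean binary cross-entropy with each voxel's term scaled by its distance penalty.

    probs    -- predicted foreground probabilities, one per voxel
    labels   -- reference labels (0 or 1), one per voxel
    dists_mm -- distance from each voxel to the target (assumed precomputed)
    """
    total = 0.0
    for p, y, d in zip(probs, labels, dists_mm):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        ce = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        total += distance_penalty(d, alpha, sigma_mm) * ce
    return total / len(probs)

# Two OAR voxels: one poorly predicted (p = 0.2), one well predicted (p = 0.9).
probs = [0.2, 0.9]
labels = [1, 1]

# The poorly predicted voxel sits near the target (2 mm) or far from it (40 mm).
near_error = dp_cross_entropy(probs, labels, dists_mm=[2.0, 40.0])
far_error = dp_cross_entropy(probs, labels, dists_mm=[40.0, 2.0])

# With alpha = 0 the penalty is 1 everywhere, which recovers plain cross-entropy.
plain = dp_cross_entropy(probs, labels, dists_mm=[2.0, 40.0], alpha=0.0)

print(near_error, far_error, plain)
```

With these toy inputs the same segmentation mistake costs more when it lies near the target than when it lies far from it, which is the behavior a distance penalty is meant to produce.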
A 3D U-Net model was trained on 170 T2-weighted magnetic resonance (MR) images from 56 patients with clinical contours. The dataset was split at the patient level: 45 patients (150 scans) formed the training set for five-fold cross-validation, and 11 patients (20 scans) formed the testing set. Another dataset from our institution, consisting of 35 MR scans from 22 cervical cancer patients, was used as an independent internal testing set. A distance map, emphasizing errors near the high-risk clinical target volume (CTVHR), was used to penalize two commonly used loss functions, cross-entropy (CE) loss and DiceCE loss. The wDSC emphasizes the accuracy of OAR regions proximal to the CTVHR by incorporating a weighting factor into the original volumetric DSC (vDSC). The Pearson correlation coefficient (r) was used to quantify the strength of the relationship between D2cm3 accuracy and six evaluation metrics (wDSC and five standard metrics). A physician rated and revised the auto-contours for the clinical acceptability tests.
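The idea of a proximity-weighted Dice score can be sketched as follows, under stated assumptions. Masks are represented as sets of voxel coordinates, and every voxel contributes a weight instead of a count of 1. The exponential weighting `exp(-d / sigma)`, the parameter `sigma`, and all function names are hypothetical, since the exact weighting factor is not specified here. `pearson_r` is the standard Pearson formula.

```python
import math

def weight(voxel, target, sigma=5.0):
    """Hypothetical weighting factor: decays with distance (in voxels) to the nearest target voxel."""
    d = min(math.dist(voxel, t) for t in target)
    return math.exp(-d / sigma)

def dsc(pred, ref):
    """Standard volumetric Dice: every voxel counts equally."""
    total = len(pred) + len(ref)
    return 2.0 * len(pred & ref) / total if total else 1.0

def weighted_dsc(pred, ref, target, sigma=5.0):
    """Weighted Dice: each voxel contributes its proximity weight instead of 1."""
    overlap = sum(weight(v, target, sigma) for v in pred & ref)
    total = (sum(weight(v, target, sigma) for v in pred)
             + sum(weight(v, target, sigma) for v in ref))
    return 2.0 * overlap / total if total else 1.0

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy geometry: a single target voxel and an OAR extending away from it along x.
target = {(0, 0, 0)}
ref = {(0, 0, x) for x in range(2, 10)}            # reference OAR contour
pred_near_miss = {(0, 0, x) for x in range(4, 10)}  # misses the 2 voxels nearest the target
pred_far_miss = {(0, 0, x) for x in range(2, 8)}    # misses the 2 voxels farthest from it

print(dsc(pred_near_miss, ref), dsc(pred_far_miss, ref))
print(weighted_dsc(pred_near_miss, ref, target), weighted_dsc(pred_far_miss, ref, target))
```

In this toy case both predictions miss two voxels and therefore have the same standard Dice score, but the weighted score is lower for the prediction that misses the voxels nearest the target. In an analysis like the one described, `pearson_r` would be applied to per-structure metric values paired with the corresponding D2cm3 errors.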
The wDSC correlated moderately (r = -0.55) with D2cm3 accuracy, outperforming standard geometric metrics. Models using DP loss functions consistently yielded higher wDSCs than their respective non-DP counterparts. DP loss models also improved D2cm3 accuracy, indicating improved dosimetric accuracy. The clinical acceptability tests revealed that more than 94% of bladder and rectum contours and approximately half of the sigmoid and small bowel contours were clinically accepted.
We developed and evaluated a new geometric metric, wDSC, as a better indicator of D2cm3 accuracy, which has the potential to become a surrogate for dosimetric accuracy in cervical brachytherapy. The model with DP loss showed improvements in geometric and dosimetric performance that were not statistically significant. This work also has the potential to be used for precise OAR delineation in adaptive radiotherapy.