High-dose-rate brachytherapy (HDR-BT) is a cornerstone of cervical cancer (CC) treatment, requiring the precise delineation of high-risk clinical target volumes (HR-CTV) and organs at risk (OARs) for effective dose delivery and toxicity reduction. However, the time-sensitive nature of HDR-BT planning and its reliance on expert contouring introduce inter- and intra-observer variability, posing challenges for consistent and accurate treatment planning.
This study proposes a novel deep learning (DL)-based auto-segmentation framework, guided by task-specific prompts generated from large language models (LLMs), to address these challenges and improve segmentation accuracy and efficiency.
A retrospective dataset of 32 CC patients, encompassing 124 planning computed tomography (pCT) images, was utilized. The framework integrates clinical guidelines for organ contouring from the American Brachytherapy Society (ABS), the European Society for Radiotherapy and Oncology (ESTRO), and the International Commission on Radiation Units and Measurements (ICRU). LLMs, particularly Chat-GPT, extract domain knowledge from these contouring guidelines to generate task-specific prompts, which guide a Swin transformer-based encoder and a fully convolutional network (FCN) decoder for segmentation. The DL pipeline was evaluated on the HR-CTV and OARs, including the bladder, rectum, and sigmoid. Performance was assessed with the Dice similarity coefficient (DSC), 95th percentile Hausdorff distance (HD95%), mean surface distance (MSD), and center-of-mass distance (CMD). An ablation study compared the prompt-guided approach with a baseline model without prompt guidance. Statistical differences were tested with two-tailed paired t-tests, and p-values were adjusted using the Benjamini–Hochberg method to correct for multiple comparisons; results with adjusted p < 0.05 were deemed significant. Cohen's d values were calculated to quantify effect sizes.
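As an illustration of how the four evaluation metrics can be computed from a pair of binary masks, the following is a minimal sketch and not the study's evaluation code. It assumes NumPy and SciPy, non-empty 3D masks with known voxel spacing in mm, surfaces defined by a one-voxel erosion, and HD95% taken as the 95th percentile of the pooled bidirectional surface distances (other conventions exist). All function names are hypothetical.

```python
import numpy as np
from scipy import ndimage


def dice(pred, ref):
    """Dice similarity coefficient between two binary masks."""
    pred, ref = pred.astype(bool), ref.astype(bool)
    denom = pred.sum() + ref.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated as perfect agreement (assumption)
    return 2.0 * np.logical_and(pred, ref).sum() / denom


def center_of_mass_distance(pred, ref, spacing):
    """Euclidean distance (mm) between the centers of mass of two masks."""
    c_pred = np.array(ndimage.center_of_mass(pred.astype(float)))
    c_ref = np.array(ndimage.center_of_mass(ref.astype(float)))
    return float(np.linalg.norm((c_pred - c_ref) * np.asarray(spacing)))


def _surface(mask):
    """Boundary voxels: mask voxels removed by a one-voxel erosion."""
    mask = mask.astype(bool)
    return mask & ~ndimage.binary_erosion(mask)


def surface_distances(pred, ref, spacing):
    """Pooled surface-to-surface distances (mm) in both directions."""
    s_pred, s_ref = _surface(pred), _surface(ref)
    # Distance from every voxel to the nearest surface voxel of each mask.
    dist_to_ref = ndimage.distance_transform_edt(~s_ref, sampling=spacing)
    dist_to_pred = ndimage.distance_transform_edt(~s_pred, sampling=spacing)
    return np.concatenate([dist_to_ref[s_pred], dist_to_pred[s_ref]])


def hd95(pred, ref, spacing):
    """95th percentile of the pooled bidirectional surface distances."""
    return float(np.percentile(surface_distances(pred, ref, spacing), 95))


def mean_surface_distance(pred, ref, spacing):
    """Mean of the pooled bidirectional surface distances."""
    return float(surface_distances(pred, ref, spacing).mean())
```

Each function takes a predicted mask and a reference mask of the same shape; `spacing` is the voxel size along each array axis, so distances are returned in mm rather than in voxels.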
The proposed framework achieved the highest segmentation accuracy for the bladder (DSC of 0.91 ± 0.07), followed by the HR-CTV (DSC of 0.80 ± 0.08) and the rectum (DSC of 0.78 ± 0.07), with lower accuracy for the sigmoid (DSC of 0.63 ± 0.15) due to its small size and irregular shape. Boundary precision was highest for the HR-CTV (HD95%: 6.32 ± 2.31 mm). The ablation study confirmed the contribution of prompt guidance, with statistically significant improvements in DSC and/or HD95% (p < 0.05) for all OARs. Prompt guidance, however, did not improve the accuracy of HR-CTV delineation.
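The statistical comparison behind the ablation result (two-tailed paired t-tests, Benjamini–Hochberg adjustment, Cohen's d) could be reproduced along the following lines. This is a hedged sketch under stated assumptions, not the authors' analysis script: the input layout (a dictionary mapping each structure name to per-case metric values), the function names, and the paired effect-size convention (mean difference divided by the standard deviation of the differences) are all assumptions, since the source does not specify them.

```python
import numpy as np
from scipy import stats


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, starting from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted


def cohens_d_paired(x, y):
    """Paired effect size: mean difference / SD of differences (assumed convention)."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(diff.mean() / diff.std(ddof=1))


def compare_models(with_prompt, without_prompt, alpha=0.05):
    """Compare per-case metric values for each structure.

    Both arguments are hypothetical dicts: structure name -> sequence of
    per-case values, with cases in the same order in both dicts.
    """
    names = list(with_prompt)
    # scipy.stats.ttest_rel is two-sided by default.
    pvals = [stats.ttest_rel(with_prompt[n], without_prompt[n]).pvalue for n in names]
    adjusted = benjamini_hochberg(pvals)
    return {
        n: {
            "p_adjusted": float(p_adj),
            "cohens_d": cohens_d_paired(with_prompt[n], without_prompt[n]),
            "significant": bool(p_adj < alpha),
        }
        for n, p_adj in zip(names, adjusted)
    }
```

Calling `compare_models` once per metric (for example, once with DSC values and once with HD95% values) yields an adjusted p-value and an effect size for each structure; whether the adjustment is applied per metric or across all metrics jointly is a design choice the source does not state.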
This study demonstrates the feasibility and effectiveness of integrating LLM-generated task-specific prompts with DL-based segmentation for HDR-BT in CC. The proposed framework enhances segmentation consistency to support accurate treatment planning, addressing critical challenges in HDR-BT workflows.