Enhancing auto-contouring with large language model in high-dose rate brachytherapy for cervical cancers
Authors: Jing Wang, Jiahan Zhang, Kaida Yang, Beth Bradshaw Ghavidel, Benyamin Khajetash, Abolfazl Sarikhani, Mohammad Houshyari, Tian Liu, Yang Lei, Meysam Tavakoli
Journal: Medical Physics, volume 52, issue 10 (published 2025-09-23)
DOI: 10.1002/mp.70034
URL: https://aapm.onlinelibrary.wiley.com/doi/10.1002/mp.70034
BACKGROUND
High-dose-rate brachytherapy (HDR-BT) is a cornerstone of cervical cancer (CC) treatment, requiring the precise delineation of high-risk clinical target volumes (HR-CTV) and organs at risk (OARs) for effective dose delivery and toxicity reduction. However, the time-sensitive nature of HDR-BT planning and its reliance on expert contouring introduce inter- and intra-observer variability, posing challenges for consistent and accurate treatment planning.
PURPOSE
This study proposes a novel deep learning (DL)-based auto-segmentation framework, guided by task-specific prompts generated from large language models (LLMs), to address these challenges and improve segmentation accuracy and efficiency.
METHODS
A retrospective dataset of 32 CC patients, encompassing 124 planning computed tomography (pCT) images, was used. The framework integrates clinical guidelines for organ contouring from the American Brachytherapy Society (ABS), the European Society for Radiotherapy and Oncology (ESTRO), and the International Commission on Radiation Units and Measurements (ICRU). LLMs, particularly ChatGPT, extract domain knowledge from these contouring guidelines to generate task-specific prompts, which guide a Swin transformer-based encoder and a fully convolutional network (FCN) decoder for segmentation. The DL pipeline was evaluated on the HR-CTV and OARs, including the bladder, rectum, and sigmoid. Performance was assessed with metrics including the Dice similarity coefficient (DSC), 95th-percentile Hausdorff distance (HD95%), mean surface distance (MSD), and center-of-mass distance (CMD). An ablation study compared the prompt-guided approach with a baseline model without prompt guidance. Statistical differences were tested with two-tailed paired t-tests, and p-values were adjusted with the Benjamini–Hochberg method to correct for multiple comparisons; results with adjusted p < 0.05 were deemed significant. Cohen's d values were calculated to quantify effect sizes.
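As a rough illustration of the evaluation quantities named above, the following Python sketch computes a Dice similarity coefficient, a paired-sample Cohen's d, and Benjamini–Hochberg adjusted p-values. It is not the authors' code: the function names, the flat 0/1 mask representation, and the choice of the paired-differences form of Cohen's d are assumptions made for this example, and the raw p-values are assumed to come from a paired t-test computed elsewhere (for instance with `scipy.stats.ttest_rel`).

```python
from statistics import mean, stdev


def dice(pred, ref):
    """Dice similarity coefficient for two binary masks given as flat 0/1 sequences."""
    overlap = sum(1 for p, r in zip(pred, ref) if p and r)
    total = sum(1 for p in pred if p) + sum(1 for r in ref if r)
    # Two empty masks are treated as a perfect match (an assumption of this sketch).
    return 2.0 * overlap / total if total else 1.0


def cohens_d_paired(a, b):
    """Cohen's d for paired samples: mean of the differences over their sample SD.

    The abstract does not say which variant of Cohen's d was used; this form is assumed.
    """
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / stdev(diffs)


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of the adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Toy inputs, not data from the study.
    print(dice([1, 1, 0, 0], [1, 0, 0, 0]))
    print(cohens_d_paired([2, 4, 6], [1, 2, 3]))
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.20]))
```

In a setup like the one described, each structure and metric would contribute one raw p-value from the prompt-guided versus baseline comparison, and the adjusted values would then be compared against 0.05.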
RESULTS
The proposed framework achieved the highest segmentation accuracy for the bladder (DSC of 0.91 ± 0.07), followed by the HR-CTV (DSC of 0.80 ± 0.08) and the rectum (DSC of 0.78 ± 0.07), with lower accuracy for the sigmoid (DSC of 0.63 ± 0.15) due to its small size and irregular shape. Boundary precision was highest for the HR-CTV (HD95%: 6.32 ± 2.31 mm). The ablation study confirmed the contribution of prompt guidance, with statistically significant improvements in DSC and/or HD95% (p < 0.05) for all OARs. Prompt guidance, however, did not improve the accuracy of HR-CTV delineation.
CONCLUSIONS
This study demonstrates the feasibility and effectiveness of integrating LLM-generated task-specific prompts with DL-based segmentation for HDR-BT in CC. The proposed framework enhances segmentation consistency to support accurate treatment planning, addressing critical challenges in HDR-BT workflows.
About the journal:
Medical Physics publishes original, high-impact physics, imaging science, and engineering research that advances patient diagnosis and therapy through contributions in: 1) basic science developments with high potential for clinical translation; 2) clinical applications of cutting-edge engineering and physics innovations; 3) broadly applicable and innovative clinical physics developments.
Medical Physics is a journal of global scope and reach. Research published in Medical Physics reaches an international, multidisciplinary audience that includes practicing medical physicists as well as physics- and engineering-based translational scientists. We work closely with authors of promising articles to improve their quality.