Yafei Dong, Thibault Marin, Yue Zhuo, Elie Najem, Arnaud Beddok, Laura Rozenblum, Maryam Moteabbed, Kira Grogg, Fangxu Xing, Jonghye Woo, Yen-Lin E Chen, Ruth Lim, Xiaofeng Liu, Chao Ma, Georges El Fakhri
Medical Physics. Published 2025-05-03. DOI: 10.1002/mp.17865 (https://doi.org/10.1002/mp.17865)
Modeling inter-reader variability in clinical target volume delineation for soft tissue sarcomas using diffusion model.
Background: Accurate delineation of the clinical target volume (CTV) is essential in the radiotherapy treatment of soft tissue sarcomas. However, this process is subject to inter-reader variability due to the need for clinical assessment of risk and extent of potential microscopic spread. This can lead to inconsistencies in treatment planning, potentially impacting treatment outcomes. Most existing automatic CTV delineation methods do not account for this variability and can only generate a single CTV for each case.
Purpose: This study aims to develop a deep learning-based technique that generates multiple CTV contours for each case, simulating the inter-reader variability seen in clinical practice.
Methods: We employed a publicly available dataset consisting of fluorodeoxyglucose positron emission tomography (FDG-PET), x-ray computed tomography (CT), and pre-contrast T1-weighted magnetic resonance imaging (MRI) scans from 51 patients with soft tissue sarcoma, along with an independent validation set containing five additional patients. An experienced reader contoured the gross tumor volume (GTV) for each patient based on the multi-modality images. Subsequently, the first reader and two additional readers contoured three CTVs in total based on the GTV. We developed a diffusion model-based deep learning method that can generate an arbitrary number of different, plausible CTVs to mimic the inter-reader variability in CTV delineation. The proposed model incorporates a separate encoder to extract features from the GTV masks, leveraging the critical role of GTV information in accurate CTV delineation.
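As a toy illustration of how a single diffusion sampler can yield several different contours, the sketch below runs DDPM-style ancestral sampling from different random seeds. It is not the authors' model: the trained network (with its image inputs and separate GTV encoder) is replaced here by a closed-form denoiser for a made-up 1D example with three nested "reader" masks. Every name and setting (`make_schedule`, `posterior_mean_x0`, `sample_mask`, the linear beta schedule, 50 steps) is an assumption chosen for illustration.

```python
import math
import random


def make_schedule(num_steps, beta_start=1e-4, beta_end=0.2):
    """Linear beta schedule (an assumed choice); index 0 is the least noisy step."""
    betas = [beta_start + (beta_end - beta_start) * i / (num_steps - 1)
             for i in range(num_steps)]
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return betas, alpha_bars


def posterior_mean_x0(x_t, alpha_bar, masks):
    """E[x0 | x_t] when x0 is uniform over `masks`.

    Toy stand-in for a trained denoising network: with a finite set of
    possible clean masks, the ideal denoiser has this closed form.
    """
    scale = math.sqrt(alpha_bar)
    var = 1.0 - alpha_bar
    log_w = [-sum((x - scale * m) ** 2 for x, m in zip(x_t, mask)) / (2.0 * var)
             for mask in masks]
    top = max(log_w)  # subtract the max for numerical stability
    weights = [math.exp(lw - top) for lw in log_w]
    total = sum(weights)
    return [sum(w * mask[i] for w, mask in zip(weights, masks)) / total
            for i in range(len(x_t))]


def sample_mask(masks, num_steps=50, seed=0):
    """Draw one binary mask by running the reverse diffusion from pure noise."""
    rng = random.Random(seed)
    betas, alpha_bars = make_schedule(num_steps)
    x = [rng.gauss(0.0, 1.0) for _ in masks[0]]
    for t in reversed(range(num_steps)):
        beta, alpha_bar = betas[t], alpha_bars[t]
        x0_hat = posterior_mean_x0(x, alpha_bar, masks)
        # Noise estimate implied by the predicted clean mask.
        eps_hat = [(xi - math.sqrt(alpha_bar) * x0i) / math.sqrt(1.0 - alpha_bar)
                   for xi, x0i in zip(x, x0_hat)]
        # DDPM reverse-step mean; fresh noise is added on all but the final step.
        mean = [(xi - beta / math.sqrt(1.0 - alpha_bar) * ei) / math.sqrt(1.0 - beta)
                for xi, ei in zip(x, eps_hat)]
        noise_scale = math.sqrt(beta) if t > 0 else 0.0
        x = [mi + noise_scale * rng.gauss(0.0, 1.0) for mi in mean]
    return [1 if xi > 0.5 else 0 for xi in x]


# Hypothetical 1D "CTVs" from three readers: tight, medium, and generous margins.
reader_masks = [
    [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0],
]
for seed in range(5):
    print(seed, sample_mask(reader_masks, seed=seed))
```

Each call with a new seed starts from different noise, so repeated calls can land on different reader-like contours; this is the mechanism that lets a diffusion model emit as many candidate CTVs as requested. In a real system, `posterior_mean_x0` would be replaced by a trained network conditioned on the images and the GTV.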
Results: The proposed diffusion model achieved the highest Dice index (0.902, compared to values below 0.881 for state-of-the-art models) and the best generalized energy distance (GED) (0.209, compared to values exceeding 0.221 for state-of-the-art models). It also achieved the second-highest recall and precision among the compared ambiguous image segmentation models. Results from both datasets exhibited consistent trends, reinforcing the reliability of our findings. Additionally, ablation studies exploring different model structures and input configurations highlighted the significance of incorporating prior GTV information for accurate CTV delineation.
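For readers unfamiliar with the two headline metrics, the following is a minimal sketch of how they are commonly computed. The abstract does not give the exact definitions used, so this assumes the form usual in the ambiguous-segmentation literature: GED as the squared energy distance 2·E[d(S, Y)] − E[d(S, S′)] − E[d(Y, Y′)], with d = 1 − IoU, between a set of generated masks S and a set of reader masks Y. The function names and the flat 0/1 list representation are illustrative choices.

```python
from itertools import product


def dice(a, b):
    """Dice coefficient between two binary masks given as equal-length 0/1 lists."""
    overlap = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(a) + sum(b)
    return 1.0 if total == 0 else 2.0 * overlap / total


def iou_distance(a, b):
    """d(a, b) = 1 - IoU; two empty masks are treated as identical."""
    overlap = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if union == 0 else 1.0 - overlap / union


def mean_distance(xs, ys):
    """Average d over all pairs drawn from the two sets of masks."""
    return sum(iou_distance(x, y) for x, y in product(xs, ys)) / (len(xs) * len(ys))


def generalized_energy_distance(samples, references):
    """Squared GED between generated masks and reader masks (lower is better)."""
    return (2.0 * mean_distance(samples, references)
            - mean_distance(samples, samples)
            - mean_distance(references, references))


# Hypothetical flattened masks: three reader contours and three model samples.
readers = [[0, 1, 1, 0], [1, 1, 1, 0], [1, 1, 1, 1]]
generated = [[0, 1, 1, 0], [0, 1, 1, 1], [1, 1, 1, 1]]
print(dice(generated[0], readers[1]))
print(generalized_energy_distance(generated, readers))
```

Dice rewards overlap with individual contours, while GED compares whole distributions: it is zero when the generated set matches the reader set and grows when the samples are either inaccurate or less diverse than the readers.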
Conclusions: The proposed diffusion model successfully generates multiple plausible CTV contours for soft tissue sarcomas, effectively capturing inter-reader variability in CTV delineation.