Autodelineation of Treatment Target Volume for Radiation Therapy Using Large Language Model-Aided Multimodal Learning
Praveenbalaji Rajendran, Yizheng Chen, Liang Qiu, Thomas Niedermayr, Wu Liu, Mark Buyyounouski, Hilary Bagshaw, Bin Han, Yong Yang, Nataliya Kovalchuk, Xuejun Gu, Steven Hancock, Lei Xing, Xianjin Dai
International Journal of Radiation Oncology Biology Physics, pages 230-240. Published 2025-01-01 (Epub 2024-08-06). DOI: 10.1016/j.ijrobp.2024.07.2149 (https://doi.org/10.1016/j.ijrobp.2024.07.2149)
Abstract
Purpose: Artificial intelligence-aided methods have made significant progress in the auto-delineation of normal tissues. However, these approaches struggle with the auto-contouring of radiation therapy target volume. Our goal was to model the delineation of target volume as a clinical decision-making problem, resolved by leveraging large language model-aided multimodal learning approaches.
Methods and materials: A vision-language model, termed Medformer, was developed, employing a hierarchical vision transformer as its backbone and incorporating large language models to extract text-rich features. The contextually embedded linguistic features are integrated into the visual features through a visual language attention module, yielding language-aware visual encoding. The Dice similarity coefficient (DSC), intersection over union (IOU), and 95th percentile Hausdorff distance (HD95) were used to quantitatively evaluate the performance of the model. The evaluation was conducted on an in-house prostate cancer dataset and a public oropharyngeal carcinoma dataset, totaling 668 subjects.
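The abstract does not describe the internals of the visual language attention module. As a rough illustration of how linguistic features can condition visual features, the sketch below uses generic cross-attention in which visual tokens act as queries and text tokens act as keys and values. The function name `language_aware_encoding`, the residual connection, the single attention head, and all shapes are assumptions made for illustration, not the authors' design.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def language_aware_encoding(visual, text, w_q, w_k, w_v):
    """Hypothetical single-head cross-attention (not the published module).

    visual: (n_visual, d) visual tokens, used as queries
    text:   (n_text, d) text tokens, used as keys and values
    w_q, w_k, w_v: (d, d) projection matrices
    """
    q = visual @ w_q
    k = text @ w_k
    v = text @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])   # (n_visual, n_text)
    attn = softmax(scores, axis=-1)           # each row sums to 1
    fused = visual + attn @ v                 # residual connection (assumed)
    return fused, attn

rng = np.random.default_rng(0)
d = 8
visual = rng.normal(size=(16, d))   # e.g. 16 image patch tokens
text = rng.normal(size=(5, d))      # e.g. 5 text tokens
w_q, w_k, w_v = (0.1 * rng.normal(size=(d, d)) for _ in range(3))

fused, attn = language_aware_encoding(visual, text, w_q, w_k, w_v)
print(fused.shape, attn.shape)
```

Each visual token receives a weighted mixture of text features, so the output keeps the spatial layout of the visual tokens while carrying information from the text.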
Results: Our Medformer achieved a DSC of 0.81 ± 0.10 versus 0.72 ± 0.10, IOU of 0.73 ± 0.12 versus 0.65 ± 0.09, and HD95 of 9.86 ± 9.77 mm versus 19.13 ± 12.96 mm for delineation of gross tumor volume on the prostate cancer dataset. Similarly, on the oropharyngeal carcinoma dataset, it achieved a DSC of 0.77 ± 0.11 versus 0.72 ± 0.09, IOU of 0.70 ± 0.09 versus 0.65 ± 0.07, and HD95 of 7.52 ± 4.8 mm versus 13.63 ± 7.13 mm, representing significant improvements (P < 0.05). For delineating the clinical target volume, Medformer achieved a DSC of 0.91 ± 0.04, IOU of 0.85 ± 0.05, and HD95 of 2.98 ± 1.60 mm, comparable with other state-of-the-art algorithms.
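For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how DSC, IOU, and HD95 can be computed from a pair of binary masks. It is not the authors' evaluation code. HD95 conventions vary between toolkits; this version takes the larger of the two directed 95th percentile surface distances, extracts surfaces with a simple face-neighbor rule, and uses a brute-force distance matrix suited only to small masks. The function names and the toy masks are hypothetical.

```python
import numpy as np

def dice_and_iou(pred, ref):
    """Dice similarity coefficient and intersection over union of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    inter = np.logical_and(pred, ref).sum()
    total = pred.sum() + ref.sum()
    union = total - inter
    dsc = 2.0 * inter / total if total else 1.0
    iou = inter / union if union else 1.0
    return float(dsc), float(iou)

def surface(mask):
    """Foreground voxels with at least one background face neighbor."""
    mask = np.asarray(mask, dtype=bool)
    m = np.pad(mask, 1)  # pad with background so image edges count as surface
    interior = m.copy()
    for axis in range(m.ndim):
        interior &= np.roll(m, 1, axis) & np.roll(m, -1, axis)
    core = interior[(slice(1, -1),) * m.ndim]
    return mask & ~core

def hd95(pred, ref, spacing=None):
    """95th percentile Hausdorff distance between mask surfaces (one common convention)."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    if not pred.any() or not ref.any():
        raise ValueError("both masks must contain foreground")
    scale = np.ones(pred.ndim) if spacing is None else np.asarray(spacing, dtype=float)
    p = np.argwhere(surface(pred)) * scale   # physical coordinates, e.g. mm
    r = np.argwhere(surface(ref)) * scale
    d = np.linalg.norm(p[:, None, :] - r[None, :, :], axis=-1)
    pred_to_ref = d.min(axis=1)
    ref_to_pred = d.min(axis=0)
    return float(max(np.percentile(pred_to_ref, 95), np.percentile(ref_to_pred, 95)))

# Toy example: two 4 x 4 squares offset by one row on a 10 x 10 grid.
pred = np.zeros((10, 10), dtype=bool)
ref = np.zeros((10, 10), dtype=bool)
pred[2:6, 2:6] = True
ref[3:7, 2:6] = True

dsc, iou = dice_and_iou(pred, ref)
hd = hd95(pred, ref, spacing=(1.0, 1.0))
print(dsc, iou, hd)
```

DSC and IOU are monotonically related for a single pair of masks (DSC = 2·IOU / (1 + IOU)), whereas HD95 measures boundary disagreement in physical units and depends on the voxel spacing passed in.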
Conclusions: Auto-delineation of the treatment target based on multimodal learning outperforms conventional approaches that rely purely on visual features. Our method could be adopted into routine practice to rapidly contour clinical target volumes and gross tumor volumes.
About the journal:
International Journal of Radiation Oncology • Biology • Physics (IJROBP), known in the field as the Red Journal, publishes original laboratory and clinical investigations related to radiation oncology, radiation biology, and medical physics, as well as education and health policy as they relate to the field.
This journal has a particular interest in original contributions of the following types: prospective clinical trials, outcomes research, and large database interrogation. In addition, it seeks reports of high-impact innovations in single or combined modality treatment, tumor sensitization, normal tissue protection (including both precision avoidance and pharmacologic means), brachytherapy, particle irradiation, and cancer imaging. Technical advances related to dosimetry and conformal radiation treatment planning are of interest, as are basic science studies investigating tumor physiology and the molecular biology underlying cancer and normal tissue radiation response.