3D lymphoma segmentation on PET/CT images via multi-scale information fusion with cross-attention
Authors: Huan Huang, Liheng Qiu, Shenmiao Yang, Longxi Li, Jiaofen Nan, Yanting Li, Chuang Han, Fubao Zhu, Chen Zhao, Weihua Zhou
Journal: Medical Physics (Journal Article), published 2025-03-20
DOI: 10.1002/mp.17763 (https://doi.org/10.1002/mp.17763)
Citations: 0
Abstract
Background: Accurate segmentation of diffuse large B-cell lymphoma (DLBCL) lesions is challenging due to their complex patterns in medical imaging. Traditional methods often struggle to delineate these lesions accurately.
Objective: This study aims to develop a precise segmentation method for DLBCL using 18F-fluorodeoxyglucose (18F-FDG) positron emission tomography (PET) and computed tomography (CT) images.
Methods: We propose a 3D segmentation method based on an encoder-decoder architecture. The encoder incorporates a dual-branch design based on the shifted window transformer to extract features from both PET and CT modalities. To enhance feature integration, we introduce a multi-scale information fusion (MSIF) module that performs multi-scale feature fusion using cross-attention mechanisms with a shifted window framework. A gated neural network within the MSIF module dynamically adjusts feature weights to balance the contributions from each modality. The model is optimized using the Dice similarity coefficient (DSC) loss function, minimizing discrepancies between the model prediction and ground truth. Additionally, we computed the total metabolic tumor volume (TMTV) and performed statistical analyses on the results.
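As an illustration of the DSC loss mentioned above, the following is a minimal sketch of a soft Dice loss, written in plain Python over flattened voxel values. The function name `soft_dice_loss`, the smoothing term `eps`, and the toy inputs are assumptions made for this example; the abstract does not specify the exact formulation used by the authors, and a real training pipeline would operate on framework tensors rather than lists.

```python
from typing import Sequence


def soft_dice_loss(probs: Sequence[float], target: Sequence[int],
                   eps: float = 1e-6) -> float:
    """Return 1 - soft Dice between predicted probabilities and a binary mask.

    Both inputs are flattened 3D volumes of equal length. `eps` is a
    hypothetical smoothing constant that avoids division by zero when both
    the prediction and the ground truth are empty.
    """
    if len(probs) != len(target):
        raise ValueError("probs and target must have the same number of voxels")
    # Soft intersection: probability mass placed on true lesion voxels.
    intersection = sum(p * t for p, t in zip(probs, target))
    # Sum of predicted mass and ground-truth mass.
    denominator = sum(probs) + sum(target)
    dice = (2.0 * intersection + eps) / (denominator + eps)
    return 1.0 - dice


# Toy 4-voxel example (made-up values, not data from the study).
target = [0, 1, 1, 0]
perfect = soft_dice_loss([0.0, 1.0, 1.0, 0.0], target)  # prediction equals mask
poor = soft_dice_loss([1.0, 0.0, 0.0, 1.0], target)     # prediction misses every lesion voxel
```

A prediction that matches the mask drives the loss toward 0, while a prediction with no overlap drives it toward 1, which is why minimizing this quantity pushes the model toward higher DSC.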
Results: The model was trained and validated on a private dataset of 165 DLBCL patients and a publicly available dataset (autoPET) containing 145 PET/CT scans of lymphoma patients. Both datasets were analyzed using five-fold cross-validation. On the private dataset, our model achieved a DSC of 0.7512, sensitivity of 0.7548, precision of 0.7611, an average surface distance (ASD) of 3.61 mm, and a Hausdorff distance at the 95th percentile (HD95) of 15.25 mm. On the autoPET dataset, the model achieved a DSC of 0.7441, sensitivity of 0.7573, precision of 0.7427, ASD of 5.83 mm, and HD95 of 21.27 mm, outperforming state-of-the-art methods (p < 0.05, t-test). For TMTV quantification, Pearson correlation coefficients of 0.91 (private dataset) and 0.86 (autoPET) were observed, with R² values of 0.89 and 0.75, respectively. Extensive ablation studies demonstrated the MSIF module's contribution to enhanced segmentation accuracy.
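The overlap metrics and the volume measure reported above can be sketched as follows. This is a hedged illustration only: the helper names `overlap_metrics` and `tumor_volume_ml`, the toy masks, and the 4 mm isotropic voxel spacing are assumptions for the example, and the abstract does not state how the authors computed TMTV. The sketch assumes the common convention of counting segmented voxels and multiplying by the voxel volume.

```python
from typing import Dict, Sequence, Tuple


def overlap_metrics(pred: Sequence[int], truth: Sequence[int]) -> Dict[str, float]:
    """Compute DSC, sensitivity, and precision from flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same number of voxels")
    tp = sum(1 for p, t in zip(pred, truth) if p and t)      # true positives
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)  # false positives
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)  # false negatives

    def ratio(num: float, den: float) -> float:
        # Undefined ratios (empty masks) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "dsc": ratio(2 * tp, 2 * tp + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "precision": ratio(tp, tp + fp),
    }


def tumor_volume_ml(mask: Sequence[int],
                    spacing_mm: Tuple[float, float, float]) -> float:
    """Volume of all segmented voxels in millilitres (1 mL = 1000 mm^3)."""
    dx, dy, dz = spacing_mm
    n_voxels = sum(1 for v in mask if v)
    return n_voxels * dx * dy * dz / 1000.0


# Toy 6-voxel masks (made-up values, not data from the study).
pred = [1, 1, 1, 0, 0, 0]
truth = [0, 1, 1, 1, 0, 0]
metrics = overlap_metrics(pred, truth)
# Hypothetical 4 mm isotropic voxels.
volume = tumor_volume_ml(pred, (4.0, 4.0, 4.0))
```

Comparing per-patient volumes from predicted and reference masks in this way is what a Pearson correlation or R² analysis of TMTV would then be computed on.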
Conclusion: This study presents an effective automatic segmentation method for DLBCL that leverages the complementary strengths of PET and CT imaging. The method demonstrates robust performance on both private and publicly available datasets, supporting its reliability and generalizability. Our method provides clinicians with more precise tumor delineation, which can improve the accuracy of diagnostic interpretations and assist in treatment planning for DLBCL patients. The code for the proposed method is available at https://github.com/chenzhao2023/lymphoma_seg.