Self-supervised U-transformer network with mask reconstruction for metal artifact reduction
Fanning Kong, Zaifeng Shi, Huaisheng Cao, Yudong Hao, Qingjie Cao
Physics in Medicine and Biology, published 2025-03-10 (Journal Article)
DOI: 10.1088/1361-6560/adbaae (https://doi.org/10.1088/1361-6560/adbaae)
Impact factor: 3.3; JCR: Q2 (Engineering, Biomedical)
Citation count: 0
Abstract
Objective. Metal artifacts severely degrade the tissue information in computed tomography (CT) images, posing significant challenges to disease diagnosis. Deep learning has been widely explored for metal artifact reduction (MAR). However, paired metal-artifact CT datasets suitable for training do not exist in practice. Although synthetic CT datasets provide additional training data, networks trained on them still generalize poorly to real metal-artifact data.

Approach. A self-supervised U-shaped transformer network is proposed to improve model generalizability in MAR. The framework consists of a self-supervised mask-reconstruction pretext task and a downstream task. In the pretext task, CT images are randomly corrupted by masks and then recovered using the original images as labels, so that the network learns the artifacts and tissue structure of real physical conditions. The downstream task fine-tunes the network for MAR using labeled images. The multi-layer, long-range feature extraction of the Transformer efficiently captures metal-artifact features, and the MAR bottleneck distinguishes these features through cross-channel self-attention.

Main results. Experiments demonstrate that the framework maintains strong generalization ability in the MAR task, effectively preserving tissue details while suppressing metal artifacts. It achieves a peak signal-to-noise ratio of 43.86 dB and a structural similarity index of 0.9863 while keeping model inference efficient. In addition, the Dice coefficient and mean intersection over union for segmentation of the MAR images improve by 11.70% and 9.51%, respectively.

Significance. Combining unlabeled real-artifact CT images with labeled synthetic-artifact CT images enables a self-supervised learning process that improves model generalizability.
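
The pretext task described in the abstract (randomly masking a CT image and recovering it with the image itself as the label) can be illustrated with a minimal NumPy sketch. The abstract does not specify the mask shape, patch size, masking ratio, or loss, so `random_patch_mask`, `patch`, `mask_ratio`, and the zero fill value are all assumptions made for illustration; they are not the authors' implementation. The `psnr` helper uses the standard peak signal-to-noise ratio definition, the metric reported in the results.

```python
import numpy as np


def random_patch_mask(image, patch=16, mask_ratio=0.5, rng=None):
    """Hide a random subset of non-overlapping square patches.

    `patch` and `mask_ratio` are illustrative values, not taken from the paper.
    Returns the corrupted image and a boolean mask (True = hidden pixel).
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape
    gh, gw = h // patch, w // patch
    n_masked = int(round(mask_ratio * gh * gw))

    # Pick which patches to hide on the coarse patch grid.
    grid = np.zeros(gh * gw, dtype=bool)
    grid[rng.choice(gh * gw, size=n_masked, replace=False)] = True
    grid = grid.reshape(gh, gw)

    # Expand the patch grid to pixel resolution; leftover border pixels stay visible.
    mask = np.zeros((h, w), dtype=bool)
    mask[: gh * patch, : gw * patch] = np.repeat(
        np.repeat(grid, patch, axis=0), patch, axis=1
    )

    corrupted = np.where(mask, 0.0, image)  # assumption: hidden pixels set to 0
    return corrupted, mask


def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, data_range]."""
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(data_range**2 / mse)


# Build one self-supervised training pair: the input is the corrupted slice,
# the label is the original slice itself.
rng = np.random.default_rng(0)
ct_slice = rng.random((64, 64))  # random stand-in for a normalized CT slice
corrupted, mask = random_patch_mask(ct_slice, patch=16, mask_ratio=0.5, rng=rng)
target = ct_slice
baseline_psnr = psnr(target, corrupted)  # score before any reconstruction
```

In a training loop, a network would take `corrupted` as input and be optimized to reproduce `target`; because no manual annotation is needed, unlabeled real-artifact CT images can be used for this stage.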
About the journal:
The development and application of theoretical, computational and experimental physics to medicine, physiology and biology. Topics covered are: therapy physics (including ionizing and non-ionizing radiation); biomedical imaging (e.g. x-ray, magnetic resonance, ultrasound, optical and nuclear imaging); image-guided interventions; image reconstruction and analysis (including kinetic modelling); artificial intelligence in biomedical physics and analysis; nanoparticles in imaging and therapy; radiobiology; radiation protection and patient dose monitoring; radiation dosimetry