{"title":"CT-Denoimer: efficient contextual transformer network for low-dose CT denoising.","authors":"Yuanke Zhang, Fan Xu, Rui Zhang, Yanfei Guo, Hanxiang Wang, Bingbing Wei, Fei Ma, Jing Meng, Jianlei Liu, Hongbing Lu, Yang Chen","doi":"10.1088/1361-6560/addea6","DOIUrl":null,"url":null,"abstract":"<p><p><i>Objective</i>. Low-dose computed tomography (LDCT) effectively reduces radiation exposure to patients, but introduces severe noise artifacts that affect diagnostic accuracy. Recently, Transformer-based network architectures have been widely applied to LDCT image denoising, generally achieving superior results compared to traditional convolutional methods. However, these methods are often hindered by high computational costs and struggles in capturing complex local contextual features, which negatively impact denoising performance<i>Approach</i>. In this work, we propose CT-Denoimer, an efficient CT Denoising Transformer network that captures both global correlations and intricate, spatially varying local contextual details in CT images, enabling the generation of high-quality images. The core of our framework is a Transformer module that consists of two key components: the multi-Dconv head transposed attention (MDTA) and the mixed contextual feed-forward network (MCFN). The MDTA block captures global correlations in the image with linear computational complexity, while the MCFN block manages multi-scale local contextual information, both static and dynamic, through a series of Enhanced Contextual Transformer modules. In addition, we incorporate operation-wise attention layers to enable collaborative refinement in the proposed CT-Denoimer, enhancing its ability to more effectively handle complex and varying noise patterns in LDCT images<i>Main results</i>. Extensive experimental validation on both the AAPM-Mayo public dataset and a real-world clinical dataset demonstrated the state-of-the-art performance of the proposed CT-Denoimer. It achieved a peak signal-to-noise ratio of 33.681 dB, a structural similarity index measure of 0.921, an information fidelity criterion of 2.857 and a visual information fidelity of 0.349. Subjective assessment by radiologists gave an average score of 4.39, confirming its clinical applicability and clear advantages over existing methods<i>Significance</i>. This study presents an innovative CT denoising Transformer network that sets a new benchmark in LDCT image denoising, excelling in both noise reduction and fine structure preservation.</p>","PeriodicalId":20185,"journal":{"name":"Physics in medicine and biology","volume":" ","pages":""},"PeriodicalIF":3.3000,"publicationDate":"2025-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Physics in medicine and biology","FirstCategoryId":"5","ListUrlMain":"https://doi.org/10.1088/1361-6560/addea6","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENGINEERING, BIOMEDICAL","Score":null,"Total":0}
引用次数: 0
Abstract
Objective. Low-dose computed tomography (LDCT) effectively reduces radiation exposure to patients, but introduces severe noise artifacts that affect diagnostic accuracy. Recently, Transformer-based network architectures have been widely applied to LDCT image denoising, generally achieving superior results compared to traditional convolutional methods. However, these methods are often hindered by high computational costs and struggles in capturing complex local contextual features, which negatively impact denoising performanceApproach. In this work, we propose CT-Denoimer, an efficient CT Denoising Transformer network that captures both global correlations and intricate, spatially varying local contextual details in CT images, enabling the generation of high-quality images. The core of our framework is a Transformer module that consists of two key components: the multi-Dconv head transposed attention (MDTA) and the mixed contextual feed-forward network (MCFN). The MDTA block captures global correlations in the image with linear computational complexity, while the MCFN block manages multi-scale local contextual information, both static and dynamic, through a series of Enhanced Contextual Transformer modules. In addition, we incorporate operation-wise attention layers to enable collaborative refinement in the proposed CT-Denoimer, enhancing its ability to more effectively handle complex and varying noise patterns in LDCT imagesMain results. Extensive experimental validation on both the AAPM-Mayo public dataset and a real-world clinical dataset demonstrated the state-of-the-art performance of the proposed CT-Denoimer. It achieved a peak signal-to-noise ratio of 33.681 dB, a structural similarity index measure of 0.921, an information fidelity criterion of 2.857 and a visual information fidelity of 0.349. Subjective assessment by radiologists gave an average score of 4.39, confirming its clinical applicability and clear advantages over existing methodsSignificance. This study presents an innovative CT denoising Transformer network that sets a new benchmark in LDCT image denoising, excelling in both noise reduction and fine structure preservation.
期刊介绍:
The development and application of theoretical, computational and experimental physics to medicine, physiology and biology. Topics covered are: therapy physics (including ionizing and non-ionizing radiation); biomedical imaging (e.g. x-ray, magnetic resonance, ultrasound, optical and nuclear imaging); image-guided interventions; image reconstruction and analysis (including kinetic modelling); artificial intelligence in biomedical physics and analysis; nanoparticles in imaging and therapy; radiobiology; radiation protection and patient dose monitoring; radiation dosimetry