LoMAE: Simple Streamlined Low-level Masked Autoencoders for Robust, Generalized, and Interpretable Low-dose CT Denoising.

IF 6.7 Zone 2 (Medicine) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

IEEE Journal of Biomedical and Health Informatics Pub Date : 2024-09-05 DOI:10.1109/JBHI.2024.3454979

Dayang Wang, Shuo Han, Yongshun Xu, Zhan Wu, Li Zhou, Bahareh Morovati, Hengyong Yu

Citations: 0

Abstract

Low-dose computed tomography (LDCT) reduces X-ray radiation exposure at the cost of compromised image quality, characterized by increased noise and artifacts. Recently, transformer models have emerged as a promising avenue to enhance LDCT image quality. However, the success of such models relies on a large amount of paired noisy and clean images, which are often scarce in clinical settings. In computer vision and natural language processing, masked autoencoders (MAE) have been recognized as a powerful self-pretraining method for transformers, due to their exceptional capability to extract representative features. However, the original pretraining and fine-tuning design fails to work in low-level vision tasks like denoising. In response to this challenge, we redesign the classical encoder-decoder learning model and develop a simple yet effective streamlined low-level vision MAE, referred to as LoMAE, tailored to address the LDCT denoising problem. Moreover, we introduce an MAE-GradCAM method to shed light on the latent learning mechanisms of the MAE/LoMAE. Additionally, we explore the LoMAE's robustness and generalizability across a variety of noise levels. Experimental findings show that the proposed LoMAE enhances the denoising capabilities of the transformer and substantially reduces its dependency on high-quality, ground-truth data. It also demonstrates remarkable robustness and generalizability over a spectrum of noise levels. In summary, the proposed LoMAE provides promising solutions to the major issues in LDCT including interpretability, ground-truth data dependency, and model robustness/generalizability.
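For readers unfamiliar with masked autoencoders, the pretraining idea the abstract refers to can be sketched roughly as follows: an image is cut into patches, a random subset is hidden, and a model is scored on how well it reconstructs the hidden patches. This is a minimal illustration of generic MAE-style masking, not the LoMAE design, which the abstract does not specify. The function names, the 8x8 toy image, the 4x4 patch size, the 0.75 mask ratio, and the mean-value stand-in for a decoder are all assumptions made for the example.

```python
import random


def patchify(image, patch):
    """Split a 2-D list (H x W) into non-overlapping patch x patch tiles (row-major)."""
    h, w = len(image), len(image[0])
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    patches = []
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            # Flatten each tile into a single list of pixel values.
            patches.append(
                [image[top + r][left + c] for r in range(patch) for c in range(patch)]
            )
    return patches


def random_mask(num_patches, mask_ratio, rng):
    """Randomly choose which patch indices are hidden; return (visible, masked)."""
    num_masked = int(round(num_patches * mask_ratio))
    order = list(range(num_patches))
    rng.shuffle(order)
    return sorted(order[num_masked:]), sorted(order[:num_masked])


def masked_mse(pred_patches, target_patches, masked):
    """Mean squared error computed over the masked patches only."""
    total, count = 0.0, 0
    for i in masked:
        for p, t in zip(pred_patches[i], target_patches[i]):
            total += (p - t) ** 2
            count += 1
    return total / count if count else 0.0


rng = random.Random(0)  # fixed seed so the split is reproducible
image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]  # toy 8x8 "slice"
patches = patchify(image, patch=4)  # four patches of 16 pixels each
visible, masked = random_mask(len(patches), mask_ratio=0.75, rng=rng)

# Hypothetical stand-in for an encoder-decoder: predict the mean of the
# visible pixels everywhere. A real model would be a trained transformer.
visible_pixels = [v for i in visible for v in patches[i]]
mean_val = sum(visible_pixels) / len(visible_pixels)
pred = [[mean_val] * len(p) for p in patches]

loss = masked_mse(pred, patches, masked)
print(f"visible={visible} masked={masked} loss={loss:.3f}")
```

Scoring only the hidden patches is what forces the model to infer missing content from context rather than copy its input. How that objective is adapted for denoising is the subject of the paper itself.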

Source journal

IEEE Journal of Biomedical and Health Informatics (COMPUTER SCIENCE, INFORMATION SYSTEMS; COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS)

CiteScore: 13.60

Self-citation rate: 6.50%

Articles published: 1151

About the journal: IEEE Journal of Biomedical and Health Informatics publishes original papers presenting recent advances where information and communication technologies intersect with health, healthcare, life sciences, and biomedicine. Topics include acquisition, transmission, storage, retrieval, management, and analysis of biomedical and health information. The journal covers applications of information technologies in healthcare, patient monitoring, preventive care, early disease diagnosis, therapy discovery, and personalized treatment protocols. It explores electronic medical and health records, clinical information systems, decision support systems, medical and biological imaging informatics, wearable systems, body area/sensor networks, and more. Integration-related topics like interoperability, evidence-based medicine, and secure patient data are also addressed.