Muhammad Nadeem Cheema, Lei Zhang, Anam Nazir, Yiran Li, John A Detre, Ze Wang
Title: Transformer-based arterial spin labeling perfusion MRI denoising
DOI: 10.1007/s00371-025-04061-x (https://doi.org/10.1007/s00371-025-04061-x)
Journal: Visual Computer (Q2, Computer Science, Software Engineering; IF 2.9)
Published: 2025-07-03 (Journal Article)
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12366763/pdf/
Citations: 0
Abstract
Arterial Spin Labeling (ASL) perfusion MRI is the only non-invasive technique for quantifying regional cerebral blood flow (CBF), an important physiological variable. ASL MRI has a relatively low signal-to-noise ratio (SNR), making it challenging to achieve high-quality CBF images from limited data. Recent convolutional neural network (CNN)-based methods have shown promising ASL CBF denoising results, but a common problem of these methods is the loss of output image texture and image intensity variability. To address this problem, we propose a Hybrid U-Net and Swin Transformer (HUST) ASL CBF denoising method. Transformers explicitly encode the spatial positions of input data and can learn features with long-range dependencies; these features can substantially mitigate image blurring and preserve individual data variability. We used the U-Net as the network backbone because of its demonstrated capability for capturing local and global features, and replaced the original CNN layers with transformers. The Swin Transformer was used to reduce the number of parameters required by a regular transformer for image denoising; the reduction is achieved through a hierarchical structure together with a shifted-window-based attention mechanism. The proposed method was trained and tested on 2D and 3D ASL CBF images, and HUST substantially improved CBF image visualization and preserved image textures. The 2D data were acquired from 277 normal healthy subjects aged 23 to 47 (110 males, 167 females). The 3D data (110 subjects) were pooled from a local database and were acquired using our background-suppressed 3D stack-of-spirals fast-spin-echo pseudo-continuous ASL sequence [27-30]. HUST makes it possible to substantially reduce the data acquisition time without compromising CBF quantification quality.
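The shifted-window attention mentioned above is what keeps the transformer affordable: self-attention is computed within small non-overlapping windows, with a cyclic half-window shift between successive blocks so that neighboring windows exchange information, so the quadratic attention cost scales with the window size rather than the full image size. A minimal NumPy sketch of the partitioning step (not the authors' implementation; the helper names `window_partition` and `shifted` are illustrative):

```python
import numpy as np

def window_partition(x, ws):
    """Split an (H, W, C) feature map into (num_windows, ws*ws, C) token groups.

    Attention is then computed inside each window: cost per window is
    O((ws*ws)**2) instead of O((H*W)**2) for global attention.
    """
    H, W, C = x.shape
    x = x.reshape(H // ws, ws, W // ws, ws, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, ws * ws, C)

def shifted(x, ws):
    # Cyclically shift the map by half a window before partitioning,
    # so successive blocks see windows that straddle the previous boundaries.
    return window_partition(np.roll(x, shift=(-(ws // 2), -(ws // 2)), axis=(0, 1)), ws)

# Toy feature map: an 8x8 grid of 16-channel tokens, window size 4
feat = np.random.rand(8, 8, 16)
regular = window_partition(feat, 4)  # 4 windows of 16 tokens each
shift = shifted(feat, 4)             # same shapes, offset grouping
```

Within each window, standard multi-head self-attention is applied over the `ws*ws` tokens; the hierarchical structure then merges windows across stages, mirroring the U-Net's coarsening path.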
HUST outperforms three state-of-the-art methods on both 2D and 3D ASL perfusion MRI data, achieving a higher mean PSNR (45.15 for 3D, 33.67 for 2D) and SSIM (0.99 for 3D, 0.96 for 2D), indicating superior image quality and closer resemblance to the reference image.
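PSNR and SSIM are the standard fidelity metrics behind these numbers. A minimal NumPy sketch of both (the SSIM here is a single global window for illustration; reported SSIM values are typically computed over local windows, e.g. as in scikit-image's `structural_similarity`):

```python
import numpy as np

def psnr(x, y, data_range=1.0):
    """Peak signal-to-noise ratio in dB: 10*log10(MAX^2 / MSE)."""
    mse = np.mean((x - y) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

def ssim_global(x, y, data_range=1.0):
    """Global (single-window) SSIM with the standard stability constants."""
    C1 = (0.01 * data_range) ** 2
    C2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + C1) * (2 * cov + C2)) / \
           ((mx ** 2 + my ** 2 + C1) * (vx + vy + C2))
```

For example, a uniform error of 0.1 on a [0, 1]-scaled image gives an MSE of 0.01 and hence a PSNR of 20 dB; the reported 45.15 dB (3D) corresponds to a far smaller residual against the reference CBF map.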
Journal introduction:
The Visual Computer publishes articles on all research fields of capturing, recognizing, modelling, analysing and generating shapes and images. It includes image understanding, machine learning for graphics and 3D fabrication.
3D Reconstruction
Computer Animation
Computational Fabrication
Computational Geometry
Computational Photography
Computer Vision for Computer Graphics
Data Compression for Graphics
Geometric Modelling
Geometric Processing
HCI and Computer Graphics
Human Modelling
Image Analysis
Image Based Rendering
Image Processing
Machine Learning for Graphics
Medical Imaging
Pattern Recognition
Physically Based Modelling
Illumination and Rendering Methods
Robotics and Vision
Saliency Methods
Scientific Visualization
Shape and Surface Modelling
Shape Analysis and Image Retrieval
Shape Matching
Sketch-based Modelling
Solid Modelling
Stylized Rendering
Textures
Virtual and Augmented Reality
Visual Analytics
Volume Rendering
All papers are subject to thorough review and, if accepted, will be revised accordingly.
Original contributions describing advances in theory in the above-mentioned fields, as well as practical results and original applications, are invited.