SNAFusion-MM: Distilling sparse sampled measurements by 2D axial diffusion priors with multi-step matching for 3D inverse problem
Xiaoyue Li, Tielong Cai, Jun Dan, Sizhao Ma, Kai Shang, Mark D. Butala, Gaoang Wang
Information Fusion, Volume 124, Article 103323
Publication date: 2025-05-26
DOI: 10.1016/j.inffus.2025.103323
URL: https://www.sciencedirect.com/science/article/pii/S1566253525003963
Impact factor: 14.7; JCR: Q1 (Computer Science, Artificial Intelligence)
Citations: 0
Abstract
Reconstructing 3D volumes with inner details from sparse measurements remains a critical challenge in Computed Tomography (CT) and Magnetic Resonance Imaging (MRI). Existing data-driven 3D decoders suffer from limited generalizability, and most recent diffusion models remain restricted to 2D domains because prohibitive memory demands hinder their use in 3D. Although implicit neural rendering (INR) methods build 3D representations, they frequently struggle to maintain reconstruction fidelity under extremely sparse-view conditions. We propose SNAFusion-MM, a framework that unifies 2D axial diffusion priors, geometric constraints from physical operators, and a multi-step distillation strategy for sparsely measured 3D medical reconstruction. Unlike conventional score distillation sampling (SDS), our framework distills robust prior knowledge from a pre-trained 2D prior along deterministic DDIM trajectories and incorporates plug-and-play geometric information about the measurement process to refine a globally coherent 3D neural radiance field. This eliminates the over-smoothing artifacts of single-step SDS while preserving 3D consistency and anatomical details. We conducted experiments on challenging in-distribution and out-of-distribution datasets on a single GPU without any retraining. Quantitative and qualitative assessments demonstrate that SNAFusion-MM outperforms recent works and generalizes better, particularly on extremely sparse-view cone-beam CT (CBCT), X-ray novel-view synthesis (NVS) from sparsely sampled CBCT, and radially sampled compressed sensing MRI (CS-MRI), tasks that state-of-the-art (SOTA) methods cannot yet handle well.
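The abstract contrasts single-step SDS with distillation along a deterministic DDIM trajectory. The sketch below illustrates only the generic deterministic DDIM update (eta = 0) applied over several steps to one 2D slice; it is not the authors' implementation. The function names, the linear beta schedule, the timestep list, and the oracle noise predictor (which stands in for a pre-trained 2D diffusion model) are all assumptions made for illustration. The physical measurement operator and the neural field are left out entirely.

```python
import numpy as np

def make_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    # Assumed linear beta schedule; returns cumulative products of (1 - beta).
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    # Deterministic DDIM update (eta = 0): estimate the clean image,
    # then re-noise it to the previous noise level with the same eps.
    x0_pred = (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_pred) / np.sqrt(alpha_bar_t)
    x_prev = np.sqrt(alpha_bar_prev) * x0_pred + np.sqrt(1.0 - alpha_bar_prev) * eps_pred
    return x_prev, x0_pred

def multi_step_denoise(x_t, timesteps, alpha_bars, eps_model):
    # `timesteps` is a decreasing list whose first entry is the current step.
    # After the last entry the sample is taken to the noise-free level.
    x = x_t
    for i, t in enumerate(timesteps):
        ab_t = alpha_bars[t]
        ab_prev = alpha_bars[timesteps[i + 1]] if i + 1 < len(timesteps) else 1.0
        eps_pred = eps_model(x, t)
        x, _ = ddim_step(x, eps_pred, ab_t, ab_prev)
    return x

# Toy demonstration with a hypothetical oracle that knows the clean slice x0.
rng = np.random.default_rng(0)
x0 = rng.standard_normal((8, 8))
alpha_bars = make_alpha_bars()
t0 = 600
noise = rng.standard_normal(x0.shape)
x_t = np.sqrt(alpha_bars[t0]) * x0 + np.sqrt(1.0 - alpha_bars[t0]) * noise

def oracle_eps(x, t):
    # Stand-in for a trained noise predictor: returns the exact noise in x.
    return (x - np.sqrt(alpha_bars[t]) * x0) / np.sqrt(1.0 - alpha_bars[t])

x_hat = multi_step_denoise(x_t, [600, 400, 200, 0], alpha_bars, oracle_eps)
```

With a perfect noise predictor, the multi-step trajectory returns the clean slice up to floating-point error. With a real, imperfect predictor, each step's output becomes the input to the next prediction, so the estimate is refined along the trajectory instead of coming from one forward pass; the abstract's contrast with single-step SDS rests on multi-step matching of this general kind.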
About the journal:
Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.