{"title":"Spatio-Temporal and Retrieval-Augmented Modeling for Chest X-Ray Report Generation","authors":"Yan Yang;Xiaoxing You;Ke Zhang;Zhenqi Fu;Xianyun Wang;Jiajun Ding;Jiamei Sun;Zhou Yu;Qingming Huang;Weidong Han;Jun Yu","doi":"10.1109/TMI.2025.3554498","DOIUrl":"10.1109/TMI.2025.3554498","url":null,"abstract":"Chest X-ray report generation has attracted increasing research attention. However, most existing methods neglect temporal information and typically generate reports conditioned on a fixed number of images. In this paper, we propose STREAM: Spatio-Temporal and REtrieval-Augmented Modeling for automatic chest X-ray report generation. It mimics clinical diagnosis by integrating current and historical studies to interpret the present condition (temporal), with each study containing multi-view images (spatial). Concretely, our STREAM is built upon an encoder-decoder architecture, utilizing a large language model (LLM) as the decoder. Overall, spatio-temporal visual dynamics are packed as visual prompts and regional semantic entities are retrieved as textual prompts. First, a token packer is proposed to capture condensed spatio-temporal visual dynamics, enabling the flexible fusion of images from current and historical studies. Second, to augment the generation with existing knowledge and regional details, a progressive semantic retriever is proposed to retrieve semantic entities from a preconstructed knowledge bank as heuristic text prompts. The knowledge bank is constructed to encapsulate anatomical chest X-ray knowledge into structured entities, each linked to a specific chest region. Extensive experiments on public datasets have shown the state-of-the-art performance of our method. 
Related code and the knowledge bank are available at <uri>https://github.com/yangyan22/STREAM</uri>.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 7","pages":"2892-2905"},"PeriodicalIF":0.0,"publicationDate":"2025-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143712667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Self-Supervised Feature Learning for Cardiac Cine MR Image Reconstruction","authors":"Siying Xu;Marcel Früh;Kerstin Hammernik;Andreas Lingg;Jens Kübler;Patrick Krumm;Daniel Rueckert;Sergios Gatidis;Thomas Küstner","doi":"10.1109/TMI.2025.3570226","DOIUrl":"10.1109/TMI.2025.3570226","url":null,"abstract":"We propose a self-supervised feature learning assisted reconstruction (SSFL-Recon) framework for MRI reconstruction to address the limitations of existing supervised learning methods. Although recent deep learning-based methods have shown promising performance in MRI reconstruction, most require fully-sampled images for supervised learning, which is challenging in practice considering the long acquisition times under respiratory or organ motion. Moreover, nearly all fully-sampled datasets are obtained from conventional reconstruction of mildly accelerated datasets, thus potentially biasing the achievable performance. Hence, the numerous undersampled datasets with different accelerations acquired in clinical practice remain underutilized. To address these issues, we first train a self-supervised feature extractor on undersampled images to learn sampling-insensitive features. The pre-learned features are subsequently embedded in the self-supervised reconstruction network to assist in removing artifacts. Experiments were conducted retrospectively on an in-house 2D cardiac Cine dataset, including 91 cardiovascular patients and 38 healthy subjects. The results demonstrate that the proposed SSFL-Recon framework outperforms existing self-supervised MRI reconstruction methods and even exhibits performance comparable or superior to supervised learning at up to <inline-formula> <tex-math>${16}\times $ </tex-math></inline-formula> retrospective undersampling. 
The feature learning strategy can effectively extract global representations, which have proven beneficial in removing artifacts and increasing generalization ability during reconstruction.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 9","pages":"3858-3869"},"PeriodicalIF":0.0,"publicationDate":"2025-03-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144130352","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fourier Diffusion Models: A Method to Control MTF and NPS in Score-Based Stochastic Image Generation","authors":"Matthew Tivnan;Jacopo Teneggi;Tzu-Cheng Lee;Ruoqiao Zhang;Kirsten Boedeker;Liang Cai;Grace J. Gang;Jeremias Sulam;J. Webster Stayman","doi":"10.1109/TMI.2025.3553805","DOIUrl":"10.1109/TMI.2025.3553805","url":null,"abstract":"Score-based diffusion models are new and powerful tools for image generation. They are based on a forward stochastic process in which an image is degraded with additive white noise and optional input scaling. A neural network can be trained to estimate the time-dependent score function and then used to run the reverse-time stochastic process to generate new samples from the training image distribution. However, one issue is that sampling the reverse process requires many passes of the neural network. In this work, we present Fourier Diffusion Models, which replace the scalar operations of the forward process with linear shift-invariant systems and additive spatially stationary noise. This allows for a model of continuous probability flow from true images to measurements with a specific modulation transfer function (MTF) and noise power spectrum (NPS). We also derive the reverse process for posterior sampling of high-quality images given blurry, noisy measurements. We conducted a computational experiment using the Lung Image Database Consortium dataset of chest CT images and simulated CT measurements with correlated noise and system blur. 
Our results show that Fourier diffusion models can improve image quality for supervised diffusion posterior sampling relative to existing conditional diffusion models.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 9","pages":"3694-3704"},"PeriodicalIF":0.0,"publicationDate":"2025-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143672281","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SATO: Straighten Any 3D Tubular Object","authors":"Yanfeng Zhou;Jiaheng Zhou;Zichen Wang;Ge Yang","doi":"10.1109/TMI.2025.3571896","DOIUrl":"10.1109/TMI.2025.3571896","url":null,"abstract":"3D tubular objects have complex spatial shapes. Direct volume visualization cannot intuitively display their morphological characteristics and surface abnormalities. Straightening reformation is an effective visualization method for tubular objects. It uses a swept frame to sample cross sections along the centerline of the tubular object to generate the straightening result. So far, however, current methods cannot visualize the full 3D view and fail to interface with downstream morphological analysis. Furthermore, current swept frames impose strict restrictions on the shape of tubular objects and are computationally expensive. In this study, we propose a novel swept frame based on vector rotation and construct an automatic straightening reformation pipeline. Our method is applicable to various tubular objects and can be executed efficiently and recursively while ensuring that the straightening results have no rotation bias. Extensive experiments on eight different tubular objects and quantitative evaluation on various downstream applications demonstrate the effectiveness and universality of our straightening pipeline. Code is available at <uri>https://github.com/Yanfeng-Zhou/SATO</uri>.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 9","pages":"3882-3891"},"PeriodicalIF":0.0,"publicationDate":"2025-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144103738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Contrast-Enhanced EIT Robustly Tracks Regional Lung Perfusion Compared to Non-Enhanced EIT and Pulmonary CT","authors":"Diogo F. Silva;Sebastian Reinartz;Thomas Muders;Karin Wodack;Christian Putensen;Robert Siepmann;Benjamin Hentze;Steffen Leonhardt","doi":"10.1109/TMI.2025.3552037","DOIUrl":"10.1109/TMI.2025.3552037","url":null,"abstract":"Regional perfusion monitoring, often performed by pulmonary perfusion computed tomography, is vital in intensive care units. Electrical impedance tomography, being repeatable and non-invasive, could provide an attractive alternative. This study compares non-enhanced and contrast-enhanced electrical impedance tomography to computed tomography under induced central to peripheral lung perfusion impairments and cardiac output modulation in 11 animals. A new algorithmic framework using multi-compartment modeling and tracer kinetics was developed to improve perfusion estimation. A multi-resolution mixed-models analysis shows that electrical impedance tomography agrees poorly with computed tomography in static monitoring, with limits of agreement exceeding relative errors of 100%. For trend tracking, contrast enhancement with 5.85% NaCl yielded concordance rates above 80%, and over 90% for peripheral impairments, emerging as a robust tracker of coarse-to-fine perfusion changes. 
Concordance rates for non-enhanced electrical impedance tomography peaked around 60% under central impairment and cardiac modulation, proving it less reliable.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 9","pages":"3683-3693"},"PeriodicalIF":0.0,"publicationDate":"2025-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143661337","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High Sensitivity Photoacoustic Imaging by Learning From Noisy Data","authors":"Xu Tang;Jiangbo Chen;Zheng Qu;Jingyi Zhu;Mohammadreza Amjadian;Mingxuan Yang;Yingpeng Wan;Lidai Wang","doi":"10.1109/TMI.2025.3552692","DOIUrl":"10.1109/TMI.2025.3552692","url":null,"abstract":"Photoacoustic imaging (PAI) is a high-resolution biomedical imaging technology for the non-invasive detection of a broad range of chromophores at multiple scales and depths. However, owing to low chromophore concentrations, weak signals in deep tissue, or various noise sources, the signal-to-noise ratio of photoacoustic images may be compromised in many biomedical applications. Although improvements in hardware and computational methods have been made to address this problem, they have not been readily available due to either high costs or an inability to generalize across different datasets. Here, we present a self-supervised deep learning method to increase the signal-to-noise ratio of photoacoustic images using noisy data only. Because this method does not require expensive ground truth data for training, it can be easily implemented across scanning microscopic and computed tomographic data acquired with various photoacoustic imaging systems. In vivo results show that our method renders vascular details that were completely submerged in noise clearly visible, increases the signal-to-noise ratio by up to 12-fold, doubles the imaging depth, and enables high-contrast imaging of deep tumors. 
We believe this method can be readily applied to many preclinical and clinical applications.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 7","pages":"2868-2877"},"PeriodicalIF":0.0,"publicationDate":"2025-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143661336","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Replace2Self: Self-Supervised Denoising Based on Voxel Replacing and Image Mixing for Diffusion MRI","authors":"Linhai Wu;Lihui Wang;Zeyu Deng;Yuemin Zhu;Hongjiang Wei","doi":"10.1109/TMI.2025.3552611","DOIUrl":"10.1109/TMI.2025.3552611","url":null,"abstract":"A low signal-to-noise ratio (SNR) remains one of the limitations of diffusion-weighted (DW) imaging, and suppressing the influence of noise on subsequent analysis of tissue microstructure is still challenging. This work proposed a novel self-supervised learning model, Replace2Self, to effectively reduce spatially correlated noise in DW images. Specifically, a voxel replacement strategy based on similar-block matching in Q-space was proposed to destroy the correlations of noise in the DW image along one diffusion gradient direction. To alleviate the signal gap caused by the voxel replacement, an image mixing strategy based on a complementary mask was designed to generate two different noisy DW images. After that, these two noisy DW images were taken as input, the non-correlated noisy DW image after voxel replacement was taken as the learning target, and a denoising network was trained for denoising. To promote the denoising performance, a complementary mask mixing consistency loss and an inverse replacement regularization loss were also proposed. Through comparisons against several existing DW image denoising methods on extensive simulated data with different noise distributions, noise levels, and b-values, as well as on acquired datasets, together with ablation experiments, we verified the effectiveness of the proposed method. Regardless of the noise distribution and noise level, the proposed method achieved the highest PSNR, which was at least 1.9% higher than that of the second-best method when the noise level reached 10%. 
Furthermore, our method exhibits superior generalization ability owing to the proposed strategies.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 7","pages":"2878-2891"},"PeriodicalIF":0.0,"publicationDate":"2025-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143660090","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Blood Oxygenation Quantification in Multispectral Photoacoustic Tomography Using a Convex Cone Approach","authors":"Chuhua Wu;Hongzhi Zuo;Manxiu Cui;Handi Deng;Yuwen Chen;Xuanhao Wang;Bangyan Wang;Cheng Ma","doi":"10.1109/TMI.2025.3551744","DOIUrl":"10.1109/TMI.2025.3551744","url":null,"abstract":"Multispectral photoacoustic tomography (PAT) can create high spatial and temporal resolution images of oxygen saturation (sO2) distribution in deep tissue. However, unknown distributions of photon absorption and scattering introduce complex modulations to the photoacoustic (PA) spectra, dramatically reducing the accuracy of sO2 quantification. In this study, a rigorous light transport model was employed to unveil that the PA spectra corresponding to distinct sO2 values can be constrained within separate convex cones (CCs). Based on the CC model, sO2 estimation is achieved by identifying the CC nearest to the measured data through a modified Gilbert-Johnson-Keerthi (GJK) algorithm. The CC method combines a rigorous physical model with a data-driven approach, and shows outstanding robustness in numerical, phantom, and in vivo imaging experiments validated against ground truth measurements. The average sO2 estimation error is only approximately 3% in in vivo human experiments, underscoring its potential for clinical application. 
All of our code and data are publicly available on GitHub.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 7","pages":"2842-2853"},"PeriodicalIF":0.0,"publicationDate":"2025-03-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143652826","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Uncertainty Co-Estimator for Improving Semi-Supervised Medical Image Segmentation","authors":"Xiang Zeng;Shengwu Xiong;Jinming Xu;Guangxing Du;Yi Rong","doi":"10.1109/TMI.2025.3570310","DOIUrl":"10.1109/TMI.2025.3570310","url":null,"abstract":"Recently, combining the strategy of consistency regularization with uncertainty estimation has shown promising performance on semi-supervised medical image segmentation tasks. However, most existing methods estimate the uncertainty solely based on the outputs of a single neural network, which results in imprecise uncertainty estimations and eventually degrades the segmentation performance. In this paper, we propose a novel Uncertainty Co-estimator (UnCo) framework to deal with this problem. Inspired by the co-training technique, UnCo establishes two different mean-teacher modules (i.e., two pairs of teacher and student models), and estimates three types of uncertainty from the multi-source predictions generated by these models. By combining these uncertainties, their differences help to filter out errors in each individual estimate, thus allowing the final fused uncertainty maps to be more accurate. These resulting maps are then used to enhance a cross-consistency regularization imposed between the two modules. In addition, UnCo designs an internal consistency regularization within each module, so that the student models can aggregate diverse feature information from both modules, thus promoting the semi-supervised segmentation performance. Finally, an adversarial constraint is introduced to maintain model diversity. Experimental results on four medical image datasets indicate that UnCo can achieve new state-of-the-art performance on both 2D and 3D semi-supervised segmentation tasks. 
The source code will be available at <uri>https://github.com/z1010x/UnCo</uri>.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 9","pages":"3870-3881"},"PeriodicalIF":0.0,"publicationDate":"2025-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144065789","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DenseFormer-MoE: A Dense Transformer Foundation Model with Mixture of Experts for Multi-Task Brain Image Analysis.","authors":"Rizhi Ding;Hui Lu;Manhua Liu","doi":"10.1109/TMI.2025.3551514","DOIUrl":"10.1109/TMI.2025.3551514","url":null,"abstract":"<p><p>Deep learning models have been widely investigated for computing and analyzing brain images across various downstream tasks such as disease diagnosis and age regression. Most existing models are tailored for specific tasks and diseases, posing a challenge in developing a foundation model for diverse tasks. This paper proposes a Dense Transformer Foundation Model with Mixture of Experts (DenseFormer-MoE), which integrates a dense convolutional network, a Vision Transformer, and a Mixture of Experts (MoE) to progressively learn and consolidate local and global features from T1-weighted magnetic resonance images (sMRI) for multiple tasks, including diagnosing multiple brain diseases and predicting brain age. First, a foundation model is built by combining the Vision Transformer with DenseNet, pre-trained with a Masked Autoencoder and self-supervised learning to enhance the generalization of feature representations. Then, to mitigate optimization conflicts in multi-task learning, MoE is designed to dynamically select the most appropriate experts for each task. Finally, our method is evaluated on multiple renowned brain imaging datasets including UK Biobank (UKB), Alzheimer's Disease Neuroimaging Initiative (ADNI), and Parkinson's Progression Markers Initiative (PPMI). 
Experimental results and comparisons demonstrate that our method achieves promising performance in brain age prediction and brain disease diagnosis.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2025-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143631136","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}