Contrastive Graph Modeling for Cross-Domain Few-Shot Medical Image Segmentation
Yuntian Bo, Tao Zhou, Zechao Li, Haofeng Zhang, Ling Shao
IEEE Transactions on Medical Imaging, pp. 2173-2186 (2026-05-01). DOI: 10.1109/TMI.2025.3649239

Cross-domain few-shot medical image segmentation (CD-FSMIS) offers a promising and data-efficient solution for medical applications where annotations are severely scarce and multimodal analysis is required. However, existing methods typically filter out domain-specific information to improve generalization, which inadvertently limits cross-domain performance and degrades source-domain accuracy. To address this, we present Contrastive Graph Modeling (C-Graph), a framework that leverages the structural consistency of medical images as a reliable domain-transferable prior. We represent image features as graphs, with pixels as nodes and semantic affinities as edges. A Structural Prior Graph (SPG) layer is proposed to capture and transfer target-category node dependencies and enable global structure modeling through explicit node interactions. Building upon SPG layers, we introduce a Subgraph Matching Decoding (SMD) mechanism that exploits semantic relations among nodes to guide prediction. Furthermore, we design a Confusion-minimizing Node Contrast (CNC) loss to mitigate node ambiguity and subgraph heterogeneity by contrastively enhancing node discriminability in the graph space. Our method significantly outperforms prior CD-FSMIS approaches across multiple cross-domain benchmarks, achieving state-of-the-art performance while simultaneously preserving strong segmentation accuracy on the source domain. Our code is available at https://github.com/primebo1/C-Graph.
Benchmark of Segmentation Techniques for Pelvic Fracture in CT and X-Ray: Summary of the PENGWIN 2024 Challenge
Yudi Sang, Yanzhen Liu, Sutuke Yibulayimu, Yunning Wang, Benjamin D Killeen, Mingxu Liu, Ping-Cheng Ku, Ole Johannsen, Karol Gotkowski, Maximilian Zenk, Klaus Maier-Hein, Fabian Isensee, Peiyan Yue, Yi Wang, Haidong Yu, Zhaohong Pan, Yutong He, Xiaokun Liang, Daiqi Liu, Fuxin Fan, Artur Jurgas, Andrzej Skalski, Yuxi Ma, Jing Yang, Szymon Plotka, Rafal Litka, Gang Zhu, Yingchun Song, Mathias Unberath, Mehran Armand, Dan Ruan, S Kevin Zhou, Qiyong Cao, Chunpeng Zhao, Xinbao Wu, Yu Wang
IEEE Transactions on Medical Imaging, pp. 2212-2228 (2026-05-01). DOI: 10.1109/TMI.2025.3650126

The segmentation of pelvic fracture fragments in CT and X-ray images is crucial for trauma diagnosis, surgical planning, and intraoperative guidance. However, accurately and efficiently delineating the bone fragments remains a significant challenge due to complex anatomy and imaging limitations. The PENGWIN challenge, organized as a MICCAI 2024 satellite event, aimed to advance automated fracture segmentation by benchmarking state-of-the-art algorithms on these complex tasks. A diverse dataset of 150 CT scans was collected from multiple clinical centers, and a large set of simulated X-ray images was generated using the DeepDRR method. Final submissions from 16 teams worldwide were evaluated under a rigorous multi-metric testing scheme. The top-performing CT algorithm achieved an average fragment-wise intersection over union (IoU) of 0.930, demonstrating satisfactory accuracy. However, in the X-ray task, the best algorithm achieved an IoU of 0.774, which is promising but not yet sufficient for intraoperative decision-making, reflecting the inherent challenges of fragment overlap in projection imaging. Beyond the quantitative evaluation, the challenge revealed methodological diversity in algorithm design. Variations in instance representation, such as primary-secondary classification versus boundary-core separation, led to differing segmentation strategies. Despite promising results, the challenge also exposed inherent uncertainties in fragment definition, particularly in cases of incomplete fractures. These findings suggest that interactive segmentation approaches, integrating human decision-making with task-relevant information, may be essential for improving model reliability and clinical applicability.
NeeCo: Image Synthesis of Novel Instrument States Based on Dynamic and Deformable 3-D Gaussian Reconstruction
Tianle Zeng, Junlei Hu, Gerardo Loza Galindo, Sharib Ali, Duygu Sarikaya, Pietro Valdastri, Dominic Jones
IEEE Transactions on Medical Imaging, pp. 2100-2112 (2026-05-01). DOI: 10.1109/TMI.2025.3648299

Computer vision-based technologies significantly enhance surgical automation by advancing tool tracking, detection, and localization. However, current data-driven approaches are data-voracious, requiring large, high-quality labeled image datasets. Our work introduces a novel dynamic Gaussian splatting technique to address data scarcity in surgical image datasets. We propose a dynamic Gaussian model to represent dynamic surgical scenes, enabling the rendering of surgical instruments from unseen viewpoints and deformations with real tissue backgrounds. We utilize a dynamic training adjustment strategy to address challenges posed by poorly calibrated camera poses from real-world scenarios. Additionally, we automatically generate annotations for our synthetic data. For evaluation, we constructed a new dataset featuring seven scenes with 14,000 frames of tool and camera motion and tool jaw articulation, with a background of an ex-vivo porcine model. Using this dataset, we synthetically replicate the scene deformation from the ground truth data, allowing direct comparisons of synthetic image quality. Experimental results illustrate that our method generates photo-realistic labeled image datasets with the highest PSNR (29.87). We further evaluate the performance of medical-specific neural networks trained on real and synthetic images using an unseen real-world image dataset. Our results show that models trained on synthetic images generated by the proposed method outperform those trained with state-of-the-art standard data augmentation by 10%, leading to an overall improvement in model performance of nearly 15%.
Multi-view Hilbert Curve-based Hierarchical Information Aggregation for Incomplete Multimodal Alzheimer's Disease Diagnosis
Chengliang Liu, Yuanxi Que, Wai Keung Wong, Yabo Liu, Xiaoling Luo
IEEE Transactions on Medical Imaging (2026-04-30). DOI: 10.1109/TMI.2026.3689332

Timely identification of Alzheimer's disease (AD) benefits from combining neuroimaging, fluid biomarkers, and cognitive assessments, yet in practice one or more modalities are often unavailable due to various factors such as cost, patient compliance, and procedural risks. Furthermore, conventional convolutional neural network (CNN) architectures and even Transformer-based models struggle to efficiently capture both local and global dependencies, especially when dealing with high-dimensional and highly heterogeneous medical data. In this study, we introduce a novel hierarchical information aggregation and dynamic fusion (HI-AD) framework for incomplete multi-modal AD diagnosis. Our method couples a multi-view Hilbert curve-guided Mamba block with hierarchical spatial feature extraction to retain spatial continuity, model long-range dependencies, and integrate local context in neuroimaging data. To balance semantic alignment and modality-specific information, we propose a unified mutual information-driven learning objective with an active confidence evaluation mechanism, thereby preventing modality collapse and promoting robust representation learning. Extensive experiments on real-world datasets validate that our HI-AD framework consistently outperforms existing state-of-the-art methods across a diverse range of modality-missing scenarios, establishing an effective and generalizable solution for early-stage AD screening in heterogeneous clinical data environments.
{"title":"Integrating Anatomical Priors into a Causal Diffusion Model.","authors":"Binxu Li, Wei Peng, Mingjie Li, Ehsan Adeli, Kilian M Pohl","doi":"10.1109/TMI.2026.3688515","DOIUrl":"https://doi.org/10.1109/TMI.2026.3688515","url":null,"abstract":"<p><p>3D brain MRI studies often examine subtle morphometric differences between cohorts that are hard to detect visually. Given the high cost of MRI acquisition, these studies could greatly benefit from image syntheses, particularly counterfactual image generation, as has been the case for applications in computer vision. However, counterfactual models struggle to produce anatomically plausible MRIs due to a lack of explicit inductive biases to preserve fine-grained anatomical details. This short-coming arises from the training of models that optimize overall image appearance (e.g., via cross-entropy) rather than preserving subtle, yet medically relevant, local variations across subjects. To preserve subtle variations, we propose to explicitly integrate anatomical constraints at the voxel level as priors into a generative diffusion framework. Termed Probabilistic Causal Graph Model (PCGM), the approach captures anatomical constraints via a probabilistic graph module and translates those constraints into spatial binary masks of regions where subtle variations occur. The masks (encoded by a 3D ControlNet) constrain a novel counterfactual denoising UNet, whose encodings are then transferred into high-quality brain MRIs via our 3D diffusion decoder. Extensive experiments across multiple datasets demonstrate that PCGM generates structural brain MRIs of higher quality than several baseline approaches. Furthermore, we show, for the first time, that brain measurements extracted from counterfactuals (generated by PCGM) replicate the subtle effects of a disease on cortical brain regions previously reported in the neuroscience literature. This achievement is an important milestone in the use of synthetic MRIs in studies investigating subtle morphological differences. The codes are available at https://github.com/AndyCA111/PCGM.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2026-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147793025","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Interpretable Multimodal Learning for Cardiovascular Hemodynamics Assessment
Prasun C Tripathi, Sina Tabakhi, Mohammod N I Suvon, Lawrence Schob, Samer Alabed, Andrew J Swift, Shuo Zhou, Haiping Lu
IEEE Transactions on Medical Imaging (2026-04-08). DOI: 10.1109/TMI.2026.3681722

Pulmonary Arterial Wedge Pressure (PAWP) is an essential cardiovascular hemodynamics marker for detecting heart failure. In clinical practice, Right Heart Catheterization is considered the gold standard for assessing cardiac hemodynamics, while non-invasive methods are often needed to screen high-risk patients from a large population. In this paper, we propose a multimodal learning pipeline to predict the PAWP marker. We utilize complementary information from Cardiac Magnetic Resonance (CMR) scans (short-axis and four-chamber) and Electronic Health Records (EHRs). We extract spatio-temporal features from CMR scans using tensor-based learning. We propose a graph attention network to select important EHR features for prediction, where we model subjects as graph nodes and feature relationships as graph edges using the attention mechanism. We design four feature fusion strategies: early, intermediate, late, and hybrid fusion. With a linear classifier and linear fusion strategies, our pipeline is interpretable. We validate our pipeline on a large dataset of 2,641 subjects from our ASPIRE registry. A comparative study against state-of-the-art methods confirms the superiority of our pipeline, and decision curve analysis further validates that it can be applied to screen a large population. The code is available at https://github.com/prasunc/hemodynamics.
Observer-usable Information as a Task-specific Image Quality Metric
Changjie Lu, Sourya Sengupta, Hua Li, Mark A Anastasio
IEEE Transactions on Medical Imaging (2026-04-07). DOI: 10.1109/TMI.2026.3680092

Objective, task-based measures of image quality (IQ) have been widely advocated for assessing and optimizing medical imaging technologies. Besides signal detection theory-based measures, information-theoretic quantities have been proposed to quantify task-based IQ. For example, task-specific information (TSI), defined as the mutual information between an image and a task variable, represents an optimal measure of how informative an image is for performing a specified task. However, like the ideal observer from signal detection theory, TSI does not quantify the amount of task-relevant information in an image that can be exploited by a sub-ideal observer. A recently proposed relaxation of TSI, termed predictive V-information (V-info), removes this limitation and can quantify the utility of an image with consideration of a specified family of sub-ideal observers. In this study, for the first time, we introduce and investigate V-info as an objective, task-specific IQ metric. To corroborate its usefulness, a stylized magnetic resonance image restoration problem is considered in which V-info is employed to quantify signal detection or discrimination performance. The presented experiments show that, for binary classification tasks, V-info varies consistently with the area under the receiver operating characteristic (ROC) curve in regimes where class separability changes with observer capacity or imaging conditions. However, unlike AUC, V-info remains sensitive in regimes where discrimination performance approaches saturation. In addition, V-info is readily applicable to multi-class (> 2) tasks where ROC analysis is less natural. These findings suggest that V-info can serve as a complementary task-based image quality measure alongside traditional signal detection theory-based metrics.
{"title":"A 3D Cross-modal Keypoint Descriptor for MR-US Matching and Registration.","authors":"Daniil Morozov, Reuben Dorent, Nazim Haouchine","doi":"10.1109/TMI.2026.3680352","DOIUrl":"https://doi.org/10.1109/TMI.2026.3680352","url":null,"abstract":"<p><p>Intraoperative registration of real-time ultra-sound (iUS) to preoperative Magnetic Resonance Imaging (MRI) remains an unsolved problem due to severe modality-specific differences in appearance, resolution, and field-of-view. To address this, we propose a novel 3D cross-modal keypoint descriptor for MRI-iUS matching and registration. Our approach employs a patient-specific matching-by-synthesis approach, generating synthetic iUS volumes from preoperative MRI. This enables supervised contrastive training to learn a shared descriptor space. A probabilistic keypoint detection strategy is then employed to identify anatomically salient and modality-consistent locations. During training, a curriculum-based triplet loss with dynamic hard negative mining is used to learn descriptors that are i) robust to iUS artifacts such as speckle noise and limited coverage, and ii) rotation-invariant. At inference, the method detects keypoints in MR and real iUS images and identifies sparse matches, which are then used to perform rigid registration. Our approach is evaluated using 3D MRI-iUS pairs from the ReMIND dataset. Experiments show that our approach outperforms state-of-the-art keypoint matching methods across 11 patients, with an average precision of 69.8%. For image registration, our method achieves a competitive mean Target Registration Error of 2.39 mm on the ReMIND2Reg benchmark. Compared to existing iUS-MR registration approaches, our framework is interpretable, requires no manual initialization, and shows robustness to iUS field-of-view variation. Code, data and model weights are available at https://github.com/ morozovdd/CrossKEY.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"PP ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2026-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147617268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wave-Aware Weakly Supervised Histopathological Tissue Segmentation With Cross-Scale Logits Distillation
Siyang Feng, Hualong Zhang, Xianjing Zhao, Liting Shi, Zhenbing Liu, Rushi Lan, Lei Shi, Xipeng Pan
IEEE Transactions on Medical Imaging, vol. 45, no. 4, pp. 1698-1710 (2026-04-01). DOI: 10.1109/TMI.2025.3637119

Weakly supervised learning based on image-level labels can effectively reduce annotation costs, making it a popular choice for histopathological tissue segmentation. However, this paradigm still faces challenges: 1) inaccurate class activation maps (CAMs) yield pseudo masks of insufficient quality; 2) noisy pixels in pseudo masks mislead the segmentation model's decision-making. To address these problems, we propose a novel weakly supervised semantic segmentation (WSSS) framework. First, we introduce Local Spatial Affine Perturbation to strengthen the model's utilization of weak supervision signals and improve its robustness to noisy regions within CAMs. Second, we propose Wave-aware Dynamic Feature Aggregation to adaptively enhance the information-aware representation of target regions and obtain fine-grained pseudo masks enriched with positive semantic information. Third, we train a segmentation model with a noise-suppression scheme called Cross-scale Logits Distillation to reduce the inevitable false-positive pixels in pseudo masks. We conduct extensive experiments to validate our method and set new state-of-the-art segmentation performance on five histopathological tissue segmentation datasets. Moreover, we introduce a new gastric cancer dataset, GCSS-WSSS, to promote dataset diversity in the computational pathology research community. Code and data will be released at https://github.com/director87/WaWeHis
{"title":"Extractive Radiology Reporting With Memory-Based Cross-Modal Representations","authors":"Yuanhe Tian;Zexuan Yan;Nenan Lyu;Yan Song","doi":"10.1109/TMI.2025.3636868","DOIUrl":"10.1109/TMI.2025.3636868","url":null,"abstract":"Radiology report generation (RRG) produces detailed textual descriptions for radiographs, serving as a crucial task for medical analysis and diagnosis. Most existing RRG approaches naturally follow the multimodal text generation paradigm, where autoregressive models are utilized to perform token-by-token report generation and thus are potentially risky in generating invalid content while being limited in low information processing speed. Although advanced architectures, such as pre-trained models and large language models (LLMs), are applied for RRG and achieve good performance, they still face the aforementioned risk and speed limitation, especially that LLMs may introduce hallucinations. Consider that radiology reports are highly patternized, sentences in them convey specific meanings independently and are frequently reused, we propose a new extractive radiograph reporting (ERR) workflow and design a dedicated framework that efficiently and accurately extracts appropriate sentences from existing radiological cases for report generation. Our approach employs a memory module to store important medical information and enhance the encoding for input radiograph with better cross-modal representations, which are used to match sentences for the extraction process. We conducted experiments on two widely used benchmark datasets, with the results demonstrating that our approach outperforms strong baselines and achieves comparable results with existing state-of-the-art generative models. Analyses further confirm that our ERR approach not only produces reports with reliable content but also ensures high training and inference efficiency.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"45 4","pages":"1686-1697"},"PeriodicalIF":0.0,"publicationDate":"2026-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145599302","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}