{"title":"Few-Shot Learning for Annotation-Efficient Nucleus Instance Segmentation","authors":"Yu Ming;Zihao Wu;Jie Yang;Danyi Li;Yuan Gao;Changxin Gao;Gui-Song Xia;Yuanqing Li;Li Liang;Jin-Gang Yu","doi":"10.1109/TMI.2025.3564458","DOIUrl":"10.1109/TMI.2025.3564458","url":null,"abstract":"Nucleus instance segmentation from histopathology images suffers from the extremely laborious and expert-dependent annotation of nucleus instances. As a promising solution to this task, annotation-efficient deep learning paradigms have recently attracted much research interest, such as weakly-/semi-supervised learning, generative adversarial learning, etc. In this paper, we propose to formulate annotation-efficient nucleus instance segmentation from the perspective of few-shot learning (FSL). Our work was motivated by that, with the prosperity of computational pathology, an increasing number of fully-annotated datasets are publicly accessible, and we hope to leverage these external datasets to assist nucleus instance segmentation on the target dataset which only has very limited annotation. To achieve this goal, we adopt the meta-learning based FSL paradigm, which however has to be tailored in two substantial aspects before adapting to our task. First, since the novel classes may be inconsistent with those of the external dataset, we extend the basic definition of few-shot instance segmentation (FSIS) to generalized few-shot instance segmentation (GFSIS). Second, to cope with the intrinsic challenges of nucleus segmentation, including touching between adjacent cells, cellular heterogeneity, etc., we further introduce a structural guidance mechanism into the GFSIS network, finally leading to a unified Structurally-Guided Generalized Few-Shot Instance Segmentation (SGFSIS) framework. Extensive experiments on a couple of publicly accessible datasets demonstrate that, SGFSIS can outperform other annotation-efficient learning baselines, including semi-supervised learning, simple transfer learning, etc., with comparable performance to fully supervised learning with around 10% annotations.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 8","pages":"3311-3322"},"PeriodicalIF":0.0,"publicationDate":"2025-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143875722","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust Deep Convolutional Dictionary Model With Alignment Assistance for Multi-Contrast MRI Super-Resolution","authors":"Pengcheng Lei;Miaomiao Zhang;Faming Fang;Guixu Zhang","doi":"10.1109/TMI.2025.3563523","DOIUrl":"10.1109/TMI.2025.3563523","url":null,"abstract":"Multi-contrast magnetic resonance imaging (MCMRI) super-resolution (SR) methods aims to leverage the complementary information present in multi-contrast images. However, existing methods encounter several limitations. Firstly, most current networks fail to appropriately model the correlations of multi-contrast images and lack certain interpretability. Secondly, they often overlook the negative impact of spatial misalignment between modalities in clinical practice. Thirdly, existing methods do not effectively constrain the complementary information learned between multi-contrast images, resulting in information redundancy and limiting their model performance. In this paper, we propose a robust alignment-assisted multi-contrast convolutional dictionary (A2-CDic) model to address these challenges. Specifically, we develop an observation model based on convolutional sparse coding to explicitly represent multi-contrast images as common (e.g., consistent textures) and unique (e.g., inconsistent structures and contrasts) components. Considering there are spatial misalignments in real-world multi-contrast images, we incorporate a spatial alignment module to compensate for the misaligned structures. This approach enables the proposed model to fully exploit the valuable information in the reference image while mitigating interference from inconsistent information. We employ the proximal gradient algorithm to optimize the model and unroll the iterative steps into a multi-scale convolutional dictionary network. Furthermore, we utilize mutual information losses to constrain the extracted common and unique components. This constraint reduces the redundancy between the decomposed components, allowing each sub-module to learn more representative features. We evaluate our model on four publicly available datasets comprising internal, external, spatially aligned, and misaligned MCMRI images. The experimental results demonstrate that our model surpasses existing state-of-the-art MCMRI SR methods in terms of both generalization ability and overall performance. Code is available at <uri>https://github.com/lpcccc-cv/A2-CDic</uri>.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 8","pages":"3383-3396"},"PeriodicalIF":0.0,"publicationDate":"2025-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143866853","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal Transport and Central Moment Consistency Regularization for Semi-Supervised Medical Image Segmentation","authors":"Xiuzhen Guo;Lianyuan Yu;Ji Shi;Hongxiao Wang;Jiangyuan Zhao;Rongguo Zhang;Hongwei Li;Na Lei","doi":"10.1109/TMI.2025.3563500","DOIUrl":"10.1109/TMI.2025.3563500","url":null,"abstract":"Semi-supervised learning leverages insights from unlabeled data to enhance generalizability of the model, thereby decreasing the dependence on extensive labeled datasets. Most existing methods overly focus on local representations while neglecting the learning of global structures. On the one hand, given that labeled and unlabeled images are presumed to originate from the same distribution, it is probable that similar regional features observed in both types of images correspond to the same label. Current label propagation techniques, which predominantly propagate label information through the construction of graph structures or similarity matrices, heavily depend on localized information and are prone to converge to local optima. In contrast, optimal transport considers the entire distribution. This facilitates more comprehensive and efficient label propagation. On the other hand, current consistency regularization-based methods focus on the local view, we believe learning from a global geometric view may capture more information. Geometric moment information of the sample itself can constrain the overall geometric structure. Inspired by these observations, this paper introduces a semi-supervised medical image segmentation framework that integrates optimal transport and central moment consistency regularization (OTCMC) from a global perspective. Firstly, we pass label information from labeled data to unlabeled data by optimal transport. Secondly, we incorporate central moment consistency regularization to focus the network on the geometric structure of images. Our method achieves the state-of-the-art (SOTA) performance on a series of datasets, including the NIH pancreas, left atrium, brain tumor, and skin lesion dermoscopy datasets.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 8","pages":"3397-3409"},"PeriodicalIF":0.0,"publicationDate":"2025-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143866852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Flexible Individualized Developmental Prediction of Infant Cortical Surface Maps via Intensive Triplet Autoencoder","authors":"Xinrui Yuan;Jiale Cheng;Fenqiang Zhao;Zhengwang Wu;Li Wang;Weili Lin;Yu Zhang;Ruiyuan Liu;Gang Li","doi":"10.1109/TMI.2025.3562003","DOIUrl":"10.1109/TMI.2025.3562003","url":null,"abstract":"Computational methods for prediction of the dynamic and complex development of the infant cerebral cortex are critical and highly desired for a better understanding of early brain development in health and disease. Although a few methods have been proposed, they are limited to predicting cortical surface maps at predefined ages and require a large amount of strictly paired longitudinal data at these ages for model training. However, longitudinal infant images are typically acquired at highly irregular and nonuniform scanning ages, thus leading to limited training data for these methods and low flexibility and accuracy. To address these issues, we propose a flexible framework for individualized prediction of cortical surface maps at arbitrary ages during infancy. The central idea is that a cortical surface map can be considered as an entangled representation of two distinct components: 1) the identity-related invariant features, which preserve the individual identity and 2) the age-related features, which reflect the developmental patterns. Our framework, called intensive triplet autoencoder, extracts the mixed latent feature and further disentangles it into two components with an attention-based module. Identity recognition and age estimation tasks are introduced as supervision for a reliable disentanglement. Thus, we can obtain the target individualized cortical property maps with disentangled identity-related information with specific age-related information. Moreover, an adversarial learning strategy is integrated to achieve a vivid and realistic prediction. Extensive experiments validate our method’s superior capability in predicting early developing cortical surface maps flexibly and precisely, in comparison with existing methods.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 7","pages":"3110-3122"},"PeriodicalIF":0.0,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143857708","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"QMix: Quality-Aware Learning With Mixed Noise for Robust Retinal Disease Diagnosis","authors":"Junlin Hou;Jilan Xu;Rui Feng;Hao Chen","doi":"10.1109/TMI.2025.3562614","DOIUrl":"10.1109/TMI.2025.3562614","url":null,"abstract":"Due to the complex nature of medical image acquisition and annotation, medical datasets inevitably contain noise. This adversely affects the robustness and generalization of deep neural networks. Previous noise learning methods mainly considered noise arising from images being mislabeled, i.e., label noise, assuming all mislabeled images were of high quality. However, medical images can also suffer from severe data quality issues, i.e., data noise, where discriminative visual features for disease diagnosis are missing. In this paper, we propose QMix, a noise learning framework that learns a robust disease diagnosis model under mixed noise scenarios. QMix alternates between sample separation and quality-aware semi-supervised training in each epoch. The sample separation phase uses a joint uncertainty-loss criterion to effectively separate <xref>(1)</xref> correctly labeled images, <xref>(2)</xref> mislabeled high-quality images, and <xref>(3)</xref> mislabeled low-quality images. The semi-supervised training phase then learns a robust disease diagnosis model from the separated samples. Specifically, we propose a sample-reweighing loss to mitigate the effect of mislabeled low-quality images during training, and a contrastive enhancement loss to further distinguish them from correctly labeled images. QMix achieved state-of-the-art performance on six public retinal image datasets and exhibited significant improvements in robustness against mixed noise. Code will be available upon acceptance.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 8","pages":"3345-3355"},"PeriodicalIF":0.0,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143857707","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Photoacoustic-Strain (PAS) Imaging for Tissue Microcirculation Assessment","authors":"Xinyue Huang;David Qin;Samuel Morais;Xing Long;Stanislav Emelianov","doi":"10.1109/TMI.2025.3562141","DOIUrl":"10.1109/TMI.2025.3562141","url":null,"abstract":"Microcirculation facilitates the exchange of gases, nutrients, and waste products between blood and tissues and is critical for maintaining systemic tissue health. In this study, we introduce photoacoustic-strain (PAS) imaging, a non-invasive method for assessing tissue microcirculation. By combining externally applied deformation while performing ultrasound/photoacoustic imaging, PAS enables real-time monitoring of blood volume changes as blood is displaced from the tissue under pressure and subsequently refilled upon pressure release. Through a series of post-processing steps, spatially resolved maps of blood displacement and reperfusion rates are reconstructed, providing insights into microcirculatory dynamics. Using both in silico and in vivo experiments, PAS imaging was developed and validated, demonstrating its capability to detect and differentiate changes in microcirculation under various conditions, including vascular occlusion and tissue at different temperatures, with high sensitivity and strong robustness. These findings underscore the potential of PAS imaging as a non-invasive tool for understanding and diagnosing diseases associated with microcirculatory function.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 8","pages":"3299-3310"},"PeriodicalIF":0.0,"publicationDate":"2025-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143849724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Vertex Correspondence and Self-Intersection Reduction in Cortical Surface Reconstruction","authors":"Anne-Marie Rickmann;Fabian Bongratz;Christian Wachinger","doi":"10.1109/TMI.2025.3562443","DOIUrl":"10.1109/TMI.2025.3562443","url":null,"abstract":"Mesh-based cortical surface reconstruction is essential for neuroimaging, enabling precise measurements of brain morphology such as cortical thickness. Establishing vertex correspondence between individual cortical meshes and group templates allows vertex-level comparisons, but traditional methods require time-consuming post-processing steps to achieve vertex correspondence. While deep learning has improved accuracy in cortical surface reconstruction, optimizing vertex correspondence has not been the focus of prior work. We introduce Vox2Cortex with Correspondence (V2CC), an extension of Vox2Cortex, which replaces the commonly used Chamfer loss with L1 loss on registered surfaces. This approach improves inter- and intra-subject correspondence, which makes it suitable for direct group comparisons and atlas-based parcellation. Additionally, we analyze mesh self-intersections, categorizing them into minor (neighboring faces) and major (non-neighboring faces) types.To address major self-intersections, which are not effectively handled by standard regularization losses, we propose a novel Self-Proximity loss, designed to adjust non-neighboring vertices within a defined proximity threshold. Comprehensive evaluations demonstrate that recent deep learning methods inadequately address vertex correspondence, often causing inaccuracies in parcellation. In contrast, our method achieves accurate correspondence and reduces self-intersections to below 1% for both pial and white matter surfaces.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 8","pages":"3258-3269"},"PeriodicalIF":0.0,"publicationDate":"2025-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143849694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Path-Based Model for Aberration Correction in Ultrasound Imaging","authors":"Baptiste Hériard-Dubreuil;Adrien Besson;Claude Cohen-Bacrie;Jean-Philippe Thiran","doi":"10.1109/TMI.2025.3562011","DOIUrl":"10.1109/TMI.2025.3562011","url":null,"abstract":"Pulse-Echo Ultrasound Imaging suffers from several sources of image degradation. In clinical conditions, superficial layers made of different tissues (e.g. skin, fat or muscles) create aberrations that can severely deteriorate image quality. To correct such aberrations, the majority of existing methods use either phase screens or speed of sound maps. However, a technique that is both accurate in real-world scenarios and compatible with near-real time imaging is lacking. Indeed phase screens are too simplistic to be physically accurate and speed of sound maps are computationally costly to estimate. We propose a new model of aberrations driven by the paths followed by ultrasound waves in the aberrating layer. With this new representation, we formulate an optimization problem in which a coherence factor is maximized with respect to a grid of aberrating paths. This problem is solved via a gradient ascent algorithm with variable splitting, in which all necessary gradients are expressed analytically. Using simulations of aberrating layers, we show that the proposed method can correct strong aberrations (i.e. of several periods) and outperforms a state-of-the-art technique based on speed of sound maps. Using in vivo experiments, we demonstrate that the proposed method is able to correct real aberrations in a few seconds which represents a major step forward towards a broader use of aberration correction methods.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 8","pages":"3222-3232"},"PeriodicalIF":0.0,"publicationDate":"2025-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143847271","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Zig-RiR: Zigzag RWKV-in-RWKV for Efficient Medical Image Segmentation","authors":"Tianxiang Chen;Xudong Zhou;Zhentao Tan;Yue Wu;Ziyang Wang;Zi Ye;Tao Gong;Qi Chu;Nenghai Yu;Le Lu","doi":"10.1109/TMI.2025.3561797","DOIUrl":"10.1109/TMI.2025.3561797","url":null,"abstract":"Medical image segmentation has made significant strides with the development of basic models. Specifically, models that combine CNNs with transformers can successfully extract both local and global features. However, these models inherit the transformer’s quadratic computational complexity, limiting their efficiency. Inspired by the recent Receptance Weighted Key Value (RWKV) model, which achieves linear complexity for long-distance modeling, we explore its potential for medical image segmentation. While directly applying vision-RWKV yields suboptimal results due to insufficient local feature exploration and disrupted spatial continuity, we propose a novel nested structure, Zigzag RWKV-in-RWKV (Zig-RiR), to address these issues. It consists of Outer and Inner RWKV blocks to adeptly capture both global and local features without disrupting spatial continuity. We treat local patches as “visual sentences” and use the Outer Zig-RWKV to explore global information. Then, we decompose each sentence into sub-patches (“visual words”) and use the Inner Zig-RWKV to further explore local information among words, at negligible computational cost. We also introduce a Zigzag-WKV attention mechanism to ensure spatial continuity during token scanning. By aggregating visual word and sentence features, our Zig-RiR can effectively explore both global and local information while preserving spatial continuity. Experiments on four medical image segmentation datasets of both 2D and 3D modalities demonstrate the superior accuracy and efficiency of our method, outperforming the state-of-the-art method 14.4 times in speed and reducing GPU memory usage by 89.5% when testing on <inline-formula> <tex-math>${1024} times {1024}$ </tex-math></inline-formula> high-resolution medical images. Our code is available at <uri>https://github.com/txchen-USTC/Zig-RiR</uri>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 8","pages":"3245-3257"},"PeriodicalIF":0.0,"publicationDate":"2025-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143847136","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Decoder-Only Image Registration","authors":"Xi Jia;Wenqi Lu;Xinxing Cheng;Jinming Duan","doi":"10.1109/TMI.2025.3562056","DOIUrl":"10.1109/TMI.2025.3562056","url":null,"abstract":"In unsupervised medical image registration, encoder-decoder architectures are widely used to predict dense, full-resolution displacement fields from paired images. Despite their popularity, we question the necessity of making both the encoder and decoder learnable. To address this, we propose LessNet, a simplified network architecture with only a learnable decoder, while completely omitting a learnable encoder. Instead, LessNet replaces the encoder with simple, handcrafted features, eliminating the need to optimize encoder parameters. This results in a compact, efficient, and decoder-only architecture for 3D medical image registration. We evaluate our decoder-only LessNet on five registration tasks: 1) inter-subject brain registration using the OASIS-1 dataset, 2) atlas-based brain registration using the IXI dataset, 3) cardiac ES-ED registration using the ACDC dataset, 4) inter-subject abdominal MR registration using the CHAOS dataset, and 5) multi-study, multi-site brain registration using images from 13 public datasets. Our results demonstrate that LessNet can effectively and efficiently learn both dense displacement and diffeomorphic deformation fields. Furthermore, our decoder-only LessNet can achieve comparable registration performance to benchmarking methods such as VoxelMorph and TransMorph, while requiring significantly fewer computational resources. Our code and pre-trained models are available at <uri>https://github.com/xi-jia/LessNet</uri>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 8","pages":"3356-3369"},"PeriodicalIF":0.0,"publicationDate":"2025-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143847265","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}