Medical Image Analysis · Pub Date: 2026-05-01 · Epub Date: 2026-02-09 · DOI: 10.1016/j.media.2026.103986
Yaqi Wang, Zhi Li, Chengyu Wu, Jun Liu, Yifan Zhang, Jiaxue Ni, Qian Luo, Jialuo Chen, Hongyuan Zhang, Jin Liu, Can Han, Kaiwen Fu, Changkai Ji, Xinxu Cai, Jing Hao, Zhihao Zheng, Shi Xu, Junqiang Chen, Xiaoyang Yu, Qianni Zhang, Huiyu Zhou
{"title":"MICCAI STS 2024 challenge: Semi-supervised instance-level tooth segmentation in panoramic X-ray and CBCT images","authors":"Yaqi Wang , Zhi Li , Chengyu Wu , Jun Liu , Yifan Zhang , Jiaxue Ni , Qian Luo , Jialuo Chen , Hongyuan Zhang , Jin Liu , Can Han , Kaiwen Fu , Changkai Ji , Xinxu Cai , Jing Hao , Zhihao Zheng , Shi Xu , Junqiang Chen , Xiaoyang Yu , Qianni Zhang , Huiyu Zhou","doi":"10.1016/j.media.2026.103986","DOIUrl":"10.1016/j.media.2026.103986","url":null,"abstract":"<div><div>Orthopantomogram (OPGs) and Cone-Beam Computed Tomography (CBCT) are vital for dentistry, but creating large datasets for automated tooth segmentation is hindered by the labor-intensive process of manual instance-level annotation. This research aimed to benchmark and advance semi-supervised learning (SSL) as a solution for this data scarcity problem. We organized the 2nd Semi-supervised Teeth Segmentation (STS 2024) Challenge at MICCAI 2024. We provided a large-scale dataset comprising over 90,000 2D images and 3D axial slices, which includes 2380 OPG images and 330 CBCT scans, all featuring detailed instance-level FDI annotations on part of the data. The challenge attracted 114 (OPG) and 106 (CBCT) registered teams. To ensure algorithmic excellence and full transparency, we rigorously evaluated the valid, open-source submissions from the top 10 (OPG) and top 5 (CBCT) teams, respectively. All successful submissions were deep learning-based SSL methods. The winning semi-supervised models demonstrated impressive performance gains over a fully-supervised nnU-Net baseline trained only on the labeled data. For the 2D OPG track, the top method improved the Instance Affinity (IA) score by over 44 percentage points. For the 3D CBCT track, the winning approach boosted the Instance Dice score by 61 percentage points. This challenge demonstrates the potential benefit benefit of SSL for complex, instance-level medical image segmentation tasks where labeled data is scarce. The most effective approaches consistently leveraged hybrid semi-supervised frameworks that combined knowledge from foundational models like SAM with multi-stage, coarse-to-fine refinement pipelines. Both the challenge dataset and the participants’ submitted code have been made publicly available on GitHub (<span><span>https://github.com/ricoleehduu/STS-Challenge-2024</span><svg><path></path></svg></span>), ensuring transparency and reproducibility.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"110 ","pages":"Article 103986"},"PeriodicalIF":11.8,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146146639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Medical Image Analysis · Pub Date: 2026-05-01 · Epub Date: 2026-02-10 · DOI: 10.1016/j.media.2026.103993
Yueying Li, Rui Dong, Xiaoyun Liu, Yonggui Yuan, Youyong Kong
{"title":"Neurobridge: Bridging functional and structural brain networks via neural coupling and consistency-Guided dynamic graph learning","authors":"Yueying Li , Rui Dong , Xiaoyun Liu , Yonggui Yuan , Youyong Kong","doi":"10.1016/j.media.2026.103993","DOIUrl":"10.1016/j.media.2026.103993","url":null,"abstract":"<div><div>Modern medical imaging provides important insights into brain network analysis. Functional brain networks are used to characterize the functional connectivity patterns in resting or task states, and structural brain networks reflect the integrity and connectivity strength of macro-scale pathways. However, there are differences in data structure, information representation and spatial resolution between the both, and how to effectively fuse the information from these two modalities to mine potential cross-modal representations has become a key challenge in current research. In this paper, we propose the <strong>NeuroBridge</strong>, which enable the interaction between different modalities and the extraction of discriminative joint representations through coupling at the macro-scale level for brain network analysis. Specifically, the <em>Neural Synergy Coupling Module (NeuSCM)</em> performs structural-functional coupling in terms of brain region receptive fields. In order to enhance the inter-modal high-level semantic spatial coherence, we propose the <em>Consistency Anchor Guidance Module (CAGM)</em> for semantic calibration and convergence control of the fused representation space. Finally the <em>Dynamic Association Parsing Module (DAP)</em> captures the complex relationships between nodes and is used for final prediction and biomarker extraction. We conduct extensive experiments on disease prediction and gender classification tasks, and our results show that the prediction accuracy of our method outperforms that of SOTA single and multimodal methods, while our study provides new insights into multimodal brain network analysis.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"110 ","pages":"Article 103993"},"PeriodicalIF":11.8,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146152676","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Medical Image Analysis · Pub Date: 2026-05-01 · Epub Date: 2026-02-11 · DOI: 10.1016/j.media.2026.103987
Jian-Qing Zheng, Yuanhan Mo, Yang Sun, Jiahua Li, Fuping Wu, Ziyang Wang, Tonia Vincent, Bartłomiej W Papież
{"title":"Deformation-Recovery diffusion model (DRDM): Instance deformation for image manipulation and synthesis","authors":"Jian-Qing Zheng , Yuanhan Mo , Yang Sun , Jiahua Li , Fuping Wu , Ziyang Wang , Tonia Vincent , Bartłomiej W Papież","doi":"10.1016/j.media.2026.103987","DOIUrl":"10.1016/j.media.2026.103987","url":null,"abstract":"<div><div>In medical imaging, diffusion models have shown great potential for synthetic image generation. However, these approaches often lack interpretable correspondence between generated and real images and can create anatomically implausible structures or illusions. To address these limitations, we propose the Deformation-Recovery Diffusion Model (DRDM), a novel diffusion-based generative model that emphasizes morphological transformation through deformation fields rather than direct image synthesis. DRDM introduces a topology-preserving deformation field generation strategy, which randomly samples and integrates multi-scale Deformation Velocity Fields (DVFs). DRDM is trained to learn to recover unrealistic deformation components, thus restoring randomly deformed images to a realistic distribution. This formulation enables the generation of diverse yet anatomically plausible deformations that preserve structural integrity, thereby improving data augmentation and synthesis for downstream tasks such as few-shot learning and image registration. Experiments on cardiac Magnetic Resonance Imaging and pulmonary Computed Tomography show that DRDM is capable of creating diverse, large-scale deformations, while maintaining anatomical plausibility of deformation fields. Additional evaluations on 2D image segmentation and 3D image registration tasks indicate notable performance gains, underscoring DRDM’s potential to enhance both image manipulation and generative modeling in medical imaging applications.</div><div>The project page: <span><span>https://jianqingzheng.github.io/def_diff_rec/</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"110 ","pages":"Article 103987"},"PeriodicalIF":11.8,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146153282","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Medical Image Analysis · Pub Date: 2026-05-01 · Epub Date: 2026-01-17 · DOI: 10.1016/j.media.2026.103949
Kai Gao, Lubin Wang, Liang Li, Xiao Chen, Bin Lu, Yu-Wei Wang, Xue-Ying Li, Zi-Han Wang, Hui-Xian Li, Yi-Fan Liao, Li-Ping Cao, Guan-Mao Chen, Jian-Shan Chen, Tao Chen, Tao-Lin Chen, Yan-Rong Chen, Yu-Qi Cheng, Zhao-Song Chu, Shi-Xian Cui, Xi-Long Cui, Dewen Hu
{"title":"Transfer learning from 2D natural images to 4D fMRI brain images via geometric mapping","authors":"Kai Gao , Lubin Wang , Liang Li , Xiao Chen , Bin Lu , Yu-Wei Wang , Xue-Ying Li , Zi-Han Wang , Hui-Xian Li , Yi-Fan Liao , Li-Ping Cao , Guan-Mao Chen , Jian-Shan Chen , Tao Chen , Tao-Lin Chen , Yan-Rong Chen , Yu-Qi Cheng , Zhao-Song Chu , Shi-Xian Cui , Xi-Long Cui , Dewen Hu","doi":"10.1016/j.media.2026.103949","DOIUrl":"10.1016/j.media.2026.103949","url":null,"abstract":"<div><div>Functional magnetic resonance imaging (fMRI) allows real-time observation of brain activity through blood oxygen level-dependent (BOLD) signals and is extensively used in studies related to sex classification, age estimation, behavioral measurements prediction, and mental disorder diagnosis. However, the application of deep learning techniques to brain fMRI analysis is hindered by the small sample size of fMRI datasets. Transfer learning offers a solution to this problem, but most existing approaches are designed for large-scale 2D natural images. The heterogeneity between 4D fMRI data and 2D natural images makes direct model transfer infeasible. This study proposes a novel geometric mapping-based fMRI transfer learning method that enables transfer learning from 2D natural images to 4D fMRI brain images, bridging the transfer learning gap between fMRI data and natural images. The proposed Multi-scale Multi-domain Feature Aggregation (MMFA) module extracts effective aggregated features and reduces the dimensionality of fMRI data to 3D space. By treating the cerebral cortex as a folded Riemannian manifold in 3D space and mapping it into 2D space using surface geometric mapping, we make the transfer learning from 2D natural images to 4D brain images possible. Moreover, the topological relationships of the cerebral cortex are maintained with our method, and calculations are performed along the Riemannian manifold of the brain, effectively addressing signal interference problems. The experimental results based on the Human Connectome Project (HCP) dataset demonstrate the effectiveness of the proposed method. Our method achieved state-of-the-art performance in sex classification, age estimation, and behavioral measurement prediction tasks. Moreover, we propose a cascaded transfer learning approach for depression diagnosis, and proved its effectiveness on 23 depression datasets. In summary, the proposed fMRI transfer learning method, which accounts for the structural characteristics of the brain, is promising for applying transfer learning from natural images to brain fMRI images, significantly enhancing the performance in various fMRI analysis tasks.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"110 ","pages":"Article 103949"},"PeriodicalIF":11.8,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145995243","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dual selective gleason pattern-aware multiple instance learning with uncertainty regularization for grade group prediction in histopathology images","authors":"Xinyu Hao , Hongming Xu , Jingdong Zhang , Qi Xu , Ilkka Pölönen , Fengyu Cong","doi":"10.1016/j.media.2026.104005","DOIUrl":"10.1016/j.media.2026.104005","url":null,"abstract":"<div><div>Accurate prediction of Gleason Grade Group (GG) is of great importance for prostate cancer risk stratification and treatment planning. Although multiple instance learning (MIL) methods have advanced Gleason grading, most existing studies overlook the domain knowledge that GG is determined by the joint contribution of different Gleason Patterns, thereby limiting both accuracy and interpretability. In this study, we propose DSPA-U-MIL, an uncertainty-driven dual-selective Gleason Pattern-aware MIL model for patient-level GG prediction. Our method learns representative features by integrating learnable pattern aggregation tokens with expert concept-guided patch-level aggregation, and incorporates a teacher-student knowledge distillation framework to simulate cooperative prediction among different Gleason Patterns. In addition, we introduce an uncertainty constraint to mitigate the impact of noisy labels and enhance prediction robustness. Extensive experiments on five datasets, comprising 10,809 whole slide images (WSIs) and 1133 tissue microarray (TMA) images, demonstrate that DSPA-U-MIL consistently outperforms existing MIL approaches in Gleason GG prediction. Among these datasets, our method achieves up to a 6.7% improvement in quadratic weighted kappa (QWK) score over the strong baseline CLAM, with the largest gain observed on TCGA-PRAD. Furthermore, the gating weight distributions of the student model are well aligned with pathologists’ Gleason Pattern annotations, reinforcing the interpretability of our approach. Our source code is available at <span><span>https://github.com/AlexNmSED/DSPA-MIL</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"110 ","pages":"Article 104005"},"PeriodicalIF":11.8,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146778082","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Medical Image Analysis · Pub Date: 2026-05-01 · Epub Date: 2026-01-29 · DOI: 10.1016/j.media.2026.103967
Yijie Li, Wei Zhang, Xi Zhu, Ye Wu, Yogesh Rathi, Lauren J. O'Donnell, Fan Zhang
{"title":"DDTracking: A diffusion model-based deep generative framework with local-global spatiotemporal modeling for diffusion MRI tractography","authors":"Yijie Li , Wei Zhang , Xi Zhu , Ye Wu , Yogesh Rathi , Lauren J. O'Donnell , Fan Zhang","doi":"10.1016/j.media.2026.103967","DOIUrl":"10.1016/j.media.2026.103967","url":null,"abstract":"<div><div>Diffusion MRI (dMRI) tractography is an advanced technique that uniquely enables in vivo mapping of brain fiber pathways. Traditional methods rely on tissue modeling to estimate fiber orientations for streamline propagation, which are computationally intensive and remain sensitive to noise and artifacts. Recent deep learning-based approaches enable data-driven fiber tracking by directly mapping dMRI signals to orientations, demonstrating both improved efficiency and accuracy. However, existing methods typically operate by either leveraging local signal information or learning global dependencies along streamlines. This paper presents DDTracking, a deep generative framework for tractography. One key innovation is the reformulation of streamline propagation as a conditional denoising diffusion process. To the best of our knowledge, this is the first work to apply diffusion models for fiber tracking. Our network architecture incorporates two new designs, including: (1) a dual-pathway encoding scheme that extracts complementary local spatial features and global temporal context, and (2) a conditional diffusion model module that integrates the spatiotemporal features to predict propagation orientations. All components are trained jointly in an end-to-end manner without any pretraining. In this way, DDTracking can capture fine-scale structural details at each point while ensuring long-range consistency across the entire streamline. We conduct a comprehensive evaluation across diverse datasets, including both synthetic and clinical data. Experiments demonstrate that DDTracking outperforms traditional model-based and state-of-the-art deep learning-based methods in terms of tracking accuracy and computational efficiency. Furthermore, our results highlight DDTracking’s high generalizability across heterogeneous datasets, spanning varying health conditions, age groups, imaging protocols, and scanner types. Code is available at: <span><span>https://github.com/yishengpoxiao/DDTracking.git</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"110 ","pages":"Article 103967"},"PeriodicalIF":11.8,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146071490","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Medical Image Analysis · Pub Date: 2026-05-01 · Epub Date: 2026-01-26 · DOI: 10.1016/j.media.2026.103964
Pablo Meseguer, Rocío del Amor, Valery Naranjo
{"title":"MIL-Adapter: Coupling multiple instance learning and vision-language adapters for few-shot slide-level classification","authors":"Pablo Meseguer , Rocío del Amor , Valery Naranjo","doi":"10.1016/j.media.2026.103964","DOIUrl":"10.1016/j.media.2026.103964","url":null,"abstract":"<div><div>Contrastive language-image pretraining has greatly enhanced visual representation learning and enabled zero-shot classification. Vision-language language models (VLM) have succeeded in few-shot learning by leveraging adaptation modules fine-tuned for specific downstream tasks. In computational pathology (CPath), accurate whole-slide image (WSI) prediction is crucial for aiding in cancer diagnosis, and multiple instance learning (MIL) remains essential for managing the gigapixel scale of WSIs. In the intersection between CPath and VLMs, the literature still lacks specific adapters that handle the particular complexity of the slides. To solve this gap, we introduce MIL-Adapter, a novel approach designed to obtain consistent slide-level classification under few-shot learning scenarios. In particular, our framework is the first to combine trainable MIL aggregation functions and lightweight visual-language adapters to improve the performance of histopathological VLMs. MIL-Adapter relies on textual ensemble learning to construct discriminative zero-shot prototypes. It is serves as a solid starting point, surpassing MIL models with randomly initialized classifiers in data-constrained settings. With our experimentation, we demonstrate the value of textual ensemble learning and the robust predictive performance of MIL-Adapter through diverse datasets and configurations of few-shot scenarios, while providing crucial insights on model interpretability. The code is publicly accessible in <span><span>https://github.com/cvblab/MIL-Adapter</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"110 ","pages":"Article 103964"},"PeriodicalIF":11.8,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146048254","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Medical Image Analysis · Pub Date: 2026-05-01 · Epub Date: 2026-02-09 · DOI: 10.1016/j.media.2026.103981
Md Kamrul Hasan, Guang Yang, Choon Hwai Yap
{"title":"An efficient, scalable, and adaptable plug-and-play temporal attention module for motion-guided cardiac segmentation with sparse temporal labels","authors":"Md Kamrul Hasan , Guang Yang , Choon Hwai Yap","doi":"10.1016/j.media.2026.103981","DOIUrl":"10.1016/j.media.2026.103981","url":null,"abstract":"<div><div>Cardiac anatomy segmentation is essential for clinical assessment of cardiac function and disease diagnosis to inform treatment and intervention. Deep learning (DL) has improved cardiac anatomy segmentation accuracy, especially when information on cardiac motion dynamics is integrated into the networks. Several methods for incorporating motion information have been proposed; however, existing methods are not yet optimal: adding the time dimension to input data causes high computational costs, and incorporating registration into the segmentation network remains computationally costly and can be affected by errors of registration, especially with non-DL registration. While attention-based motion modeling is promising, suboptimal design constrains its capacity to learn the complex and coherent temporal interactions inherent in cardiac image sequences. Here, we propose a novel approach to incorporating motion information in the DL segmentation networks: a computationally efficient yet robust Temporal Attention Module (TAM), modeled as a small, multi-headed, cross-temporal attention module, which can be plug-and-play inserted into a broad range of segmentation networks (CNN, transformer, or hybrid) without a drastic architecture modification. Extensive experiments on multiple cardiac imaging datasets, such as 2D echocardiography (CAMUS and EchoNet-Dynamic), 3D echocardiography (MITEA), and 3D cardiac MRI (ACDC), confirm that TAM consistently improves segmentation performance across datasets when added to a range of networks, including UNet, FCN8s, UNetR, SwinUNetR, and the recent I<sup>2</sup>UNet and DT-VNet. Integrating TAM into SAM yields a temporal SAM that reduces Hausdorff distance (HD) from 3.99 mm to 3.51 mm on the CAMUS dataset, while integrating TAM into a pre-trained MedSAM reduces HD from 3.04 to 2.06 pixels after fine-tuning on the EchoNet-Dynamic dataset. On the ACDC 3D dataset, our TAM-UNet and TAM-DT-VNet achieve substantial reductions in HD, from 7.97 mm to 4.23 mm and 6.87 mm to 4.74 mm, respectively. Additionally, TAM’s training does not require segmentation of ground truths from all time frames and can be achieved with sparse temporal annotation. TAM is thus a robust, generalizable, and adaptable solution for motion-awareness enhancement that is easily scaled from 2D to 3D. The code is available at <span><span>https://github.com/kamruleee51/TAM</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"110 ","pages":"Article 103981"},"PeriodicalIF":11.8,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146146638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Medical Image Analysis · Pub Date: 2026-05-01 · Epub Date: 2026-02-11 · DOI: 10.1016/j.media.2026.103991
Chao Tang, Jun Liu, Yanfen Cui, Zhenhui Li, Xiuming Zhang, Su Yao, Huan Lin, Dacheng Yang, Zhishun Liu, Wei Zhao, Shiwei Luo, Ke Zhao, Yun Zhu, Guangjun Yang, Lixu Yan, Shuting Chen, Xiangtian Zhao, Yingqiu Huo, Zhiyang Chen, Hongbo Liu, Cheng Lu
{"title":"A hypergraph-based model for tumor prognosis using local and global information fusion on H&E-stained histology images","authors":"Chao Tang , Jun Liu , Yanfen Cui , Zhenhui Li , Xiuming Zhang , Su Yao , Huan Lin , Dacheng Yang , Zhishun Liu , Wei Zhao , Shiwei Luo , Ke Zhao , Yun Zhu , Guangjun Yang , Lixu Yan , Shuting Chen , Xiangtian Zhao , Yingqiu Huo , Zhiyang Chen , Hongbo Liu , Cheng Lu","doi":"10.1016/j.media.2026.103991","DOIUrl":"10.1016/j.media.2026.103991","url":null,"abstract":"<div><div>Prognostic variables play a critical role in guiding clinical treatment decisions for cancer patients. However, extracting prognostic information from gigapixel histopathology slides remains a significant challenge. While attention-based deep learning models trained on histologic images have been extensively investigated, existing approaches often fail to effectively model slide-level contextual information or demonstrate generalizability across diverse cancer types and multi-center datasets. We propose a Hypergraph-based Multi-instance Contrastive Reinforcement learning model (HeMiCoRe), which integrates cluster-restricted local features and cross-cluster global representations from 5196 H&E-stained slides across 10 cancer types, leveraging both morphological and spatial relationships. HeMiCoRe employs hypergraph neural networks to predict patient survival outcomes and achieves state-of-the-art (SOTA) performance on 8 cancer types, demonstrating superior generalization compared to existing weakly supervised methods. This framework holds promise for clinical adoption, offering a robust tool for cancer prognosis and supporting treatment decision-making.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"110 ","pages":"Article 103991"},"PeriodicalIF":11.8,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146152675","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SAM-driven cross prompting with adaptive sampling consistency for semi-supervised medical image segmentation","authors":"Juzheng Miao , Cheng Chen , Yuchen Yuan , Quanzheng Li , Pheng-Ann Heng","doi":"10.1016/j.media.2026.103973","DOIUrl":"10.1016/j.media.2026.103973","url":null,"abstract":"<div><div>Semi-supervised learning (SSL) has achieved notable progress in medical image segmentation. To achieve effective SSL, a model needs to be able to efficiently learn from limited labeled data and effectively exploit knowledge from abundant unlabeled data. Recent developments in visual foundation models, such as the Segment Anything Model (SAM), have demonstrated remarkable adaptability with improved sample efficiency. To seamlessly harness foundation models in SSL, we propose a SAM-driven cross prompting framework with adaptive sampling and prompt consistency for semi-supervised medical image segmentation, named CPAC-SAM. Our method employs SAM’s unique prompt design and innovates a cross prompting strategy within a dual-branch framework to automatically generate prompts and supervision across two decoder branches, enabling effective learning from both scarce labeled and valuable unlabeled data. To ensure the quality of prompts for unlabeled data and provide meaningful supervision in the cross prompting scheme, we propose an innovative prototype-guided grid sampling strategy with adaptive intervals to simultaneously improve the reliability of the prompt selection area and ensure both adequate prompt density and complete target coverage. We further design a novel prompt consistency regularization to reduce SAM’s prompt sensitivity and to enhance the output invariance under different prompts. We validate our method on five medical image segmentation tasks, encompassing both 2D and 3D scenarios. The extensive experiments with different labeled-data ratios and modalities demonstrate the superiority of our proposed method over the state-of-the-art SSL methods, with more than 4.1% and 3.8% Dice improvement on the breast cancer segmentation task and left atrium segmentation task, respectively. Our code is available at: <span><span>https://github.com/JuzhengMiao/CPAC-SAM</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"110 ","pages":"Article 103973"},"PeriodicalIF":11.8,"publicationDate":"2026-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146109925","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}