Latest articles in Medical image analysis

AdaptFRCNet: Semi-supervised adaptation of pre-trained model with frequency and region consistency for medical image segmentation
IF 10.7, CAS Zone 1 (Medicine)
Medical image analysis. Pub Date: 2025-05-13. DOI: 10.1016/j.media.2025.103626
Along He, Yanlin Wu, Zhihong Wang, Tao Li, Huazhu Fu
Abstract: Recently, large pre-trained models (LPMs) have achieved great success, providing rich feature representations for downstream tasks. Pre-training followed by fine-tuning is an effective way to utilize an LPM. However, the application of LPMs in the medical domain is hindered by their large number of parameters and the limited amount of labeled data. In clinical practice, a substantial amount of unlabeled data remains underutilized, and semi-supervised learning is a promising way to harness it. In this paper, we propose semi-supervised adaptation of a pre-trained model with frequency and region consistency (AdaptFRCNet) for medical image segmentation. Specifically, the pre-trained model is frozen, and the proposed lightweight attention-based adapters (Att_Adapter) are inserted into the frozen backbone for parameter-efficient fine-tuning (PEFT). We propose two consistency regularization strategies for semi-supervised learning: frequency domain consistency (FDC) and multi-granularity region similarity consistency (MRSC). FDC aids in learning features within the frequency domain, while MRSC aims to achieve multiple region-level feature consistencies, capturing local context information effectively. By leveraging Att_Adapter together with FDC and MRSC, we can effectively and efficiently harness the powerful feature representation capability of the LPM. Extensive experiments on three medical image segmentation datasets demonstrate significant performance improvements over other state-of-the-art methods. The code is available at https://github.com/NKUhealong/AdaptFRCNet.
Citations: 0
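The abstract does not spell out the FDC loss, but a frequency-domain consistency term between two predictions for the same unlabeled image can be sketched as a distance between the magnitude spectra of their Fourier transforms. A minimal 1-D toy illustration (function names and the exact loss form are our assumptions, not the paper's implementation):

```python
import cmath

def dft(signal):
    """Naive discrete Fourier transform of a 1-D sequence."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
            for k in range(n)]

def frequency_consistency_loss(pred_a, pred_b):
    """Mean squared difference between magnitude spectra of two
    predictions -- a toy stand-in for a frequency-consistency term."""
    fa, fb = dft(pred_a), dft(pred_b)
    return sum((abs(a) - abs(b)) ** 2 for a, b in zip(fa, fb)) / len(fa)
```

Identical predictions give zero loss, while predictions that disagree in their frequency content are penalized, which is the behavior a consistency regularizer needs.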
A survey of deep-learning-based radiology report generation using multimodal inputs
IF 10.7, CAS Zone 1 (Medicine)
Medical image analysis. Pub Date: 2025-05-13. DOI: 10.1016/j.media.2025.103627
Xinyi Wang, Grazziela Figueredo, Ruizhe Li, Wei Emma Zhang, Weitong Chen, Xin Chen
Abstract: Automatic radiology report generation can alleviate physicians' workload and minimize regional disparities in medical resources, and has therefore become an important topic in medical image analysis. It is a challenging task, as the computational model must mimic physicians in obtaining information from multimodal input data (i.e., medical images, clinical information, medical knowledge, etc.) and produce comprehensive and accurate reports. Recently, numerous works have addressed this problem using deep-learning-based methods such as transformers, contrastive learning, and knowledge-base construction. This survey summarizes the key techniques developed in the most recent works and proposes a general workflow for deep-learning-based report generation with five main components: multi-modality data acquisition, data preparation, feature learning, feature fusion and interaction, and report generation. The state-of-the-art methods for each of these components are highlighted. Additionally, we summarize the latest developments in large-model-based methods and model explainability, along with public datasets, evaluation methods, current challenges, and future directions in this field. We have also conducted a quantitative comparison between different methods under the same experimental setting. This is the most up-to-date survey focusing on multimodal inputs and data fusion for radiology report generation. The aim is to provide comprehensive and rich information for researchers interested in automatic clinical report generation and medical image analysis, especially when using multimodal inputs, and to assist them in developing new algorithms to advance the field.
Citations: 0
Driven by textual knowledge: A Text-View Enhanced Knowledge Transfer Network for lung infection region segmentation
IF 10.7, CAS Zone 1 (Medicine)
Medical image analysis. Pub Date: 2025-05-12. DOI: 10.1016/j.media.2025.103625
Lexin Fang, Xuemei Li, Yunyang Xu, Fan Zhang, Caiming Zhang
Abstract: Lung infections are the leading cause of death among infectious diseases, and accurate segmentation of the infected lung area is crucial for effective treatment. Segmentation methods that rely solely on imaging data currently have limited accuracy, and incorporating text information enriched with expert knowledge into the segmentation process has emerged as a novel approach. However, previous methods often used unified text encoding strategies to extract textual features, failing to adequately emphasize critical details in the text, particularly the spatial location of infected regions. Moreover, the semantic-space inconsistency between text and image features complicates cross-modal information transfer. To close these gaps, we propose a Text-View Enhanced Knowledge Transfer Network (TVE-Net) that leverages key information from textual data to assist segmentation and enhance the model's perception of lung infection locations. The method generates a text view by probabilistically modeling the location information of infected areas in text using a robust, carefully designed positional probability function. By assigning lesion probabilities to each image region, the infected areas' spatial information from the text view is explicitly integrated into the segmentation model. Once the text view has been introduced, a unified image encoder can be employed to extract text-view features, so that both text and images are mapped into the same space. In addition, a self-supervised constraint based on text-view overlap and feature consistency is proposed to enhance the model's robustness and semi-supervised capability through feature augmentation. Meanwhile, the newly designed multi-stage knowledge transfer module utilizes a globally enhanced cross-attention mechanism to comprehensively learn the implicit correlations between image features and text-view features, enabling effective knowledge transfer from text-view features to image features. Extensive experiments demonstrate that TVE-Net outperforms both unimodal and multimodal methods in fully supervised and semi-supervised lung infection segmentation, achieving significant improvements on the QaTa-COV19 and MosMedData+ datasets.
Citations: 0
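The abstract does not specify the positional probability function, but the idea of "assigning lesion probabilities to each image region" from a textual location can be sketched as a normalized Gaussian map centred on the grid cell that the report text describes. Everything below (the grid size, the Gaussian form, the function name) is our illustrative assumption:

```python
import math

def positional_probability_map(center, grid=(4, 4), sigma=1.0):
    """Toy positional probability function: assign each image region a
    lesion probability from a Gaussian centred on the (row, col) location
    extracted from the report text, normalized to sum to 1."""
    cy, cx = center
    raw = [[math.exp(-((r - cy) ** 2 + (c - cx) ** 2) / (2 * sigma ** 2))
            for c in range(grid[1])]
           for r in range(grid[0])]
    total = sum(v for row in raw for v in row)
    return [[v / total for v in row] for row in raw]
```

Such a map is image-shaped, so a unified image encoder can consume it alongside the image itself, which matches the motivation given in the abstract.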
Structure-guided MR-to-CT synthesis with spatial and semantic alignments for attenuation correction of whole-body PET/MR imaging
IF 10.7, CAS Zone 1 (Medicine)
Medical image analysis. Pub Date: 2025-05-10. DOI: 10.1016/j.media.2025.103622
Jiaxu Zheng, Zhenrong Shen, Lichi Zhang, Qun Chen
Abstract: Image synthesis from Magnetic Resonance (MR) to Computed Tomography (CT) can estimate the electron density of tissues, thereby facilitating Positron Emission Tomography (PET) attenuation correction in whole-body PET/MR imaging. Whole-body MR-to-CT synthesis faces several challenges, including spatial misalignment caused by tissue variety and respiratory movements, and complex intensity mapping due to large intensity variations across the whole body. Existing MR-to-CT synthesis methods mainly focus on body sub-regions, making them ineffective at addressing these challenges. Here we propose a novel whole-body MR-to-CT synthesis framework consisting of three modules designed to tackle them: (1) a Structure-Guided Synthesis module leverages structure-guided attention gates to enhance synthetic image quality by diminishing unnecessary contours of soft tissues; (2) a Spatial Alignment module yields precise registration between paired MR and CT images by taking into account the impacts of tissue volumes and respiratory movements, thus providing well-aligned ground-truth CT images during training; (3) a Semantic Alignment module utilizes contrastive learning to constrain organ-related semantic information, thereby ensuring the semantic authenticity of synthetic CT images. Extensive experiments demonstrate that our method produces visually plausible and semantically accurate CT images, outperforming existing approaches in both synthetic image quality and PET attenuation correction accuracy.
Citations: 0
Next-generation surgical navigation: Marker-less multi-view 6DoF pose estimation of surgical instruments
IF 10.7, CAS Zone 1 (Medicine)
Medical image analysis. Pub Date: 2025-05-10. DOI: 10.1016/j.media.2025.103613
Jonas Hein, Nicola Cavalcanti, Daniel Suter, Lukas Zingg, Fabio Carrillo, Lilian Calvet, Mazda Farshad, Nassir Navab, Marc Pollefeys, Philipp Fürnstahl
Abstract: State-of-the-art computer vision research is increasingly leveraged in the surgical domain. A particular focus in computer-assisted surgery is replacing marker-based tracking systems for instrument localization with purely image-based 6DoF pose estimation using deep-learning methods. However, state-of-the-art single-view pose estimation methods do not yet meet the accuracy required for surgical navigation. In this context, we investigate the benefits of multi-view setups for highly accurate and occlusion-robust 6DoF pose estimation of surgical instruments and derive recommendations for an ideal camera system that addresses the challenges in the operating room. Our contributions are threefold. First, we present a multi-view RGB-D video dataset of ex-vivo spine surgeries, captured with static and head-mounted cameras and including rich annotations for surgeon, instruments, and patient anatomy. Second, we perform an extensive evaluation of three state-of-the-art single-view and multi-view pose estimation methods, analyzing the impact of camera quantity and positioning, limited real-world data, and static, hybrid, or fully mobile camera setups on pose accuracy, occlusion robustness, and generalizability. Third, we design a multi-camera system for marker-less surgical instrument tracking, achieving an average position error of 1.01 mm and orientation error of 0.89° for a surgical drill, and 2.79 mm and 3.33° for a screwdriver under optimal conditions. Our results demonstrate that marker-less tracking of surgical instruments is becoming a feasible alternative to existing marker-based systems.
Citations: 0
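The reported errors (1.01 mm position, 0.89° orientation) follow the standard 6DoF error metrics: Euclidean distance between estimated and ground-truth translations, and the geodesic angle between the two rotation matrices. A minimal sketch of these metrics (the function names are ours; the paper's exact evaluation code is not given in the abstract):

```python
import math

def position_error_mm(p_est, p_gt):
    """Euclidean distance between estimated and ground-truth positions."""
    return math.dist(p_est, p_gt)

def orientation_error_deg(r_est, r_gt):
    """Geodesic angle between two 3x3 rotation matrices:
    angle = arccos((trace(R_est^T R_gt) - 1) / 2)."""
    # trace(R_est^T @ R_gt) equals the elementwise product sum
    trace = sum(r_est[i][j] * r_gt[i][j] for i in range(3) for j in range(3))
    # clamp for numerical safety before arccos
    c = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(c))
```

For example, a pose rotated 90° about the z-axis relative to the ground truth yields an orientation error of 90°.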
Nested hierarchical group-wise registration with a graph-based subgrouping strategy for efficient template construction
IF 10.7, CAS Zone 1 (Medicine)
Medical image analysis. Pub Date: 2025-05-10. DOI: 10.1016/j.media.2025.103624
Tongtong Che, Lin Zhang, Debin Zeng, Yan Zhao, Haoying Bai, Jichang Zhang, Xiuying Wang, Shuyu Li
Abstract: Accurate and efficient group-wise registration of medical images is fundamental to constructing a common template image for population-level analysis. However, current group-wise registration is challenged by the algorithm's efficiency and capacity, and by its adaptability to large variations across subject populations. This paper addresses these challenges with a novel Nested Hierarchical Group-wise Registration (NHGR) framework. Firstly, to alleviate the registration burden due to significant population variations, a new subgrouping strategy serves as a "divide and conquer" mechanism that divides a large population into smaller subgroups. Subgroups in a hierarchical sequence are formed by gradually expanding the scale factors related to feature similarity, and registration is then conducted at the subgroup scale as a multi-scale conquer strategy. Secondly, nested hierarchical group-wise registration addresses model efficiency and capacity from three perspectives. (1) Population level: global group-wise registration generates age-related sub-templates, progressing from local subgroups to the global population. (2) Subgroup level: local group-wise registration is performed on local image distributions to reduce registration error and rapidly optimize sub-templates. (3) Image-pair level: a deep multi-resolution registration network is employed for better registration efficiency. The proposed framework was evaluated on brain datasets of adults (18 to 96 years) and adolescents (5 to 21 years). Experimental results consistently demonstrate that the proposed group-wise registration method achieves better performance in terms of registration efficiency, template sharpness, and template centrality.
Citations: 0
Error correcting 2D–3D cascaded network for myocardial infarct scar segmentation on late gadolinium enhancement cardiac magnetic resonance images
IF 10.7, CAS Zone 1 (Medicine)
Medical image analysis. Pub Date: 2025-05-10. DOI: 10.1016/j.media.2025.103594
Matthias Schwab, Mathias Pamminger, Christian Kremser, Daniel Obmann, Markus Haltmeier, Agnes Mayr
Abstract: Late gadolinium enhancement (LGE) cardiac magnetic resonance (CMR) imaging is considered the in vivo reference standard for assessing infarct size (IS) and microvascular obstruction (MVO) in ST-elevation myocardial infarction (STEMI) patients. However, exact quantification of these markers of myocardial infarct severity remains challenging and very time-consuming. As LGE distribution patterns can be quite complex and hard to delineate from the blood pool or epicardial fat, automatic segmentation of LGE CMR images is challenging. In this work, we propose a cascaded framework of two-dimensional and three-dimensional convolutional neural networks (CNNs) that enables fully automated calculation of the extent of myocardial infarction. By artificially generating segmentation errors characteristic of 2D CNNs during training of the cascaded framework, we enforce the detection and correction of 2D segmentation errors and hence improve the segmentation accuracy of the entire method. The proposed method was trained and evaluated on two publicly available datasets. We perform comparative experiments showing that our framework outperforms state-of-the-art reference methods in segmentation of myocardial infarction, and extensive ablation studies show the advantages of the proposed error-correcting cascaded method. The code of this project is publicly available at https://github.com/matthi99/EcorC.git.
Citations: 0
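The abstract does not describe which artificial errors are injected during training; as a hedged sketch of the general idea, one crude way to synthesize 2D-style segmentation errors is to randomly drop foreground pixels from a per-slice binary mask, so the 3D correction network learns to repair such defects (the perturbation model and function name below are entirely our assumption):

```python
import random

def corrupt_slice(mask, p_drop=0.3, rng=None):
    """Synthesize segmentation errors on a 2-D binary mask by randomly
    dropping foreground pixels -- a crude stand-in for the artificial
    errors a correction network could be trained to repair."""
    rng = rng or random.Random(0)
    return [[0 if v and rng.random() < p_drop else v for v in row]
            for row in mask]
```

Training pairs would then be (corrupted mask, original mask), teaching the second-stage network to map defective 2D outputs back to clean segmentations.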
Learning dissection trajectories from expert surgical videos via imitation learning with equivariant diffusion
IF 10.7, CAS Zone 1 (Medicine)
Medical image analysis. Pub Date: 2025-05-10. DOI: 10.1016/j.media.2025.103599
Hongyu Wang, Yonghao Long, Yueyao Chen, Hon-Chi Yip, Markus Scheppach, Philip Wai-Yan Chiu, Yeung Yam, Helen Mei-Ling Meng, Qi Dou
Abstract: Endoscopic Submucosal Dissection (ESD) is a well-established endoscopic resection technique for removing epithelial lesions. Predicting dissection trajectories in ESD videos has the potential to strengthen and simplify surgical skills training, yet this approach has seldom been explored in previous research. While imitation learning has proven effective for learning skills from expert demonstrations, it encounters difficulties in predicting uncertain future movements, learning geometric symmetries, and generalizing to diverse surgical scenarios. This paper introduces imitation learning for the critical task of predicting dissection trajectories from expert video demonstrations. We propose a novel Implicit Diffusion Policy with Equivariant Representations for Imitation Learning (iDPOE) to address these challenges. Our method implicitly models expert behaviors using a joint state–action distribution, capturing the inherent stochasticity of future dissection trajectories and enabling robust visual representation learning across various endoscopic views. By incorporating a diffusion model into policy learning, our approach facilitates efficient training and sampling, resulting in more accurate predictions and improved generalization. Additionally, we integrate equivariance into the learning process to enhance the model's ability to generalize to geometric symmetries in trajectory prediction. To enable conditional sampling from the implicit policy, we develop a forward-process guided action inference strategy to correct state mismatches. We evaluated our method on a collected ESD video dataset comprising nearly 2000 clips. Experimental results demonstrate that our approach outperforms both explicit and implicit state-of-the-art methods in trajectory prediction. To our knowledge, this is the first endeavor to apply imitation-learning-based techniques to surgical skill learning in terms of dissection trajectory prediction.
Citations: 0
Confidence intervals for performance estimates in brain MRI segmentation
IF 10.7, CAS Zone 1 (Medicine)
Medical image analysis. Pub Date: 2025-05-08. DOI: 10.1016/j.media.2025.103565
Rosana El Jurdi, Gaël Varoquaux, Olivier Colliot
Abstract: Medical segmentation models are evaluated empirically. As such an evaluation is based on a limited set of example images, it is unavoidably noisy. Beyond a mean performance measure, reporting confidence intervals is thus crucial; however, this is rarely done in medical image segmentation. The width of the confidence interval depends on the test set size and on the spread of the performance measure (its standard deviation across the test set). For classification, many test images are needed to avoid wide confidence intervals. Segmentation, however, has not been studied in this respect, and it differs in the amount of information brought by a given test image. In this paper, we study typical confidence intervals in the context of segmentation in 3D brain magnetic resonance imaging (MRI). We carry out experiments using the standard nnU-Net framework, two brain MRI datasets from the Medical Decathlon challenge (hippocampus and brain tumor segmentation), and two performance measures: the Dice similarity coefficient and the Hausdorff distance. We show that parametric confidence intervals are reasonable approximations of bootstrap estimates for varying test set sizes and spreads of the performance metric. Importantly, we show that the test set size needed to achieve a given precision is often much lower than for classification tasks. Typically, a 1% wide confidence interval requires about 100–200 test samples when the spread is low (standard deviation around 3%); more difficult segmentation tasks may lead to higher spreads and require over 1000 samples. The corresponding code and notebooks are available on GitHub at https://github.com/rosanajurdi/SegVal_Repo.
Citations: 0
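The bootstrap estimates that the parametric intervals are compared against can be illustrated with a standard percentile bootstrap over per-image scores. This is a generic sketch of the technique, not the authors' released code:

```python
import random
import statistics

def bootstrap_ci(scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of a
    per-image performance metric (e.g. per-image Dice scores)."""
    rng = random.Random(seed)
    n = len(scores)
    # resample the test set with replacement, recording the mean each time
    means = sorted(statistics.fmean(rng.choices(scores, k=n))
                   for _ in range(n_boot))
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi
```

With real per-image Dice scores in place of the toy list, the interval width can be tracked as a function of test set size, which is exactly the relationship the paper quantifies.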
CausalMixNet: A mixed-attention framework for causal intervention in robust medical image diagnosis
IF 10.7, CAS Zone 1 (Medicine)
Medical image analysis. Pub Date: 2025-05-08. DOI: 10.1016/j.media.2025.103581
Yajie Zhang, Yu-An Huang, Yao Hu, Rui Liu, Jibin Wu, Zhi-An Huang, Kay Chen Tan
Abstract: Confounding factors inherent in medical images can significantly impact the causal exploration capabilities of deep learning models, resulting in compromised accuracy and diminished generalization performance. In this paper, we present an innovative methodology named CausalMixNet that employs query-mixed intra-attention and key&value-mixed inter-attention to probe causal relationships between input images and labels. To mitigate unobservable confounding factors, CausalMixNet integrates a non-local reasoning module (NLRM) and key&value-mixed inter-attention (KVMIA) to conduct a front-door adjustment strategy. Furthermore, CausalMixNet incorporates a patch-masked ranking module (PMRM) and query-mixed intra-attention (QMIA) to enhance mediator learning, thereby facilitating causal intervention. The patch-mixing mechanism applied to query/(key&value) features within QMIA and KVMIA specifically targets lesion-related feature enhancement and average causal effect inference. CausalMixNet consistently outperforms existing methods, achieving superior accuracy and F1-scores across in-domain and out-of-domain scenarios on multiple datasets, with an average improvement of 3% over the closest competitor. Demonstrating robustness against noise, gender bias, and attribute bias, CausalMixNet excels in handling unobservable confounders, maintaining stable performance even in challenging conditions.
Citations: 0
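The front-door adjustment that this line of work builds on has a simple discrete form: with a mediator m between x and y, P(y | do(x)) = Σ_m P(m|x) Σ_x' P(x') P(y|m,x'), which removes the influence of an unobserved confounder. A toy sketch over binary variables (all distributions below are invented for illustration; the paper applies this idea to learned features, not lookup tables):

```python
def p_do_x(y, x, p_m_given_x, p_x, p_y_given_mx):
    """Front-door adjustment for binary variables:
    P(y | do(x)) = sum_m P(m|x) * sum_x' P(x') * P(y|m,x')."""
    return sum(
        p_m_given_x[(m, x)] * sum(p_x[x2] * p_y_given_mx[(y, m, x2)]
                                  for x2 in (0, 1))
        for m in (0, 1)
    )
```

With P(m=1|x=1)=0.7, P(m=1|x=0)=0.2, P(x'=0)=P(x'=1)=0.5 and P(y=1|m,x') equal to 0.9 when m=1 and 0.3 when m=0, the interventional probability P(y=1|do(x=1)) works out to 0.7*0.9 + 0.3*0.3 = 0.72.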