TSdetector: Temporal–Spatial self-correction collaborative learning for colonoscopy video detection
Kai-Ni Wang, Haolin Wang, Guang-Quan Zhou, Yangang Wang, Ling Yang, Yang Chen, Shuo Li
Medical Image Analysis, vol. 100, Article 103384 (19 November 2024). DOI: 10.1016/j.media.2024.103384

CNN-based object detection models that strike a balance between performance and speed have gradually been adopted for polyp detection. Nevertheless, accurately locating polyps in complex colonoscopy video scenes remains challenging because existing methods ignore two key issues: intra-sequence distribution heterogeneity and the precision-confidence discrepancy. To address these challenges, we propose a novel Temporal–Spatial self-correction detector (TSdetector), which integrates temporal-level consistency learning and spatial-level reliability learning to detect objects continuously. Technically, we first propose a global temporal-aware convolution, which assembles preceding information to dynamically guide the current convolution kernel to focus on global features across sequences. In addition, we design a hierarchical queue integration mechanism that combines multi-temporal features in a progressive accumulation manner, fully leveraging contextual consistency information while retaining long-sequence-dependency features. Meanwhile, at the spatial level, we advance a position-aware clustering that explores the spatial relationships among candidate boxes to recalibrate prediction confidence adaptively, thus efficiently eliminating redundant bounding boxes. Experimental results on three publicly available polyp video datasets show that TSdetector achieves the highest polyp detection rate and outperforms other state-of-the-art methods. The code is available at https://github.com/soleilssss/TSdetector.
{"title":"HiFi-Syn: Hierarchical granularity discrimination for high-fidelity synthesis of MR images with structure preservation","authors":"Ziqi Yu , Botao Zhao , Shengjie Zhang , Xiang Chen , Fuhua Yan , Jianfeng Feng , Tingying Peng , Xiao-Yong Zhang","doi":"10.1016/j.media.2024.103390","DOIUrl":"10.1016/j.media.2024.103390","url":null,"abstract":"<div><div>Synthesizing medical images while preserving their structural information is crucial in medical research. In such scenarios, the preservation of anatomical content becomes especially important. Although recent advances have been made by incorporating instance-level information to guide translation, these methods overlook the spatial coherence of structural-level representation and the anatomical invariance of content during translation. To address these issues, we introduce hierarchical granularity discrimination, which exploits various levels of semantic information present in medical images. Our strategy utilizes three levels of discrimination granularity: pixel-level discrimination using a Brain Memory Bank, structure-level discrimination on each brain structure with a re-weighting strategy to focus on hard samples, and global-level discrimination to ensure anatomical consistency during translation. The image translation performance of our strategy has been evaluated on three independent datasets (UK Biobank, IXI, and BraTS 2018), and it has outperformed state-of-the-art algorithms. Particularly, our model excels not only in synthesizing normal structures but also in handling abnormal (pathological) structures, such as brain tumors, despite the variations in contrast observed across different imaging modalities due to their pathological characteristics. The diagnostic value of synthesized MR images containing brain tumors has been evaluated by radiologists. This indicates that our model may offer an alternative solution in scenarios where specific MR modalities of patients are unavailable. Extensive experiments further demonstrate the versatility of our method, providing unique insights into medical image translation.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"100 ","pages":"Article 103390"},"PeriodicalIF":10.7,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142721144","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
LACOSTE: Exploiting stereo and temporal contexts for surgical instrument segmentation
Qiyuan Wang, Shang Zhao, Zikang Xu, S. Kevin Zhou
Medical Image Analysis, vol. 99, Article 103387 (12 November 2024). DOI: 10.1016/j.media.2024.103387

Surgical instrument segmentation is instrumental to minimally invasive surgeries and related applications. Most previous methods formulate this task as single-frame-based instance segmentation, ignoring the natural temporal and stereo attributes of a surgical video. As a result, they are less robust to the appearance variation caused by temporal motion and view change. In this work, we propose a novel LACOSTE model that exploits Location-Agnostic COntexts in Stereo and TEmporal images for improved surgical instrument segmentation. With a query-based segmentation model as its core, we design three performance-enhancing modules. First, we design a disparity-guided feature propagation module to explicitly enhance depth-aware features; to generalize well even to monocular-only videos, we apply a pseudo-stereo scheme that generates complementary right images. Second, we propose a stereo-temporal set classifier, which aggregates stereo-temporal contexts in a universal way to make a consolidated prediction and mitigate transient failures. Finally, we propose a location-agnostic classifier to decouple location bias from mask prediction and enhance feature semantics. We extensively validate our approach on three public surgical video datasets, including two benchmarks from the EndoVis Challenges and the real radical prostatectomy dataset GraSP. Experimental results demonstrate the promising performance of our method, which consistently achieves results comparable or favorable to previous state-of-the-art approaches.
A survey on deep learning in medical image registration: New technologies, uncertainty, evaluation metrics, and beyond
Junyu Chen, Yihao Liu, Shuwen Wei, Zhangxing Bian, Shalini Subramanian, Aaron Carass, Jerry L. Prince, Yong Du
Medical Image Analysis, vol. 100, Article 103385 (10 November 2024). DOI: 10.1016/j.media.2024.103385

Deep learning technologies have dramatically reshaped the field of medical image registration over the past decade. The initial developments, such as regression-based and U-Net-based networks, established the foundation for deep learning in image registration. Subsequent progress has been made in various aspects of deep learning-based registration, including similarity measures, deformation regularizations, network architectures, and uncertainty estimation. These advancements have not only enriched the field of image registration but have also facilitated its application in a wide range of tasks, including atlas construction, multi-atlas segmentation, motion estimation, and 2D–3D registration. In this paper, we present a comprehensive overview of the most recent advancements in deep learning-based image registration. We begin with a concise introduction to the core concepts of deep learning-based image registration. Then, we delve into innovative network architectures, loss functions specific to registration, and methods for estimating registration uncertainty. Additionally, this paper explores appropriate evaluation metrics for assessing the performance of deep learning models in registration tasks. Finally, we highlight the practical applications of these novel techniques in medical imaging and discuss the future prospects of deep learning-based image registration.
Enhancing chest X-ray datasets with privacy-preserving large language models and multi-type annotations: A data-driven approach for improved classification
Ricardo Bigolin Lanfredi, Pritam Mukherjee, Ronald M. Summers
Medical Image Analysis, vol. 99, Article 103383 (10 November 2024). DOI: 10.1016/j.media.2024.103383

In chest X-ray (CXR) image analysis, rule-based systems are usually employed to extract labels from reports for dataset releases. However, there is still room for improvement in label quality. These labelers typically output only presence labels, sometimes with binary uncertainty indicators, which limits their usefulness. Supervised deep learning models have also been developed for report labeling but, like rule-based systems, lack adaptability. In this work, we present MAPLEZ (Medical report Annotations with Privacy-preserving Large language model using Expeditious Zero shot answers), a novel approach leveraging a locally executable Large Language Model (LLM) to extract and enhance findings labels on CXR reports. MAPLEZ extracts not only binary labels indicating the presence or absence of a finding but also its location, severity, and the radiologists' uncertainty about it. Over eight abnormalities from five test sets, we show that our method can extract these annotations with an increase of 3.6 percentage points (pp) in macro F1 score for categorical presence annotations and a more than 20 pp increase in F1 score for location annotations over competing labelers. Additionally, using the combination of improved annotations and multi-type annotations for classification supervision on a dataset of limited-resolution CXRs, we demonstrate substantial advancements in proof-of-concept classification quality, with an increase of 1.1 pp in AUROC over models trained with annotations from the best alternative approach. We share code and annotations.
IGUANe: A 3D generalizable CycleGAN for multicenter harmonization of brain MR images
Vincent Roca, Grégory Kuchcinski, Jean-Pierre Pruvo, Dorian Manouvriez, Renaud Lopes, the Australian Imaging Biomarkers and Lifestyle flagship study of ageing, the Alzheimer's Disease Neuroimage Initiative
Medical Image Analysis, vol. 99, Article 103388 (9 November 2024). DOI: 10.1016/j.media.2024.103388

In MRI studies, the aggregation of imaging data from multiple acquisition sites enhances sample size but may introduce site-related variabilities that hinder consistency in subsequent analyses. Deep learning methods for image translation have emerged as a solution for harmonizing MR images across sites. In this study, we introduce IGUANe (Image Generation with Unified Adversarial Networks), an original 3D model that leverages the strengths of domain translation and straightforward application of style transfer methods for multicenter brain MR image harmonization. IGUANe extends CycleGAN by integrating an arbitrary number of domains for training through a many-to-one architecture. The framework based on domain pairs enables the implementation of sampling strategies that prevent confusion between site-related and biological variabilities. During inference, the model can be applied to any image, even from an unknown acquisition site, making it a universal generator for harmonization. Trained on a dataset comprising T1-weighted images from 11 different scanners, IGUANe was evaluated on data from unseen sites. The assessments included the transformation of MR images with traveling subjects, the preservation of pairwise distances between MR images within domains, the evolution of volumetric patterns related to age and Alzheimer's disease (AD), and the performance in age regression and patient classification tasks. Comparisons with other harmonization and normalization methods suggest that IGUANe better preserves individual information in MR images and is more suitable for maintaining and reinforcing variabilities related to age and AD. Future studies may further assess IGUANe in other multicenter contexts, either using the same model or retraining it for applications to different image modalities. Codes and the trained IGUANe model are available at https://github.com/RocaVincent/iguane_harmonization.git.
Large-scale multi-center CT and MRI segmentation of pancreas with deep learning
Zheyuan Zhang, Elif Keles, Gorkem Durak, Yavuz Taktak, Onkar Susladkar, Vandan Gorade, Debesh Jha, Asli C. Ormeci, Alpay Medetalibeyoglu, Lanhong Yao, Bin Wang, Ilkin Sevgi Isler, Linkai Peng, Hongyi Pan, Camila Lopes Vendrami, Amir Bourhani, Yury Velichko, Boqing Gong, Concetto Spampinato, Ayis Pyrros, Ulas Bagci
Medical Image Analysis, vol. 99, Article 103382 (8 November 2024). DOI: 10.1016/j.media.2024.103382

Automated volumetric segmentation of the pancreas on cross-sectional imaging is needed for diagnosis and follow-up of pancreatic diseases. While CT-based pancreatic segmentation is more established, MRI-based segmentation methods are understudied, largely due to a lack of publicly available datasets, benchmarking research efforts, and domain-specific deep learning methods. In this retrospective study, we collected a large dataset (767 scans from 499 participants) of T1-weighted (T1W) and T2-weighted (T2W) abdominal MRI series from five centers between March 2004 and November 2022. We also collected CT scans of 1,350 patients from publicly available sources for benchmarking purposes. We introduced a new pancreas segmentation method, called PanSegNet, combining the strengths of nnUNet and a Transformer network with a new linear attention module enabling volumetric computation. We tested PanSegNet's accuracy in cross-modality (2,117 scans in total) and cross-center settings with Dice and Hausdorff distance (HD95) evaluation metrics. We used Cohen's kappa statistics to evaluate intra- and inter-rater agreement and paired t-tests for volume and Dice comparisons. For segmentation accuracy, we achieved Dice coefficients of 88.3% (±7.2%, at case level) with CT, 85.0% (±7.9%) with T1W MRI, and 86.3% (±6.4%) with T2W MRI. Pancreas volume prediction correlated highly with ground truth, with R² of 0.91, 0.84, and 0.85 for CT, T1W, and T2W, respectively. We found moderate inter-observer agreement (0.624 and 0.638 for T1W and T2W MRI, respectively) and high intra-observer agreement scores. All MRI data is made available at https://osf.io/kysnj/. Our source code is available at https://github.com/NUBagciLab/PaNSegNet.
Multi-task learning with cross-task consistency for improved depth estimation in colonoscopy
Pedro Esteban Chavarrias Solano, Andrew Bulpitt, Venkataraman Subramanian, Sharib Ali
Medical Image Analysis, vol. 99, Article 103379 (4 November 2024). DOI: 10.1016/j.media.2024.103379

Colonoscopy screening is the gold standard procedure for assessing abnormalities in the colon and rectum, such as ulcers and cancerous polyps. Measuring the abnormal mucosal area and its 3D reconstruction can help quantify the surveyed area and objectively evaluate disease burden. However, due to the complex topology of these organs and variable physical conditions (e.g., lighting, large homogeneous textures, and image modality), estimating distance from the camera (i.e., depth) is highly challenging. Moreover, most colonoscopic video acquisition is monocular, making depth estimation a non-trivial problem. While computer vision methods for depth estimation have been proposed and advanced on natural-scene datasets, their efficacy has not been widely quantified on colonoscopy datasets. As the colonic mucosa has several low-texture regions that are not well pronounced, learning representations from an auxiliary task can improve salient feature extraction, allowing estimation of accurate camera depths. In this work, we propose a novel multi-task learning (MTL) approach with a shared encoder and two decoders, namely a surface normal decoder and a depth estimator decoder. Our depth estimator incorporates attention mechanisms to enhance global context awareness. We leverage the surface normal prediction to improve geometric feature extraction. We also apply a cross-task consistency loss between the two geometrically related tasks, surface normal and camera depth. We demonstrate a 15.75% improvement in relative error and a 10.7% improvement in δ1.25 accuracy over the most accurate state-of-the-art baseline, Big-to-Small (BTS). All experiments are conducted on the recently released C3VD dataset, and we thus provide a first benchmark of state-of-the-art methods on this dataset.
Semantics and instance interactive learning for labeling and segmentation of vertebrae in CT images
Yixiao Mao, Qianjin Feng, Yu Zhang, Zhenyuan Ning
Medical Image Analysis, vol. 99, Article 103380 (1 November 2024). DOI: 10.1016/j.media.2024.103380

Automatically labeling and segmenting vertebrae in 3D CT images constitutes a complex multi-task problem. Current methods conduct vertebra labeling and semantic segmentation progressively, typically with two separate models, and may ignore feature interaction among the tasks. Although instance segmentation approaches with multi-channel prediction have been proposed to alleviate such issues, their utilization of semantic information remains insufficient. Another challenge for an accurate model is how to effectively distinguish similar adjacent vertebrae and model their sequential attribute. In this paper, we propose a Semantics and Instance Interactive Learning (SIIL) paradigm for synchronous labeling and segmentation of vertebrae in CT images. SIIL models semantic feature learning and instance feature learning: the former extracts spinal semantics while the latter distinguishes vertebral instances. Interactive learning involves semantic features to improve the separability of vertebral instances and instance features to help learn position and contour information, during which a Morphological Instance Localization Learning (MILL) module is introduced to align semantic and instance features and facilitate their interaction. Furthermore, an Ordinal Contrastive Prototype Learning (OCPL) module is devised to differentiate adjacent vertebrae with high similarity (via cross-image contrastive learning) and simultaneously model their sequential attribute (via a temporal unit). Extensive experiments on several datasets demonstrate that our method significantly outperforms other approaches in labeling and segmenting vertebrae. Our code is available at https://github.com/YuZhang-SMU/Vertebrae-Labeling-Segmentation.
Beyond strong labels: Weakly-supervised learning based on Gaussian pseudo labels for the segmentation of ellipse-like vascular structures in non-contrast CTs
Qixiang Ma, Adrien Kaladji, Huazhong Shu, Guanyu Yang, Antoine Lucas, Pascal Haigron
Medical Image Analysis, vol. 99, Article 103378 (30 October 2024). DOI: 10.1016/j.media.2024.103378

Deep learning-based automated segmentation of vascular structures in preoperative CT angiography (CTA) images contributes to computer-assisted diagnosis and interventions. While CTA is the common standard, non-contrast CT imaging has the advantage of avoiding complications associated with contrast agents. However, labor-intensive labeling and high labeling variability due to ambiguous vascular boundaries hinder conventional strong-label-based, fully-supervised learning on non-contrast CTs. This paper introduces a novel weakly-supervised framework that exploits the elliptical topology of vascular structures in CT slices. It includes an efficient annotation process based on our proposed standards, an approach for generating 2D Gaussian heatmaps that serve as pseudo labels, and a training process that combines a voxel reconstruction loss with a distribution loss over the pseudo labels. We assess the effectiveness of the proposed method on one local and two public datasets of non-contrast CT scans, focusing on the abdominal aorta. On the local dataset, our weakly-supervised approach based on pseudo labels outperforms strong-label-based fully-supervised learning (by 1.54% Dice on average) while reducing labeling time by about 82.0%. The efficiency of pseudo-label generation allows label-agnostic external data to be included in the training set, yielding a further performance improvement (2.74% Dice on average) with a 66.3% reduction in labeling time, which remains considerably less than that of strong labels. On the public datasets, the pseudo labels achieve an overall 1.95% Dice improvement for 2D models and a 68% reduction in Hausdorff distance for the 3D model.