Title: IGUANe: A 3D generalizable CycleGAN for multicenter harmonization of brain MR images
Authors: Vincent Roca, Grégory Kuchcinski, Jean-Pierre Pruvo, Dorian Manouvriez, Renaud Lopes, the Australian Imaging Biomarkers and Lifestyle flagship study of ageing, the Alzheimer's Disease Neuroimaging Initiative
Medical Image Analysis, vol. 99, Article 103388. Published 2024-11-09. DOI: 10.1016/j.media.2024.103388
Abstract: In MRI studies, the aggregation of imaging data from multiple acquisition sites increases sample size but may introduce site-related variabilities that hinder consistency in subsequent analyses. Deep learning methods for image translation have emerged as a solution for harmonizing MR images across sites. In this study, we introduce IGUANe (Image Generation with Unified Adversarial Networks), an original 3D model that combines the strengths of domain translation with the straightforward application of style transfer methods for multicenter brain MR image harmonization. IGUANe extends CycleGAN by integrating an arbitrary number of domains for training through a many-to-one architecture. The framework, based on domain pairs, enables sampling strategies that prevent confusion between site-related and biological variabilities. During inference, the model can be applied to any image, even from an unknown acquisition site, making it a universal generator for harmonization. Trained on a dataset comprising T1-weighted images from 11 different scanners, IGUANe was evaluated on data from unseen sites. The assessments included the transformation of MR images with traveling subjects, the preservation of pairwise distances between MR images within domains, the evolution of volumetric patterns related to age and Alzheimer's disease (AD), and the performance in age regression and patient classification tasks. Comparisons with other harmonization and normalization methods suggest that IGUANe better preserves individual information in MR images and is more suitable for maintaining and reinforcing variabilities related to age and AD. Future studies may further assess IGUANe in other multicenter contexts, either using the same model or retraining it for different image modalities. Code and the trained IGUANe model are available at https://github.com/RocaVincent/iguane_harmonization.git.
Title: Large-scale multi-center CT and MRI segmentation of pancreas with deep learning
Authors: Zheyuan Zhang, Elif Keles, Gorkem Durak, Yavuz Taktak, Onkar Susladkar, Vandan Gorade, Debesh Jha, Asli C. Ormeci, Alpay Medetalibeyoglu, Lanhong Yao, Bin Wang, Ilkin Sevgi Isler, Linkai Peng, Hongyi Pan, Camila Lopes Vendrami, Amir Bourhani, Yury Velichko, Boqing Gong, Concetto Spampinato, Ayis Pyrros, Ulas Bagci
Medical Image Analysis, vol. 99, Article 103382. Published 2024-11-08. DOI: 10.1016/j.media.2024.103382
Abstract: Automated volumetric segmentation of the pancreas on cross-sectional imaging is needed for diagnosis and follow-up of pancreatic diseases. While CT-based pancreatic segmentation is more established, MRI-based segmentation methods are understudied, largely due to a lack of publicly available datasets, benchmarking research efforts, and domain-specific deep learning methods. In this retrospective study, we collected a large dataset (767 scans from 499 participants) of T1-weighted (T1W) and T2-weighted (T2W) abdominal MRI series from five centers between March 2004 and November 2022. We also collected CT scans of 1,350 patients from publicly available sources for benchmarking purposes. We introduced a new pancreas segmentation method, PanSegNet, combining the strengths of nnUNet and a Transformer network with a new linear attention module enabling volumetric computation. We tested PanSegNet's accuracy in cross-modality (a total of 2,117 scans) and cross-center settings using the Dice coefficient and the 95th-percentile Hausdorff distance (HD95) as evaluation metrics. We used Cohen's kappa statistics for intra- and inter-rater agreement evaluation and paired t-tests for volume and Dice comparisons. For segmentation accuracy, we achieved Dice coefficients of 88.3% (±7.2%, at case level) on CT, 85.0% (±7.9%) on T1W MRI, and 86.3% (±6.4%) on T2W MRI. Pancreas volume prediction was highly correlated with ground truth, with R² of 0.91, 0.84, and 0.85 for CT, T1W, and T2W, respectively. We found moderate inter-observer agreement (0.624 and 0.638 for T1W and T2W MRI, respectively) and high intra-observer agreement scores. All MRI data are made available at https://osf.io/kysnj/. Our source code is available at https://github.com/NUBagciLab/PaNSegNet.
Title: Multi-task learning with cross-task consistency for improved depth estimation in colonoscopy
Authors: Pedro Esteban Chavarrias Solano, Andrew Bulpitt, Venkataraman Subramanian, Sharib Ali
Medical Image Analysis, vol. 99, Article 103379. Published 2024-11-04. DOI: 10.1016/j.media.2024.103379
Abstract: Colonoscopy screening is the gold-standard procedure for assessing abnormalities in the colon and rectum, such as ulcers and cancerous polyps. Measuring the abnormal mucosal area and its 3D reconstruction can help quantify the surveyed area and objectively evaluate disease burden. However, due to the complex topology of these organs and variable physical conditions (for example, lighting, large homogeneous textures, and image modality), estimating distance from the camera (i.e., depth) is highly challenging. Moreover, most colonoscopic video acquisition is monocular, making depth estimation a non-trivial problem. While computer vision methods for depth estimation have been proposed and advanced on natural-scene datasets, their efficacy has not been widely quantified on colonoscopy datasets. As the colonic mucosa has several low-texture regions that are not well pronounced, learning representations from an auxiliary task can improve salient feature extraction, allowing estimation of accurate camera depths. In this work, we propose a novel multi-task learning (MTL) approach with a shared encoder and two decoders: a surface normal decoder and a depth estimation decoder. Our depth estimator incorporates attention mechanisms to enhance global context awareness. We leverage the surface normal prediction to improve geometric feature extraction, and we apply a cross-task consistency loss between the two geometrically related tasks, surface normal and camera depth. We demonstrate an improvement of 15.75% in relative error and 10.7% in δ1.25 accuracy over the most accurate baseline, the state-of-the-art Big-to-Small (BTS) approach. All experiments are conducted on the recently released C3VD dataset, and thus we provide a first benchmark of state-of-the-art methods on this dataset.
Title: Semantics and instance interactive learning for labeling and segmentation of vertebrae in CT images
Authors: Yixiao Mao, Qianjin Feng, Yu Zhang, Zhenyuan Ning
Medical Image Analysis, vol. 99, Article 103380. Published 2024-11-01. DOI: 10.1016/j.media.2024.103380
Abstract: Automatically labeling and segmenting vertebrae in 3D CT images constitutes a complex multi-task problem. Current methods conduct vertebra labeling and semantic segmentation progressively, typically with two separate models, and may ignore feature interaction among the different tasks. Although instance segmentation approaches with multi-channel prediction have been proposed to alleviate these issues, their utilization of semantic information remains insufficient. An additional challenge for an accurate model is how to effectively distinguish similar adjacent vertebrae and model their sequential attribute. In this paper, we propose a Semantics and Instance Interactive Learning (SIIL) paradigm for synchronous labeling and segmentation of vertebrae in CT images. SIIL models semantic feature learning and instance feature learning, in which the former extracts spinal semantics and the latter distinguishes vertebral instances. Interactive learning uses semantic features to improve the separability of vertebral instances and instance features to help learn position and contour information, during which a Morphological Instance Localization Learning (MILL) module is introduced to align semantic and instance features and facilitate their interaction. Furthermore, an Ordinal Contrastive Prototype Learning (OCPL) module is devised to differentiate adjacent vertebrae with high similarity (via cross-image contrastive learning) and simultaneously model their sequential attribute (via a temporal unit). Extensive experiments on several datasets demonstrate that our method significantly outperforms other approaches in labeling and segmenting vertebrae. Our code is available at https://github.com/YuZhang-SMU/Vertebrae-Labeling-Segmentation.
Title: Beyond strong labels: Weakly-supervised learning based on Gaussian pseudo labels for the segmentation of ellipse-like vascular structures in non-contrast CTs
Authors: Qixiang Ma, Adrien Kaladji, Huazhong Shu, Guanyu Yang, Antoine Lucas, Pascal Haigron
Medical Image Analysis, vol. 99, Article 103378. Published 2024-10-30. DOI: 10.1016/j.media.2024.103378
Abstract: Deep learning-based automated segmentation of vascular structures in preoperative CT angiography (CTA) images contributes to computer-assisted diagnosis and interventions. While CTA is the common standard, non-contrast CT imaging has the advantage of avoiding complications associated with contrast agents. However, labor-intensive labeling and high labeling variability due to the ambiguity of vascular boundaries hinder conventional strong-label-based, fully-supervised learning in non-contrast CTs. This paper introduces a novel weakly-supervised framework that exploits the elliptical topology of vascular structures in CT slices. It comprises an efficient annotation process based on our proposed standards, an approach for generating 2D Gaussian heatmaps that serve as pseudo labels, and a training process combining a voxel reconstruction loss and a distribution loss over the pseudo labels. We assess the effectiveness of the proposed method on one local and two public datasets of non-contrast CT scans, focusing on the abdominal aorta. On the local dataset, our weakly-supervised approach based on pseudo labels outperforms strong-label-based fully-supervised learning (by 1.54% Dice on average) while reducing labeling time by around 82.0%. The efficiency of pseudo-label generation allows label-agnostic external data to be included in the training set, leading to an additional performance improvement (2.74% Dice on average) with a 66.3% reduction in labeling time, which remains considerably less than that of strong labels. On the public dataset, the pseudo labels achieve an overall Dice improvement of 1.95% for 2D models and a 68% reduction in Hausdorff distance for the 3D model.
Title: A cross-attention-based deep learning approach for predicting functional stroke outcomes using 4D CTP imaging and clinical metadata
Authors: Kimberly Amador, Noah Pinel, Anthony J. Winder, Jens Fiehler, Matthias Wilms, Nils D. Forkert
Medical Image Analysis, vol. 99, Article 103381. Published 2024-10-30. DOI: 10.1016/j.media.2024.103381
Abstract: Acute ischemic stroke (AIS) remains a global health challenge, leading to long-term functional disabilities without timely intervention. Spatio-temporal (4D) Computed Tomography Perfusion (CTP) imaging is crucial for diagnosing and treating AIS due to its ability to rapidly assess the ischemic core and penumbra. Although traditionally used to assess acute tissue status in clinical settings, 4D CTP has also been explored in research for predicting stroke tissue outcomes. However, its potential for predicting functional outcomes, especially in combination with clinical metadata, remains unexplored. Thus, this work aims to develop and evaluate a novel multimodal deep learning model for predicting functional outcomes (specifically, the 90-day modified Rankin Scale) in AIS patients by combining 4D CTP and clinical metadata. To achieve this, an intermediate fusion strategy with a cross-attention mechanism is introduced to enable a selective focus on the most relevant features and patterns from both modalities. Evaluated on a dataset comprising 70 AIS patients who underwent endovascular mechanical thrombectomy, the proposed model achieves an accuracy (ACC) of 0.77, outperforming conventional late fusion strategies (ACC = 0.73) and unimodal models based on either 4D CTP (ACC = 0.61) or clinical metadata (ACC = 0.71). The results demonstrate the superior capability of the proposed model to leverage complex inter-modal relationships, emphasizing the value of advanced multimodal fusion techniques for predicting functional stroke outcomes.
Title: Clinical knowledge-guided hybrid classification network for automatic periodontal disease diagnosis in X-ray image
Authors: Lanzhuju Mei, Ke Deng, Zhiming Cui, Yu Fang, Yuan Li, Hongchang Lai, Maurizio S. Tonetti, Dinggang Shen
Medical Image Analysis, vol. 99, Article 103376. Published 2024-10-24. DOI: 10.1016/j.media.2024.103376
Abstract: Accurate classification of periodontal disease from panoramic X-ray images carries immense clinical importance for effective diagnosis and treatment. Recent methods attempt to classify periodontal diseases from X-ray images by estimating bone loss within these images, supervised by manual radiographic annotations for segmentation or keypoint detection. However, these annotations often lack consistency with the clinical gold standard of probing measurements, potentially causing measurement inaccuracy and unstable classifications. Additionally, the diagnosis of periodontal disease requires exceptional sensitivity. To address these challenges, we introduce HC-Net, an innovative hybrid classification framework devised for accurately classifying periodontal disease from X-ray images. The framework comprises three main components: tooth-level classification, patient-level classification, and a learnable adaptive noisy-OR gate. For tooth-level classification, we first employ instance segmentation to identify each tooth individually, followed by tooth-level periodontal disease classification. For patient-level classification, we use a multi-task strategy to concurrently learn the patient-level label and a Classification Activation Map (CAM) that indicates the confidence of local lesion areas within the panoramic X-ray image. Finally, our adaptive noisy-OR gate produces a hybrid classification by combining predictions from both levels. In particular, we incorporate clinical knowledge from the workflows of professional dentists, targeting the sensitivity required for periodontal disease diagnosis. Extensive empirical testing on a dataset collected from real-world clinics demonstrates that our proposed HC-Net achieves unparalleled performance in periodontal disease classification and exhibits substantial potential for practical application.
{"title":"DACG: Dual Attention and Context Guidance model for radiology report generation","authors":"Wangyu Lang, Zhi Liu, Yijia Zhang","doi":"10.1016/j.media.2024.103377","DOIUrl":"10.1016/j.media.2024.103377","url":null,"abstract":"<div><div>Medical images are an essential basis for radiologists to write radiology reports and greatly help subsequent clinical treatment. The task of generating automatic radiology reports aims to alleviate the burden of clinical doctors writing reports and has received increasing attention this year, becoming an important research hotspot. However, there are severe issues of visual and textual data bias and long text generation in the medical field. Firstly, Abnormal areas in radiological images only account for a small portion, and most radiological reports only involve descriptions of normal findings. Secondly, there are still significant challenges in generating longer and more accurate descriptive texts for radiology report generation tasks. In this paper, we propose a new Dual Attention and Context Guidance (DACG) model to alleviate visual and textual data bias and promote the generation of long texts. We use a Dual Attention Module, including a Position Attention Block and a Channel Attention Block, to extract finer position and channel features from medical images, enhancing the image feature extraction ability of the encoder. We use the Context Guidance Module to integrate contextual information into the decoder and supervise the generation of long texts. The experimental results show that our proposed model achieves state-of-the-art performance on the most commonly used IU X-ray and MIMIC-CXR datasets. Further analysis also proves that our model can improve reporting through more accurate anomaly detection and more detailed descriptions. The source code is available at <span><span>https://github.com/LangWY/DACG</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"99 ","pages":"Article 103377"},"PeriodicalIF":10.7,"publicationDate":"2024-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142554249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Simulation-free prediction of atrial fibrillation inducibility with the fibrotic kernel signature
Authors: Tomás Banduc, Luca Azzolin, Martin Manninger, Daniel Scherr, Gernot Plank, Simone Pezzuto, Francisco Sahli Costabal
Medical Image Analysis, vol. 99, Article 103375. Published 2024-10-22. DOI: 10.1016/j.media.2024.103375
Abstract: Computational models of atrial fibrillation (AF) can help improve the success rates of interventions such as ablation. However, evaluating the efficacy of different treatments requires performing multiple costly simulations, pacing at different points and checking whether AF has been induced, which hinders the clinical application of these models. In this work, we propose a classification method that can predict AF inducibility in patient-specific cardiac models without running additional simulations. Our methodology does not require re-training when the atrial anatomy or fibrotic pattern changes. To achieve this, we develop a set of features given by a variant of the heat kernel signature that incorporates fibrotic pattern information and fiber orientations: the fibrotic kernel signature (FKS). The FKS is faster to compute than a single AF simulation, and when paired with machine learning classifiers, it can predict AF inducibility over the entire domain. To learn the relationship between the FKS and AF inducibility, we performed 2,371 AF simulations comprising 6 different anatomies and various fibrotic patterns, which we split into training and testing sets. We obtain a median F1 score of 85.2% on the test set and can predict the overall inducibility with a mean absolute error of 2.76 percentage points, which is lower than alternative methods. We believe our method can significantly speed up the calculation of AF inducibility, which is crucial for optimizing AF therapies within clinical timelines. An example of the FKS for an open-source model is provided at https://github.com/tbanduc/FKS_AtrialModel_Ferrer.git.
Title: An objective comparison of methods for augmented reality in laparoscopic liver resection by preoperative-to-intraoperative image fusion from the MICCAI2022 challenge
Authors: Sharib Ali, Yamid Espinel, Yueming Jin, Peng Liu, Bianca Güttner, Xukun Zhang, Lihua Zhang, Tom Dowrick, Matthew J. Clarkson, Shiting Xiao, Yifan Wu, Yijun Yang, Lei Zhu, Dai Sun, Lan Li, Micha Pfeiffer, Shahid Farid, Lena Maier-Hein, Emmanuel Buc, Adrien Bartoli
Medical Image Analysis, vol. 99, Article 103371. Published 2024-10-22. DOI: 10.1016/j.media.2024.103371
Abstract: Augmented reality for laparoscopic liver resection is a visualisation mode that allows a surgeon to localise tumours and vessels embedded within the liver by projecting them on top of a laparoscopic image. Preoperative 3D models extracted from Computed Tomography (CT) or Magnetic Resonance (MR) imaging data are registered to the intraoperative laparoscopic images during this process. For the 3D-2D fusion, most algorithms use anatomical landmarks to guide registration, such as the liver's inferior ridge, the falciform ligament, and the occluding contours. These are usually marked by hand in both the laparoscopic image and the 3D model, which is time-consuming and prone to error. There is therefore a need to automate this process so that augmented reality can be used effectively in the operating room. We present the Preoperative-to-Intraoperative Laparoscopic Fusion challenge (P2ILF), held during the Medical Image Computing and Computer Assisted Intervention (MICCAI 2022) conference, which investigates the possibility of detecting these landmarks automatically and using them in registration. The challenge was divided into two tasks: (1) 2D and 3D landmark segmentation and (2) 3D-2D registration. The teams were provided with training data consisting of 167 laparoscopic images and 9 preoperative 3D models from 9 patients, with the corresponding 2D and 3D landmark annotations. A total of 6 teams from 4 countries participated in the challenge, and their results were assessed for each task independently. All teams proposed deep learning-based methods for the 2D and 3D landmark segmentation tasks and differentiable-rendering-based methods for the registration task. The proposed methods were evaluated on 16 test images and 2 preoperative 3D models from 2 patients. In Task 1, the teams were able to segment most of the 2D landmarks, while the 3D landmarks proved more challenging to segment. In Task 2, only one team obtained acceptable qualitative and quantitative registration results. Based on the experimental outcomes, we propose three key hypotheses that determine current limitations and future directions for research in this domain.