Dual attention model with reinforcement learning for classification of histology whole-slide images
Manahil Raza, Ruqayya Awan, Raja Muhammad Saad Bashir, Talha Qaiser, Nasir M. Rajpoot
Computerized Medical Imaging and Graphics, vol. 118, Article 102466 (2024-11-19). DOI: 10.1016/j.compmedimag.2024.102466

Digital whole slide images (WSIs) are generally captured at microscopic resolution and encompass extensive spatial data (several billion pixels per image). Feeding these images directly to deep learning models is computationally intractable due to memory constraints, while downsampling the WSIs risks information loss. Alternatively, splitting the WSIs into smaller patches (or tiles) may lose important contextual information. In this paper, we propose a novel dual attention approach consisting of two main components, both inspired by the visual examination process of a pathologist. The first, a soft attention model, processes a low-magnification view of the WSI to identify relevant regions of interest (ROIs), followed by a custom sampling method that extracts diverse and spatially distinct image tiles from the selected ROIs. The second, a hard attention classification model, extracts a sequence of multi-resolution glimpses from each tile for classification. Since hard attention is non-differentiable, we train this component using reinforcement learning to predict the locations of the glimpses. This approach allows the model to focus on essential regions instead of processing the entire tile, aligning with a pathologist's way of diagnosis. The two components are trained end-to-end using a joint loss function. The proposed model was evaluated on two WSI-level classification problems: human epidermal growth factor receptor 2 (HER2) scoring on breast cancer histology images and prediction of the Intact/Loss status of two mismatch repair (MMR) biomarkers from colorectal cancer histology images. We show that the proposed model achieves performance better than or comparable to state-of-the-art methods while processing less than 10% of the WSI at the highest magnification and reducing the time required to infer the WSI-level label by more than 75%. The code is available on GitHub.
Circumpapillary OCT-based multi-sector analysis of retinal layer thickness in patients with glaucoma and high myopia
Mateo Gende, Joaquim de Moura, Patricia Robles, Jose Fernández-Vigo, José M. Martínez-de-la-Casa, GlaucoClub AI, Julián García-Feijóo, Jorge Novo, Marcos Ortega
Computerized Medical Imaging and Graphics, vol. 118, Article 102464 (2024-11-19). DOI: 10.1016/j.compmedimag.2024.102464

Glaucoma is the leading cause of irreversible blindness worldwide. Diagnosing glaucoma involves measuring the thickness of retinal layers in order to track their degeneration. The elongated shape of highly myopic eyes can hinder this diagnostic process, since it affects the OCT scanning process, producing deformations that can mimic or mask the degeneration caused by glaucoma. In this work, we present the first comprehensive cross-disease analysis focused on the anatomical structures most affected in patients with glaucoma and high myopia, facilitating precise differential diagnosis from those affected by myopia alone. To achieve this, a fully automatic approach for retinal layer segmentation was specifically tailored to the accurate measurement of retinal thickness in both highly myopic and emmetropic eyes. To the best of our knowledge, this is the first approach proposed for the analysis of retinal layers in circumpapillary optical coherence tomography images that takes into account the elongation of myopic eyes, thus addressing critical diagnostic needs. The results of this study indicate that the temporal superior (mean difference 11.1 μm, p < 0.05), nasal inferior (13.1 μm, p < 0.01) and temporal inferior (13.3 μm, p < 0.01) sectors of the retinal nerve fibre layer show the most significant reduction in retinal thickness in patients with both glaucoma and myopia relative to patients with myopia alone.
CIS-UNet: Multi-class segmentation of the aorta in computed tomography angiography via context-aware shifted window self-attention
Muhammad Imran, Jonathan R. Krebs, Veera Rajasekhar Reddy Gopu, Brian Fazzone, Vishal Balaji Sivaraman, Amarjeet Kumar, Chelsea Viscardi, Robert Evans Heithaus, Benjamin Shickel, Yuyin Zhou, Michol A. Cooper, Wei Shao
Computerized Medical Imaging and Graphics, vol. 118, Article 102470 (2024-11-19). DOI: 10.1016/j.compmedimag.2024.102470

Advancements in medical imaging and endovascular grafting have facilitated minimally invasive treatments for aortic diseases. Accurate 3D segmentation of the aorta and its branches is crucial for interventions, as inaccurate segmentation can lead to erroneous surgical planning and endograft construction. Previous methods simplified aortic segmentation to a binary image segmentation problem, overlooking the need to distinguish between individual aortic branches. In this paper, we introduce Context-Infused Swin-UNet (CIS-UNet), a deep learning model designed for multi-class segmentation of the aorta and thirteen aortic branches. Combining the strengths of convolutional neural networks (CNNs) and Swin transformers, CIS-UNet adopts a hierarchical encoder-decoder structure comprising a CNN encoder, a symmetric decoder, skip connections, and a novel Context-aware Shifted Window Self-Attention (CSW-SA) module as the bottleneck block. Notably, CSW-SA introduces a unique adaptation of the patch merging layer, distinct from its traditional use in Swin transformers. CSW-SA efficiently condenses the feature map, providing global spatial context, and enhances performance when applied at the bottleneck layer, offering superior computational efficiency and segmentation accuracy compared to the Swin transformers. We evaluated our model on computed tomography (CT) scans from 59 patients through 4-fold cross-validation. CIS-UNet outperformed the state-of-the-art Swin UNetR segmentation model, achieving a superior mean Dice coefficient of 0.732 (versus 0.717) and a mean surface distance of 2.40 mm (versus 2.75 mm). CIS-UNet's superior 3D aortic segmentation offers improved accuracy and optimization for planning endovascular treatments. Our dataset and code will be made publicly available at https://github.com/mirthAI/CIS-UNet.
Cervical OCT image classification using contrastive masked autoencoders with Swin Transformer
Qingbin Wang, Yuxuan Xiong, Hanfeng Zhu, Xuefeng Mu, Yan Zhang, Yutao Ma
Computerized Medical Imaging and Graphics, vol. 118, Article 102469 (2024-11-19). DOI: 10.1016/j.compmedimag.2024.102469

Background and Objective: Cervical cancer poses a major health threat to women globally. Optical coherence tomography (OCT) imaging has recently shown promise for non-invasive cervical lesion diagnosis. However, obtaining high-quality labeled cervical OCT images is challenging and time-consuming because they must correspond precisely with pathological results. The scarcity of such high-quality labeled data hinders the application of supervised deep-learning models in practical clinical settings. This study addresses the above challenge by proposing CMSwin, a novel self-supervised learning (SSL) framework combining masked image modeling (MIM) with contrastive learning based on the Swin-Transformer architecture to utilize abundant unlabeled cervical OCT images.

Methods: In this contrastive-MIM framework, mixed image encoding is combined with a latent contextual regressor to resolve the inconsistency between pre-training and fine-tuning and to separate the encoder's feature-extraction task from the decoder's reconstruction task, allowing the encoder to extract better image representations. In addition, contrastive losses at the patch and image levels are carefully designed to leverage massive unlabeled data.

Results: We validated the superiority of CMSwin over state-of-the-art SSL approaches with five-fold cross-validation on an OCT image dataset containing 1,452 patients from a multi-center clinical study in China, plus two external validation sets from top-ranked Chinese hospitals: the Huaxi dataset from the West China Hospital of Sichuan University and the Xiangya dataset from the Xiangya Second Hospital of Central South University. A human-machine comparison experiment on the Huaxi and Xiangya datasets for volume-level binary classification also indicates that CMSwin can match or exceed the average level of four skilled medical experts, especially in identifying high-risk cervical lesions.

Conclusion: Our work has great potential to assist gynecologists in intelligently interpreting cervical OCT images in clinical settings. Additionally, the integrated GradCAM module of CMSwin enables cervical lesion visualization and interpretation, providing good interpretability for gynecologists to diagnose cervical diseases efficiently.
A Parkinson's disease-related nuclei segmentation network based on CNN-Transformer interleaved encoder with feature fusion
Hongyi Chen, Junyan Fu, Xiao Liu, Zhiji Zheng, Xiao Luo, Kun Zhou, Zhijian Xu, Daoying Geng
Computerized Medical Imaging and Graphics, vol. 118, Article 102465 (2024-11-19). DOI: 10.1016/j.compmedimag.2024.102465

Automatic segmentation of Parkinson's disease (PD) related deep gray matter (DGM) nuclei based on brain magnetic resonance imaging (MRI) is significant in assisting the diagnosis of PD. However, due to degeneration-induced changes in appearance, low tissue contrast, and the tiny size of DGM nuclei in elderly brain MRI images, many existing segmentation models are limited in this application. To address these challenges, this paper proposes a PD-related DGM nuclei segmentation network that provides precise prior knowledge for aiding the diagnosis of PD. The encoder of the network is designed as an alternating encoding structure in which the convolutional neural network (CNN) captures spatial and depth texture features, while the Transformer complements global position information between DGM nuclei. Moreover, we propose a cascaded channel-spatial-wise block to fuse features extracted by the CNN and Transformer, thereby achieving more precise DGM nuclei segmentation. The decoder incorporates a symmetrical boundary attention module that leverages the symmetrical structure of the bilateral nuclei regions by constructing signed distance maps for symmetric differences, optimizing segmentation boundaries. Furthermore, we employ a dynamic adaptive region-of-interest-weighted Dice loss to enhance sensitivity towards smaller structures, thereby improving segmentation accuracy. In quantitative analysis, our method achieved the best average values for PD-related DGM nuclei (DSC: 0.854, IoU: 0.750, HD95: 1.691 mm, ASD: 0.195 mm). Experiments conducted on multi-center clinical datasets and public datasets demonstrate the good generalizability of the proposed method. Furthermore, a volumetric analysis of the segmentation results reveals significant differences between healthy controls (HCs) and PD patients. Our method holds promise for assisting clinicians in the rapid and accurate diagnosis of PD, offering a practical method for the imaging analysis of neurodegenerative diseases.
Retinal structure guidance-and-adaption network for early Parkinson's disease recognition based on OCT images
Hanfeng Shi, Jiaqi Wei, Richu Jin, Jiaxin Peng, Xingyue Wang, Yan Hu, Xiaoqing Zhang, Jiang Liu
Computerized Medical Imaging and Graphics, vol. 118, Article 102463 (2024-11-19). DOI: 10.1016/j.compmedimag.2024.102463

Parkinson's disease (PD) is a leading neurodegenerative disease globally, and precise, objective PD diagnosis is significant for early intervention and treatment. Recent studies have shown significant correlations between retinal structure information and PD based on optical coherence tomography (OCT) images, providing another potential means for early PD recognition. However, how to exploit retinal structure information (e.g., thickness and mean intensity) from different retinal layers to improve PD recognition performance has not been studied before. Motivated by these observations, we first propose a structural prior knowledge extraction (SPKE) module to obtain retinal structure feature maps; we then develop a structure-guided-and-adaption attention (SGDA) module to fully leverage the potential of different retinal layers based on the extracted retinal structure feature maps. By embedding the SPKE and SGDA modules at the early stages of a deep neural network (DNN), a retinal structure-guided-and-adaption network (RSGA-Net) is constructed for early PD recognition based on OCT images. Extensive experiments on a clinical OCT-PD dataset demonstrate the superiority of RSGA-Net over state-of-the-art methods. Additionally, we provide a visual analysis to explain how retinal structure information affects the decision-making process of DNNs.
{"title":"Exploratory analysis of Type B Aortic Dissection (TBAD) segmentation in 2D CTA images using various kernels","authors":"Ayman Abaid , Srinivas Ilancheran , Talha Iqbal , Niamh Hynes , Ihsan Ullah","doi":"10.1016/j.compmedimag.2024.102460","DOIUrl":"10.1016/j.compmedimag.2024.102460","url":null,"abstract":"<div><div>Type-B Aortic Dissection is a rare but fatal cardiovascular disease characterized by a tear in the inner layer of the aorta, affecting 3.5 per 100,000 individuals annually. In this work, we explore the feasibility of leveraging two-dimensional Convolutional Neural Network (CNN) models to perform accurate slice-by-slice segmentation of true lumen, false lumen and false lumen thrombus in Computed Tomography Angiography images. The study performed an exploratory analysis of three 2D U-Net models: the baseline 2D U-Net, a variant of U-Net with atrous convolutions, and a U-Net with a custom layer featuring a position-oriented, partially shared weighting scheme kernel. These models were trained and benchmarked against a state-of-the-art baseline 3D U-Net model. Overall, our U-Net with the VGG19 encoder architecture achieved the best performance score among all other models, with a mean Dice score of 80.48% and an IoU score of 72.93%. The segmentation results were also compared with the Segment Anything Model (SAM) and the UniverSeg models. Our findings indicate that our 2D U-Net models excel in false lumen and true lumen segmentation accuracy while achieving lower false lumen thrombus segmentation accuracy compared to the state-of-the-art 3D U-Net model. The study findings highlight the complexities involved in developing segmentation models, especially for cardiovascular medical images, and emphasize the importance of developing lightweight models for real-time decision-making to improve overall patient care.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"118 ","pages":"Article 102460"},"PeriodicalIF":5.4,"publicationDate":"2024-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142693784","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring transformer reliability in clinically significant prostate cancer segmentation: A comprehensive in-depth investigation","authors":"Gustavo Andrade-Miranda , Pedro Soto Vega , Kamilia Taguelmimt , Hong-Phuong Dang , Dimitris Visvikis , Julien Bert","doi":"10.1016/j.compmedimag.2024.102459","DOIUrl":"10.1016/j.compmedimag.2024.102459","url":null,"abstract":"<div><div>Despite the growing prominence of transformers in medical image segmentation, their application to clinically significant prostate cancer (csPCa) has been overlooked. Minimal attention has been paid to domain shift analysis and uncertainty assessment, critical for safely implementing computer-aided diagnosis (CAD) systems. Domain shift in medical imagery refers to differences between the data used to train a model and the data evaluated later, arising from variations in imaging equipment, protocols, patient populations, and acquisition noise. While recent models enhance in-domain performance, areas such as robustness and uncertainty estimation in out-of-domain distributions have received limited investigation, creating indecisiveness about model reliability. In contrast, our study addresses csPCa at voxel, lesion, and image levels, investigating models from traditional U-Net to cutting-edge transformers. We focus on four key points: robustness, calibration, out-of-distribution (OOD), and misclassification detection (MD). Findings show that transformer-based models exhibit enhanced robustness at image and lesion levels, both in and out of domain. However, this improvement is not fully translated to the voxel level, where Convolutional Neural Networks (CNNs) outperform in most robustness metrics. Regarding uncertainty, hybrid transformers and transformer encoders performed better, but this trend depends on misclassification or out-of-distribution tasks.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"118 ","pages":"Article 102459"},"PeriodicalIF":5.4,"publicationDate":"2024-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142683103","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"NACNet: A histology context-aware transformer graph convolution network for predicting treatment response to neoadjuvant chemotherapy in Triple Negative Breast Cancer","authors":"Qiang Li , George Teodoro , Yi Jiang , Jun Kong","doi":"10.1016/j.compmedimag.2024.102467","DOIUrl":"10.1016/j.compmedimag.2024.102467","url":null,"abstract":"<div><div>Neoadjuvant chemotherapy (NAC) response prediction for triple negative breast cancer (TNBC) patients is a challenging task clinically as it requires understanding complex histology interactions within the tumor microenvironment (TME). Digital whole slide images (WSIs) capture detailed tissue information, but their giga-pixel size necessitates computational methods based on multiple instance learning, which typically analyze small, isolated image tiles without the spatial context of the TME. To address this limitation and incorporate TME spatial histology interactions in predicting NAC response for TNBC patients, we developed a histology context-aware transformer graph convolution network (NACNet). Our deep learning method identifies the histopathological labels on individual image tiles from WSIs, constructs a spatial TME graph, and represents each node with features derived from tissue texture and social network analysis. It predicts NAC response using a transformer graph convolution network model enhanced with graph isomorphism network layers. We evaluate our method with WSIs of a cohort of TNBC patient (N=105) and compared its performance with multiple state-of-the-art machine learning and deep learning models, including both graph and non-graph approaches. Our NACNet achieves 90.0% accuracy, 96.0% sensitivity, 88.0% specificity, and an AUC of 0.82, through eight-fold cross-validation, outperforming baseline models. These comprehensive experimental results suggest that NACNet holds strong potential for stratifying TNBC patients by NAC response, thereby helping to prevent overtreatment, improve patient quality of life, reduce treatment cost, and enhance clinical outcomes, marking an important advancement toward personalized breast cancer treatment.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"118 ","pages":"Article 102467"},"PeriodicalIF":5.4,"publicationDate":"2024-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142700926","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Self-supervised multi-modal feature fusion for predicting early recurrence of hepatocellular carcinoma
Sen Wang, Ying Zhao, Jiayi Li, Zongmin Yi, Jun Li, Can Zuo, Yu Yao, Ailian Liu
Computerized Medical Imaging and Graphics, vol. 118, Article 102457 (2024-11-14). DOI: 10.1016/j.compmedimag.2024.102457

Surgical resection is the primary treatment option for early-stage hepatocellular carcinoma (HCC) patients, and postoperative early recurrence (ER) is a significant contributor to the mortality of HCC patients. Accurately predicting the risk of ER after curative resection is therefore crucial for clinical decision-making and improving patient prognosis. This study leverages a self-supervised multi-modal feature fusion approach, combining multi-phase MRI and clinical features, to predict ER of HCC. Specifically, we utilize attention mechanisms to suppress redundant features, enabling efficient extraction and fusion of multi-phase features. Through self-supervised learning (SSL), we pretrain an encoder on our dataset to extract more generalizable feature representations. Finally, we achieve effective multi-modal information fusion via attention modules. To enhance explainability, we employ Score-CAM to visualize the key regions influencing the model's predictions. We evaluated the effectiveness of the proposed method on our dataset and found that predictions based on multi-phase feature fusion outperformed those based on single-phase features, and that predictions based on multi-modal feature fusion were superior to those based on single-modal features.