PET-based lesion graphs meet clinical data: An interpretable cross-attention framework for DLBCL treatment response prediction
Oriane Thiery, Mira Rizkallah, Clément Bailly, Caroline Bodet-Milin, Emmanuel Itti, René-Olivier Casasnovas, Steven Le Gouill, Thomas Carlier, Diana Mateus
Computerized Medical Imaging and Graphics, vol. 120, Article 102481. DOI: 10.1016/j.compmedimag.2024.102481. Published 2024-12-25.
Abstract: Diffuse large B-cell lymphoma (DLBCL) is a lymphatic cancer of steadily growing incidence. Its diagnosis and follow-up rely on the analysis of clinical biomarkers and 18F-fluorodeoxyglucose (FDG) PET/CT images. In this context, we target the problem of assisting in the early identification of high-risk DLBCL patients from both images and tabular clinical data. We propose a solution based on a graph neural network model capable of simultaneously modeling the variable number of lesions across patients and fusing information both across data modalities and over lesions. Given the distributed nature of DLBCL lesions, we represent the PET image of each patient as an attributed lesion graph. Such lesion graphs keep all relevant image information while offering a compact tradeoff between characterizing full images and single lesions. We also design a cross-attention module to fuse the image attributes with clinical indicators, which is particularly challenging given the large difference in dimensionality and prognostic strength of each modality. To this end, we propose several cross-attention configurations, discuss the implications of each design, and experimentally compare their performances. The last module fuses the updated attributes across lesions and makes a probabilistic prediction of the patient's 2-year progression-free survival (PFS). We carry out the experimental validation of our framework on a prospective multicentric dataset of 545 patients. Experimental results show that our framework effectively integrates the multi-lesion image information, improving over a model relying only on the most prognostic clinical data. The analysis further shows the interpretable properties inherent to our graph-based design, which enable tracing the decision back to the most important lesions and features.
{"title":"General retinal layer segmentation in OCT images via reinforcement constraint","authors":"Jinbao Hao, Huiqi Li, Shuai Lu, Zeheng Li, Weihang Zhang","doi":"10.1016/j.compmedimag.2024.102480","DOIUrl":"10.1016/j.compmedimag.2024.102480","url":null,"abstract":"<div><div>The change of layer thickness of retina is closely associated with the development of ocular diseases such as glaucoma and optic disc drusen. Optical coherence tomography (OCT) is a widely used technology to visualize the lamellar structures of retina. Accurate segmentation of retinal lamellar structures is crucial for diagnosis, treatment, and related research of ocular diseases. However, existing studies have focused on improving the segmentation accuracy, they cannot achieve consistent segmentation performance on different types of datasets, such as retinal OCT images with optic disc and interference of diseases. To this end, a general retinal layer segmentation method is presented in this paper. To obtain more continuous and smoother boundaries, feature enhanced decoding module with reinforcement constraint is proposed, fusing boundary prior and distribution prior, and correcting bias in learning process simultaneously. To enhance the model’s perception of the slender retinal structure, position channel attention is introduced, obtaining global dependencies of both space and channel. To handle the imbalanced distribution of retinal OCT images, focal loss is introduced, guiding the model to pay more attention to retinal layers with a smaller proportion. The designed method achieves the state-of-the-art (SOTA) overall performance on five datasets (i.e., MGU, DUKE, NR206, OCTA500 and private dataset).</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"120 ","pages":"Article 102480"},"PeriodicalIF":5.4,"publicationDate":"2024-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142933309","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Computer-assisted diagnosis for axillary lymph node metastasis of early breast cancer based on transformer with dual-modal adaptive mid-term fusion using ultrasound elastography
Chihao Gong, Yinglan Wu, Guangyuan Zhang, Xuan Liu, Xiaoyao Zhu, Nian Cai, Jian Li
Computerized Medical Imaging and Graphics, vol. 119, Article 102472. DOI: 10.1016/j.compmedimag.2024.102472. Published 2024-11-26.
Abstract: Accurate preoperative qualitative assessment of axillary lymph node metastasis (ALNM) in early breast cancer patients is crucial for precise clinical staging and the selection of axillary treatment strategies. Although previous studies have introduced artificial intelligence (AI) to enhance ALNM assessment, they focus on the prediction performance of their AI models and neglect how the models assist radiologists, which limits their use in clinical practice. To this end, we propose a human–AI collaboration strategy for ALNM diagnosis of early breast cancer, in which a novel deep learning framework, termed DAMF-former, is designed to assist radiologists in evaluating ALNM. Specifically, the DAMF-former focuses on the axillary region rather than the primary tumor area considered in previous studies. To mimic radiologists' alternating review of the ultrasound elastography (UE) images of the target axillary lymph nodes for comprehensive analysis, adaptive mid-term fusion is proposed to alternately extract and adaptively fuse high-level features from the dual-modal UE images (B-mode ultrasound and shear wave elastography). To further improve the diagnostic outcome of the DAMF-former, an adaptive Youden index scheme is proposed to handle the fully fused dual-modal UE image features at the end of the framework, balancing diagnostic performance in terms of sensitivity and specificity. The clinical experiment indicates that the designed DAMF-former can assist and improve the diagnostic abilities of less experienced radiologists for ALNM. In particular, junior radiologists significantly improved their diagnostic outcome from 0.807 AUC [95% CI: 0.781, 0.830] to 0.883 AUC [95% CI: 0.861, 0.902] (P-value < 0.0001). Moreover, there was strong agreement among radiologists of different experience levels when assisted by the DAMF-former (Kappa values ranging from 0.805 to 0.895; P-value < 0.0001), suggesting that less experienced radiologists can potentially reach a diagnostic level similar to that of experienced radiologists through human–AI collaboration. This study explores a potential solution to human–AI collaboration for ALNM diagnosis based on UE images.
Uncertainty-aware regression model to predict post-operative visual acuity in patients with macular holes
Burak Kucukgoz, Ke Zou, Declan C. Murphy, David H. Steel, Boguslaw Obara, Huazhu Fu
Computerized Medical Imaging and Graphics, vol. 119, Article 102461. DOI: 10.1016/j.compmedimag.2024.102461. Published 2024-11-26.
Abstract: Full-thickness macular holes are a relatively common and visually disabling condition with a prevalence of approximately 0.5% in the over-40-year-old age group. If left untreated, the hole typically enlarges, reducing visual acuity (VA) in the affected eye below the definition of blindness. They are now routinely treated with surgery, which can close the hole and improve vision in most cases. The extent of improvement, however, is variable and depends on the size of the hole and other features that can be discerned in spectral-domain optical coherence tomography (OCT) imaging, which is now routinely available in eye clinics globally. Artificial intelligence (AI) models have been developed to support surgical decision-making and have achieved relatively high predictive performance. However, their black-box behavior is opaque to users, and the uncertainty associated with their predictions is not typically stated, leading to a lack of trust among clinicians and patients. In this paper, we describe an uncertainty-aware regression model (U-ARM) for predicting VA in people undergoing macular hole surgery from preoperative spectral-domain OCT images, achieving an MAE of 6.07, an RMSE of 9.11, and an R² of 0.47 in internal tests, and an MAE of 6.49, an RMSE of 9.49, and an R² of 0.42 in external tests. In addition to predicting post-operative VA, U-ARM reports the uncertainty associated with each prediction (p-value < 0.005 in internal and external tests), showing the predictions are not due to random chance. We then qualitatively evaluate the performance of U-ARM. Lastly, we demonstrate performance on out-of-sample data, with the model generalizing well to data outside the training distribution, low-quality images, and unseen instances not encountered during training. The results show that U-ARM outperforms commonly used methods in terms of prediction and reliability. U-ARM is thus a promising approach for clinical settings and can improve the reliability of AI models in predicting VA.
Self-supervised learning on dual-sequence magnetic resonance imaging for automatic segmentation of nasopharyngeal carcinoma
Zongyou Cai, Zhangnan Zhong, Haiwei Lin, Bingsheng Huang, Ziyue Xu, Bin Huang, Wei Deng, Qiting Wu, Kaixin Lei, Jiegeng Lyu, Yufeng Ye, Hanwei Chen, Jian Zhang
Computerized Medical Imaging and Graphics, vol. 118, Article 102471. DOI: 10.1016/j.compmedimag.2024.102471. Published 2024-11-22.
Abstract: Automating the segmentation of nasopharyngeal carcinoma (NPC) is crucial for therapeutic procedures but is challenging given the difficulty of amassing extensively annotated datasets. Although previous studies have applied self-supervised learning to capitalize on unlabeled data and improve segmentation performance, these methods often overlook the benefits of dual-sequence magnetic resonance imaging (MRI). In the present study, we incorporated self-supervised learning with a saliency transformation module, using unlabeled dual-sequence MRI for accurate NPC segmentation. Data from 44 labeled and 72 unlabeled patients were collected to develop and evaluate our network. Impressively, our network achieved a mean Dice similarity coefficient (DSC) of 0.77, which is consistent with a previous study that relied on a training set of 4,100 annotated cases. The results further revealed that our approach required minimal adjustments, primarily tweaks of < 20% in the DSC, to meet clinical standards. By enhancing the automatic segmentation of NPC, our method alleviates the annotation burden on oncologists, curbs subjectivity, and ensures reliable NPC delineation.
Single color digital H&E staining with In-and-Out Net
Mengkun Chen, Yen-Tung Liu, Fadeel Sher Khan, Matthew C. Fox, Jason S. Reichenberg, Fabiana C.P.S. Lopes, Katherine R. Sebastian, Mia K. Markey, James W. Tunnell
Computerized Medical Imaging and Graphics, vol. 118, Article 102468. DOI: 10.1016/j.compmedimag.2024.102468. Published 2024-11-20.
Abstract: Digital staining streamlines traditional staining procedures by digitally generating stained images from unstained or differently stained images. While conventional staining methods involve time-consuming chemical processes, digital staining offers an efficient and low-infrastructure alternative. By leveraging microscopy-based techniques such as confocal microscopy, researchers can expedite tissue analysis without physical sectioning. However, interpreting grayscale or pseudo-color microscopic images remains challenging for pathologists and surgeons accustomed to traditional histologically stained images. To fill this gap, various studies have explored digitally simulating staining to mimic targeted histological stains. This paper introduces a novel network, In-and-Out Net, designed explicitly for digital staining tasks. Based on generative adversarial networks (GANs), our model efficiently transforms reflectance confocal microscopy (RCM) images into hematoxylin and eosin (H&E)-stained images. Using aluminum chloride preprocessing of skin tissue, we enhance nuclei contrast in RCM images. We trained the model with digital H&E labels featuring two fluorescence channels, eliminating the need for image registration and providing pixel-level ground truth. Our contributions include proposing an optimal training strategy, conducting a comparative analysis demonstrating state-of-the-art performance, validating the model through an ablation study, and collecting perfectly matched input and ground-truth images without registration. In-and-Out Net shows promising results, offering a valuable tool for digital staining tasks and advancing the field of histological image analysis.
Dual attention model with reinforcement learning for classification of histology whole-slide images
Manahil Raza, Ruqayya Awan, Raja Muhammad Saad Bashir, Talha Qaiser, Nasir M. Rajpoot
Computerized Medical Imaging and Graphics, vol. 118, Article 102466. DOI: 10.1016/j.compmedimag.2024.102466. Published 2024-11-19.
Abstract: Digital whole-slide images (WSIs) are generally captured at microscopic resolution and encompass extensive spatial data (several billion pixels per image). Directly feeding these images to deep learning models is computationally intractable due to memory constraints, while downsampling the WSIs risks information loss. Alternatively, splitting the WSIs into smaller patches (or tiles) may result in a loss of important contextual information. In this paper, we propose a novel dual-attention approach consisting of two main components, both inspired by the visual examination process of a pathologist. The first, a soft attention model, processes a low-magnification view of the WSI to identify relevant regions of interest (ROIs), followed by a custom sampling method that extracts diverse and spatially distinct image tiles from the selected ROIs. The second component, a hard attention classification model, further extracts a sequence of multi-resolution glimpses from each tile for classification. Since hard attention is non-differentiable, we train this component using reinforcement learning to predict the locations of the glimpses. This approach allows the model to focus on essential regions instead of processing the entire tile, thereby aligning with a pathologist's way of diagnosis. The two components are trained in an end-to-end fashion using a joint loss function to demonstrate the efficacy of the model. The proposed model was evaluated on two WSI-level classification problems: human epidermal growth factor receptor 2 (HER2) scoring on breast cancer histology images and prediction of the Intact/Loss status of two mismatch repair (MMR) biomarkers from colorectal cancer histology images. We show that the proposed model achieves performance better than or comparable to state-of-the-art methods while processing less than 10% of the WSI at the highest magnification and reducing the time required to infer the WSI-level label by more than 75%. The code is available on GitHub.
Circumpapillary OCT-based multi-sector analysis of retinal layer thickness in patients with glaucoma and high myopia
Mateo Gende, Joaquim de Moura, Patricia Robles, Jose Fernández-Vigo, José M. Martínez-de-la-Casa, GlaucoClub AI, Julián García-Feijóo, Jorge Novo, Marcos Ortega
Computerized Medical Imaging and Graphics, vol. 118, Article 102464. DOI: 10.1016/j.compmedimag.2024.102464. Published 2024-11-19.
Abstract: Glaucoma is the leading cause of irreversible blindness worldwide. Its diagnosis involves measuring the thickness of the retinal layers in order to track their degeneration. The elongated shape of highly myopic eyes can hinder this process, since it affects the OCT scanning geometry, producing deformations that can mimic or mask the degeneration caused by glaucoma. In this work, we present the first comprehensive cross-disease analysis focused on the anatomical structures most affected in patients with glaucoma and high myopia, facilitating precise differential diagnosis from patients affected by myopia alone. To achieve this, a fully automatic approach for retinal layer segmentation was specifically tailored to the accurate measurement of retinal thickness in both highly myopic and emmetropic eyes. To the best of our knowledge, this is the first approach for the analysis of retinal layers in circumpapillary optical coherence tomography images that takes into account the elongation of myopic eyes, thus addressing a critical diagnostic need. The results indicate that the temporal superior (mean difference 11.1 μm, p < 0.05), nasal inferior (13.1 μm, p < 0.01) and temporal inferior (13.3 μm, p < 0.01) sectors of the retinal nerve fibre layer show the most significant reduction in retinal thickness in patients with both glaucoma and myopia compared with patients with myopia alone.
CIS-UNet: Multi-class segmentation of the aorta in computed tomography angiography via context-aware shifted window self-attention
Muhammad Imran, Jonathan R. Krebs, Veera Rajasekhar Reddy Gopu, Brian Fazzone, Vishal Balaji Sivaraman, Amarjeet Kumar, Chelsea Viscardi, Robert Evans Heithaus, Benjamin Shickel, Yuyin Zhou, Michol A. Cooper, Wei Shao
Computerized Medical Imaging and Graphics, vol. 118, Article 102470. DOI: 10.1016/j.compmedimag.2024.102470. Published 2024-11-19.
Abstract: Advancements in medical imaging and endovascular grafting have facilitated minimally invasive treatments for aortic diseases. Accurate 3D segmentation of the aorta and its branches is crucial for interventions, as inaccurate segmentation can lead to erroneous surgical planning and endograft construction. Previous methods simplified aortic segmentation as a binary image segmentation problem, overlooking the need to distinguish individual aortic branches. In this paper, we introduce Context-Infused Swin-UNet (CIS-UNet), a deep learning model designed for multi-class segmentation of the aorta and thirteen aortic branches. Combining the strengths of convolutional neural networks (CNNs) and Swin transformers, CIS-UNet adopts a hierarchical encoder–decoder structure comprising a CNN encoder, a symmetric decoder, skip connections, and a novel Context-aware Shifted Window Self-Attention (CSW-SA) module as the bottleneck block. Notably, CSW-SA introduces a unique adaptation of the patch merging layer, distinct from its traditional use in Swin transformers. CSW-SA efficiently condenses the feature map to provide global spatial context, and it enhances performance when applied at the bottleneck layer, offering superior computational efficiency and segmentation accuracy compared with Swin transformers. We evaluated our model on computed tomography (CT) scans from 59 patients through 4-fold cross-validation. CIS-UNet outperformed the state-of-the-art Swin UNetR segmentation model, achieving a superior mean Dice coefficient of 0.732 versus 0.717 and a mean surface distance of 2.40 mm versus 2.75 mm. CIS-UNet's superior 3D aortic segmentation offers improved accuracy and optimization for planning endovascular treatments. Our dataset and code will be made publicly available at https://github.com/mirthAI/CIS-UNet.
Cervical OCT image classification using contrastive masked autoencoders with Swin Transformer
Qingbin Wang, Yuxuan Xiong, Hanfeng Zhu, Xuefeng Mu, Yan Zhang, Yutao Ma
Computerized Medical Imaging and Graphics, vol. 118, Article 102469. DOI: 10.1016/j.compmedimag.2024.102469. Published 2024-11-19.
Abstract:
Background and Objective: Cervical cancer poses a major health threat to women globally. Optical coherence tomography (OCT) imaging has recently shown promise for non-invasive cervical lesion diagnosis. However, obtaining high-quality labeled cervical OCT images is challenging and time-consuming because they must correspond precisely with pathological results. The scarcity of such high-quality labeled data hinders the application of supervised deep learning models in practical clinical settings. This study addresses this challenge by proposing CMSwin, a novel self-supervised learning (SSL) framework that combines masked image modeling (MIM) with contrastive learning based on the Swin Transformer architecture to exploit abundant unlabeled cervical OCT images.
Methods: In this contrastive-MIM framework, mixed image encoding is combined with a latent contextual regressor to resolve the inconsistency between pre-training and fine-tuning and to separate the encoder's feature extraction task from the decoder's reconstruction task, allowing the encoder to extract better image representations. In addition, contrastive losses at the patch and image levels are carefully designed to leverage massive unlabeled data.
Results: We validated the superiority of CMSwin over state-of-the-art SSL approaches with five-fold cross-validation on an OCT image dataset containing 1,452 patients from a multi-center clinical study in China, plus two external validation sets from top-ranked Chinese hospitals: the Huaxi dataset from the West China Hospital of Sichuan University and the Xiangya dataset from the Xiangya Second Hospital of Central South University. A human–machine comparison experiment on the Huaxi and Xiangya datasets for volume-level binary classification also indicates that CMSwin can match or exceed the average level of four skilled medical experts, especially in identifying high-risk cervical lesions.
Conclusion: Our work has great potential to assist gynecologists in intelligently interpreting cervical OCT images in clinical settings. In addition, the integrated GradCAM module of CMSwin enables cervical lesion visualization and interpretation, helping gynecologists diagnose cervical diseases efficiently.