{"title":"Knowledge-driven multi-graph convolutional network for brain network analysis and potential biomarker discovery","authors":"","doi":"10.1016/j.media.2024.103368","DOIUrl":"10.1016/j.media.2024.103368","url":null,"abstract":"<div><div>In brain network analysis, individual-level data can provide biological features of individuals, while population-level data can provide demographic information of populations. However, existing methods mostly utilize either individual- or population-level features separately, inevitably neglecting the multi-level characteristics of brain disorders. To address this issue, we propose an end-to-end multi-graph neural network model called KMGCN. This model simultaneously leverages individual- and population-level features for brain network analysis. At the individual level, we construct a multi-graph using both knowledge-driven and data-driven approaches. Knowledge-driven refers to constructing a knowledge graph based on prior knowledge, while data-driven involves learning a data graph from the data itself. At the population level, we construct a multi-graph using both imaging and phenotypic data. Additionally, we devise a pooling method tailored to brain networks, capable of selecting brain regions that impact brain disorders. We evaluate the performance of our model on two large datasets, ADNI and ABIDE, and experimental results demonstrate that it achieves state-of-the-art performance, with 86.87% classification accuracy on ADNI and 86.40% on ABIDE, along with improvements of around 10% across all evaluation metrics over previous state-of-the-art models. Additionally, the biomarkers identified by our model align well with recent neuroscience research, indicating the effectiveness of our model in brain network analysis and potential biomarker discovery. 
The code is available at <span><span>https://github.com/GN-gjh/KMGCN</span></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":null,"pages":null},"PeriodicalIF":10.7,"publicationDate":"2024-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142442815","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
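The record above describes KMGCN only at a high level; its actual architecture lives in the linked repository. As a generic illustration of the graph convolution such models build on, here is a minimal NumPy sketch of the standard symmetrically-normalized propagation rule (the toy adjacency matrix, feature sizes, and weights are illustrative, not taken from KMGCN):

```python
import numpy as np

def gcn_layer(adj, features, weights):
    """One graph-convolution layer: H' = ReLU(D^-1/2 (A+I) D^-1/2 H W)."""
    a_hat = adj + np.eye(adj.shape[0])            # add self-loops
    deg = a_hat.sum(axis=1)                       # node degrees
    d_inv_sqrt = np.diag(deg ** -0.5)             # D^-1/2
    a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt      # symmetric normalization
    return np.maximum(a_norm @ features @ weights, 0.0)  # ReLU

# toy example: 4 brain regions in a chain, 3-dim node features, 2 output channels
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
features = rng.normal(size=(4, 3))
weights = rng.normal(size=(3, 2))
out = gcn_layer(adj, features, weights)
print(out.shape)  # (4, 2)
```

In a brain-network setting, `adj` would typically be a functional-connectivity matrix over regions of interest rather than a hand-written chain.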
{"title":"RFMiD: Retinal Image Analysis for multi-Disease Detection challenge","authors":"","doi":"10.1016/j.media.2024.103365","DOIUrl":"10.1016/j.media.2024.103365","url":null,"abstract":"<div><div>In recent decades, many publicly available large fundus image datasets have been collected for diabetic retinopathy, glaucoma, age-related macular degeneration, and a few other frequent pathologies. These publicly available datasets were used to develop computer-aided disease diagnosis systems by training deep learning models to detect these frequent pathologies. One challenge limiting the adoption of such systems by ophthalmologists is that they ignore sight-threatening rare pathologies, such as central retinal artery occlusion or anterior ischemic optic neuropathy, that ophthalmologists currently detect. Aiming to advance the state-of-the-art in automatic ocular disease classification of frequent diseases along with rare pathologies, a grand challenge on “Retinal Image Analysis for multi-Disease Detection” was organized in conjunction with the IEEE International Symposium on Biomedical Imaging (ISBI - 2021). This paper reports the challenge organization, dataset, top-performing participants’ solutions, evaluation measures, and results based on a new “Retinal Fundus Multi-disease Image Dataset” (RFMiD). There were two principal sub-challenges: disease screening (i.e. presence versus absence of pathology — a binary classification problem) and disease/pathology classification (a 28-class multi-label classification problem). The challenge received a positive response from the scientific community, with 74 submissions by individuals/teams that effectively entered the challenge. The top-performing methodologies utilized a blend of data pre-processing, data augmentation, pre-trained models, and model ensembling. 
This multi-disease (frequent and rare pathologies) detection will enable the development of generalizable models for screening the retina, unlike the previous efforts that focused on the detection of specific diseases.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":null,"pages":null},"PeriodicalIF":10.7,"publicationDate":"2024-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142416675","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
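The record above notes that top-performing RFMiD teams relied on model ensembling for the 28-class multi-label sub-challenge. A minimal sketch of one common ensembling scheme, averaging per-class probabilities before thresholding each label independently (the 0.5 threshold and array shapes are assumptions for illustration, not the challenge's official protocol):

```python
import numpy as np

def ensemble_predict(prob_maps, threshold=0.5):
    """Average per-class sigmoid probabilities from several models, then
    threshold each disease label independently (multi-label decision)."""
    mean_probs = np.mean(prob_maps, axis=0)   # (n_images, n_classes)
    return mean_probs, (mean_probs >= threshold).astype(int)

# two hypothetical models scoring 2 images over 28 disease labels
model_a = np.full((2, 28), 0.75)
model_b = np.full((2, 28), 0.25)
probs, labels = ensemble_predict([model_a, model_b])
print(probs[0, 0], labels[0, 0])  # 0.5 1  (at the threshold, so positive)
```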
{"title":"Dual structure-aware image filterings for semi-supervised medical image segmentation","authors":"","doi":"10.1016/j.media.2024.103364","DOIUrl":"10.1016/j.media.2024.103364","url":null,"abstract":"<div><div>Semi-supervised image segmentation has attracted great attention recently. The key is how to leverage unlabeled images in the training process. Most methods maintain consistent predictions for the unlabeled images under variations (<em>e.g.</em>, adding noise/perturbations, or creating alternative versions) at the image and/or model level. However, most image-level variations ignore the prior structural information that medical images often carry, which has not been well explored. In this paper, we propose novel dual structure-aware image filterings (DSAIF) as the image-level variations for semi-supervised medical image segmentation. Motivated by connected filtering, which simplifies an image via filtering on a structure-aware, tree-based image representation, we resort to the dual contrast-invariant Max-tree and Min-tree representations. Specifically, we propose a novel connected filtering that removes topologically equivalent nodes (<em>i.e.</em> connected components) having no siblings in the Max/Min-tree. This results in two filtered images that preserve the topologically critical structure. Applying the proposed DSAIF to mutually supervised networks decreases the consensus of their erroneous predictions on unlabeled images. This helps to alleviate the confirmation bias issue of overfitting to noisy pseudo labels of unlabeled images, and thus effectively improves the segmentation performance. Extensive experimental results on three benchmark datasets demonstrate that the proposed method significantly and consistently outperforms state-of-the-art methods. 
The source code will be made publicly available.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":null,"pages":null},"PeriodicalIF":10.7,"publicationDate":"2024-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142442818","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Texture-preserving diffusion model for CBCT-to-CT synthesis","authors":"","doi":"10.1016/j.media.2024.103362","DOIUrl":"10.1016/j.media.2024.103362","url":null,"abstract":"<div><div>Cone beam computed tomography (CBCT) serves as a vital imaging modality in diverse clinical applications, but is constrained by inherent limitations such as reduced image quality and increased noise. In contrast, computed tomography (CT) offers superior resolution and tissue contrast. Bridging the gap between these modalities through CBCT-to-CT synthesis becomes imperative. Deep learning techniques have enhanced this synthesis, yet challenges with generative adversarial networks persist. Denoising Diffusion Probabilistic Models have emerged as a promising alternative in image synthesis. In this study, we propose a novel texture-preserving diffusion model for CBCT-to-CT synthesis that incorporates adaptive high-frequency optimization and a dual-mode feature fusion module. Our method aims to enhance high-frequency details, effectively fuse cross-modality features, and preserve fine image structures. Extensive validation demonstrates superior performance over existing methods, showcasing better generalization. The proposed model offers a transformative pathway to augment diagnostic accuracy and refine treatment planning across various clinical settings. 
This work represents a pivotal step toward non-invasive, safer, and high-quality CBCT-to-CT synthesis, advancing personalized diagnostic imaging practices.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":null,"pages":null},"PeriodicalIF":10.7,"publicationDate":"2024-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142406521","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
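The texture-preserving model above builds on Denoising Diffusion Probabilistic Models, whose forward (noising) process has the closed form x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps. A minimal sketch of that step, using the linear beta schedule from the original DDPM paper rather than anything specific to this work (the patch size is a toy stand-in):

```python
import numpy as np

def forward_diffuse(x0, t, alpha_bar, rng):
    """Sample x_t ~ q(x_t | x_0) in closed form:
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise."""
    noise = rng.normal(size=x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise

# linear beta schedule over 1000 steps, as in the original DDPM paper
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)   # cumulative signal-retention factor

rng = np.random.default_rng(0)
x0 = rng.normal(size=(8, 8))          # stand-in for a CBCT slice patch
x_noisy = forward_diffuse(x0, t=999, alpha_bar=alpha_bar, rng=rng)
print(alpha_bar[-1] < 1e-4)  # True: by t=T the signal is almost pure noise
```

The reverse (denoising) network is what a CBCT-to-CT model would actually learn; this sketch only shows the fixed forward process it is trained against.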
{"title":"Corrigendum to \"Detection and analysis of cerebral aneurysms based on X-ray rotational angiography - the CADA 2020 challenge\" [Medical Image Analysis, April 2022, Volume 77, 102333].","authors":"Matthias Ivantsits, Leonid Goubergrits, Jan-Martin Kuhnigk, Markus Huellebrand, Jan Bruening, Tabea Kossen, Boris Pfahringer, Jens Schaller, Andreas Spuler, Titus Kuehne, Yizhuan Jia, Xuesong Li, Suprosanna Shit, Bjoern Menze, Ziyu Su, Jun Ma, Ziwei Nie, Kartik Jain, Yanfei Liu, Yi Lin, Anja Hennemuth","doi":"10.1016/j.media.2024.103363","DOIUrl":"https://doi.org/10.1016/j.media.2024.103363","url":null,"abstract":"","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":null,"pages":null},"PeriodicalIF":10.7,"publicationDate":"2024-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142400702","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient anatomical labeling of pulmonary tree structures via deep point-graph representation-based implicit fields","authors":"","doi":"10.1016/j.media.2024.103367","DOIUrl":"10.1016/j.media.2024.103367","url":null,"abstract":"<div><div>Pulmonary diseases rank prominently among the principal causes of death worldwide. Curing them will require, among other things, a better understanding of the complex 3D tree-shaped structures within the pulmonary system, such as airways, arteries, and veins. Traditional approaches using high-resolution image stacks and standard CNNs on dense voxel grids face challenges in computational efficiency and suffer from limited resolution, purely local context, and inadequate preservation of shape topology. Our method addresses these issues by shifting from dense voxel to sparse point representation, offering better memory efficiency and global context utilization. However, the inherent sparsity in point representation can lead to a loss of crucial connectivity in tree-shaped structures. To mitigate this, we introduce graph learning on skeletonized structures, incorporating differentiable feature fusion for improved topology and long-distance context capture. Furthermore, we employ an implicit function for efficient end-to-end conversion of sparse representations into dense reconstructions. The proposed method not only delivers state-of-the-art performance in labeling accuracy, both overall and at key locations, but also enables efficient inference and the generation of closed surface shapes. Addressing data scarcity in this field, we have also curated a comprehensive dataset to validate our approach. 
Data and code are available at <span><span>https://github.com/M3DV/pulmonary-tree-labeling</span></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":null,"pages":null},"PeriodicalIF":10.7,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142503532","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
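The dense-voxel-to-sparse-point shift described in the record above can be sketched in a few lines; this toy conversion is illustrative only and not taken from the pulmonary-tree-labeling repository:

```python
import numpy as np

def voxels_to_points(volume, threshold=0.5):
    """Convert a dense binary voxel grid into a sparse (N, 3) point set,
    keeping only foreground voxel coordinates (the dense-to-sparse shift).
    Normalizing coordinates to [0, 1] makes the cloud resolution-free."""
    coords = np.argwhere(volume > threshold).astype(float)
    return coords / np.array(volume.shape)   # normalize per axis

# toy airway-like volume: a single straight 'branch' of foreground voxels
vol = np.zeros((16, 16, 16))
vol[4:12, 8, 8] = 1.0
points = voxels_to_points(vol)
print(points.shape)  # (8, 3)
```

Memory scales with the number of foreground points rather than the full grid, which is the efficiency argument the abstract makes; connectivity, as the authors note, must then be recovered separately (they use graph learning on skeletons).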
{"title":"A survey on cell nuclei instance segmentation and classification: Leveraging context and attention","authors":"","doi":"10.1016/j.media.2024.103360","DOIUrl":"10.1016/j.media.2024.103360","url":null,"abstract":"<div><div>Nuclear-derived morphological features and biomarkers provide relevant insights regarding the tumour microenvironment, while also allowing diagnosis and prognosis in specific cancer types. However, manually annotating nuclei from gigapixel Haematoxylin and Eosin (H&E)-stained Whole Slide Images (WSIs) is a laborious and costly task. Automated algorithms for cell nuclei instance segmentation and classification could therefore alleviate the workload of pathologists and clinical researchers while facilitating the automatic extraction of clinically interpretable features for artificial intelligence (AI) tools. Yet due to the high intra- and inter-class variability of nuclear morphological and chromatic features, as well as the susceptibility of H&E stains to artefacts, state-of-the-art algorithms cannot yet correctly detect and classify instances with the necessary performance. In this work, we hypothesize that context and attention inductive biases in artificial neural networks (ANNs) could increase the performance and generalization of algorithms for cell nuclei instance segmentation and classification. To understand the advantages, use-cases, and limitations of context- and attention-based mechanisms in instance segmentation and classification, we start by reviewing works in computer vision and medical imaging. We then conduct a thorough survey of context and attention methods for cell nuclei instance segmentation and classification from H&E-stained microscopy imaging, while providing a comprehensive discussion of the challenges being tackled with context and attention. In addition, we illustrate some limitations of current approaches and present ideas for future research. 
As a case study, we extend both a general (Mask-RCNN) and a customized (HoVer-Net) instance segmentation and classification method with context- and attention-based mechanisms, and perform a comparative analysis on a multicentre dataset for colon nuclei identification and counting.</div><div>Although pathologists rely on context at multiple levels while paying attention to specific Regions of Interest (RoIs) when analysing and annotating WSIs, our findings suggest that translating this domain knowledge into algorithm design is no trivial task; to fully exploit these mechanisms in ANNs, the scientific understanding of these methods should first be deepened.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":null,"pages":null},"PeriodicalIF":10.7,"publicationDate":"2024-10-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142391704","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LoViT: Long Video Transformer for surgical phase recognition","authors":"","doi":"10.1016/j.media.2024.103366","DOIUrl":"10.1016/j.media.2024.103366","url":null,"abstract":"<div><div>Online surgical phase recognition plays a significant role in building contextual tools that could quantify performance and oversee the execution of surgical workflows. Current approaches are limited: they train spatial feature extractors with frame-level supervision, which can lead to incorrect predictions when similar frames appear in different phases, and they fuse local and global features poorly due to computational constraints, which hampers the analysis of the long videos commonly encountered in surgical interventions. In this paper, we present a two-stage method, called Long Video Transformer (LoViT), emphasizing the development of a temporally-rich spatial feature extractor and a phase transition map. The temporally-rich spatial feature extractor is designed to capture critical temporal information within the surgical video frames. The phase transition map provides essential insights into the dynamic transitions between different surgical phases. LoViT combines these innovations with a multiscale temporal aggregator consisting of two cascaded L-Trans modules based on self-attention, followed by a G-Informer module based on <em>ProbSparse</em> self-attention for processing global temporal information. The multiscale temporal head then leverages the temporally-rich spatial features and phase transition map to classify surgical phases using phase transition-aware supervision. Our approach consistently outperforms state-of-the-art methods on the Cholec80 and AutoLaparo datasets. Compared to Trans-SVNet, LoViT achieves a 2.4 pp (percentage point) improvement in video-level accuracy on Cholec80 and a 3.1 pp improvement on AutoLaparo. 
Our results demonstrate the effectiveness of our approach in achieving state-of-the-art surgical phase recognition performance on two datasets with different surgical procedures and temporal sequencing characteristics. The project page is available at <span><span>https://github.com/MRUIL/LoViT</span></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":null,"pages":null},"PeriodicalIF":10.7,"publicationDate":"2024-10-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142442817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
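LoViT's L-Trans modules are built on self-attention; the ProbSparse variant used by its G-Informer module is more involved, but the underlying scaled dot-product attention over per-frame features can be sketched as follows (feature sizes and weights are toy values, not LoViT's):

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    """Scaled dot-product self-attention over a sequence of frame features:
    softmax(Q K^T / sqrt(d)) V, applied along the temporal axis."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # each row sums to 1
    return weights @ v

rng = np.random.default_rng(0)
frames = rng.normal(size=(10, 16))   # 10 video frames, 16-dim features each
wq, wk, wv = (rng.normal(size=(16, 16)) for _ in range(3))
out = self_attention(frames, wq, wk, wv)
print(out.shape)  # (10, 16)
```

The quadratic cost of the full score matrix is exactly the constraint that motivates sparse approximations such as ProbSparse attention for long surgical videos.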
{"title":"A Foundation Language-Image Model of the Retina (FLAIR): encoding expert knowledge in text supervision","authors":"","doi":"10.1016/j.media.2024.103357","DOIUrl":"10.1016/j.media.2024.103357","url":null,"abstract":"<div><div>Foundation vision-language models are currently transforming computer vision and are on the rise in medical imaging, fueled by their promising generalization capabilities. However, initial attempts to transfer this new paradigm to medical imaging have shown less impressive performance than that observed in other domains, due to the significant domain shift and the complex, expert domain knowledge inherent to medical-imaging tasks. Motivated by the need for domain-expert foundation models, we present FLAIR, a pre-trained vision-language model for universal retinal fundus image understanding. To this end, we compiled 38 open-access, mostly categorical fundus imaging datasets from various sources, with up to 101 different target conditions and 288,307 images. We integrate expert domain knowledge in the form of descriptive textual prompts, during both pre-training and zero-shot inference, enhancing the less-informative categorical supervision of the data. This textual expert knowledge, compiled from the relevant clinical literature and community standards, describes the fine-grained features of the pathologies as well as the hierarchies and dependencies between them. We report comprehensive evaluations, which illustrate the benefit of integrating expert knowledge and the strong generalization capabilities of FLAIR under difficult scenarios with domain shifts or unseen categories. When adapted with a lightweight linear probe, FLAIR outperforms fully-trained, dataset-focused models, especially in the few-shot regime. 
Interestingly, FLAIR outperforms larger-scale generalist image-language models and retina-specific self-supervised networks by a wide margin, which emphasizes the potential of embedding expert domain knowledge and the limitations of generalist models in medical imaging. The pre-trained model is available at: <span><span>https://github.com/jusiro/FLAIR</span></span>.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":null,"pages":null},"PeriodicalIF":10.7,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142442816","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
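FLAIR performs zero-shot inference by matching an image embedding against text-prompt embeddings. A minimal CLIP-style sketch of that matching step, with entirely hypothetical embeddings and prompt labels (the real encoders and prompt templates are in the linked repository):

```python
import numpy as np

def zero_shot_classify(image_emb, prompt_embs):
    """Pick the class whose text-prompt embedding has the highest
    cosine similarity with the image embedding (CLIP-style inference)."""
    img = image_emb / np.linalg.norm(image_emb)
    txt = prompt_embs / np.linalg.norm(prompt_embs, axis=1, keepdims=True)
    return int(np.argmax(txt @ img))

# hypothetical embeddings: 3 condition prompts in a 4-dim space
prompts = np.array([[1.0, 0.0, 0.0, 0.0],   # e.g. "no diabetic retinopathy"
                    [0.0, 1.0, 0.0, 0.0],   # e.g. "mild diabetic retinopathy"
                    [0.0, 0.0, 1.0, 0.0]])  # e.g. "severe diabetic retinopathy"
image = np.array([0.1, 0.9, 0.2, 0.0])
print(zero_shot_classify(image, prompts))  # 1
```

Enriching the prompt text with descriptive expert knowledge, rather than bare category names, is the paper's central idea: it changes the text embeddings, not this matching step.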
{"title":"PViT-AIR: Puzzling vision transformer-based affine image registration for multi histopathology and faxitron images of breast tissue","authors":"","doi":"10.1016/j.media.2024.103356","DOIUrl":"10.1016/j.media.2024.103356","url":null,"abstract":"<div><div>Breast cancer is a significant global public health concern, with various treatment options available based on tumor characteristics. Pathological examination of excision specimens after surgery provides essential information for treatment decisions. However, the manual selection of representative sections for histological examination is laborious and subjective, leading to potential sampling errors and variability, especially in carcinomas that have been previously treated with chemotherapy. Furthermore, the accurate identification of residual tumors presents significant challenges, emphasizing the need for systematic or assisted methods to address this issue. To enable the development of deep-learning algorithms for automated cancer detection on radiology images, it is crucial to perform radiology-pathology registration, which establishes reliable, accurately labeled ground truth for training such algorithms. However, aligning radiology and histopathology images is challenging due to their differences in content and resolution, tissue deformation, artifacts, and imprecise correspondence. We present a novel deep learning-based pipeline for the affine registration of faxitron images, the x-ray representations of macrosections of ex-vivo breast tissue, and their corresponding histopathology images of tissue segments. The proposed model combines convolutional neural networks and vision transformers, allowing it to effectively capture both local and global information from the entire tissue macrosection as well as its segments. 
This integrated approach enables simultaneous registration and stitching of image segments, facilitating segment-to-macrosection registration through a puzzling-based mechanism. To address the limitations of multi-modal ground truth data, we tackle the problem by training the model using synthetic mono-modal data in a weakly supervised manner. The trained model demonstrated successful performance in multi-modal registration, yielding registration results with an average landmark error of 1.51 mm <span><math><mrow><mo>(</mo><mo>±</mo><mn>2</mn><mo>.</mo><mn>40</mn><mo>)</mo></mrow></math></span>, and stitching distance of 1.15 mm <span><math><mrow><mo>(</mo><mo>±</mo><mn>0</mn><mo>.</mo><mn>94</mn><mo>)</mo></mrow></math></span>. The results indicate that the model performs significantly better than existing baselines, including both deep learning-based and iterative models, and it is also approximately 200 times faster than the iterative approach. This work bridges the gap in the current research and clinical workflow and has the potential to improve efficiency and accuracy in breast cancer evaluation and streamline pathology workflow.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":null,"pages":null},"PeriodicalIF":10.7,"publicationDate":"2024-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142391706","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
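The landmark errors reported above come from an affine registration. As background, a 2-D affine transform can be fit to corresponding landmarks by least squares, and the mean landmark error evaluated as below; this is a generic sketch with synthetic points, not the PViT-AIR pipeline:

```python
import numpy as np

def fit_affine_2d(src, dst):
    """Least-squares 2-D affine transform mapping src landmarks to dst:
    solves [x y 1] @ M = [x' y'] for the 3x2 matrix M."""
    ones = np.ones((src.shape[0], 1))
    A = np.hstack([src, ones])
    M, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return M

def mean_landmark_error(src, dst, M):
    """Average Euclidean distance between transformed src and dst points."""
    ones = np.ones((src.shape[0], 1))
    pred = np.hstack([src, ones]) @ M
    return float(np.mean(np.linalg.norm(pred - dst, axis=1)))

# synthetic check: recover a known rotation + translation from 4 landmarks
theta = np.deg2rad(30)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
src = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
dst = src @ R.T + np.array([2.0, -1.0])
M = fit_affine_2d(src, dst)
print(round(mean_landmark_error(src, dst, M), 6))  # 0.0 (exact recovery)
```

A learned registration model replaces this closed-form fit when correspondences are unknown or multi-modal, which is precisely the setting the paper addresses; the error metric, however, is evaluated the same way.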