Cross-view discrepancy-dependency network for volumetric medical image segmentation
Shengzhou Zhong, Wenxu Wang, Qianjin Feng, Yu Zhang, Zhenyuan Ning
Medical Image Analysis, vol. 99, Article 103329 (published 2024-08-30). DOI: 10.1016/j.media.2024.103329

Abstract: Limited data pose a crucial challenge for deep learning-based volumetric medical image segmentation, and many methods try to represent the volume by its subvolumes (i.e., multi-view slices) to alleviate this issue. However, such methods generally sacrifice inter-slice spatial continuity. A promising alternative is to incorporate multi-view information into the network to enhance volume representation learning, but most existing studies overlook the discrepancy and dependency across different views, ultimately limiting the potential of multi-view representations. To this end, we propose a cross-view discrepancy-dependency network (CvDd-Net) for volumetric medical image segmentation, which exploits multi-view slice priors to assist volume representation learning and explores view discrepancy and view dependency for performance improvement. Specifically, we develop a discrepancy-aware morphology reinforcement (DaMR) module to effectively learn view-specific representations by mining morphological information (i.e., boundary and position of the object). Besides, we design a dependency-aware information aggregation (DaIA) module to adequately harness the multi-view slice prior, enhancing individual view representations of the volume and integrating them based on cross-view dependency. Extensive experiments on four medical image datasets (i.e., Thyroid, Cervix, Pancreas, and Glioma) demonstrate the efficacy of the proposed method on both fully-supervised and semi-supervised tasks.
Learning lifespan brain anatomical correspondence via cortical developmental continuity transfer
Lu Zhang, Zhengwang Wu, Xiaowei Yu, Yanjun Lyu, Zihao Wu, Haixing Dai, Lin Zhao, Li Wang, Gang Li, Xianqiao Wang, Tianming Liu, Dajiang Zhu
Medical Image Analysis, vol. 99, Article 103328 (published 2024-08-30). DOI: 10.1016/j.media.2024.103328

Abstract: Identifying anatomical correspondences in the human brain throughout the lifespan is an essential prerequisite for studying brain development and aging. However, given the tremendous individual variability in cortical folding patterns, the heterogeneity of different neurodevelopmental stages, and the scarcity of neuroimaging data, it is difficult to infer reliable lifespan anatomical correspondence at finer scales. To solve this problem, we take advantage of the developmental continuity of the cerebral cortex and propose a novel transfer learning strategy: the model is trained from scratch on the age group with the largest sample size and is then transferred and adapted to the other groups following the cortical developmental trajectory. A novel loss function is designed to ensure that, during the transfer process, common patterns are extracted and preserved while group-specific new patterns are captured. The proposed framework was evaluated using multiple datasets covering four lifespan age groups with 1,000+ brains (from 34 gestational weeks to young adulthood). Our experimental results show that: (1) the proposed transfer strategy can dramatically improve model performance on populations (e.g., early neurodevelopment) with a very limited number of training samples; and (2) with transfer learning we are able to robustly infer the complicated many-to-many anatomical correspondences among different brains at different neurodevelopmental stages. Code will be released soon: https://github.com/qidianzl/CDC-transfer.
O-PRESS: Boosting OCT axial resolution with Prior guidance, Recurrence, and Equivariant Self-Supervision
Kaiyan Li, Jingyuan Yang, Wenxuan Liang, Xingde Li, Chenxi Zhang, Lulu Chen, Chan Wu, Xiao Zhang, Zhiyan Xu, Yueling Wang, Lihui Meng, Yue Zhang, Youxin Chen, S. Kevin Zhou
Medical Image Analysis, vol. 99, Article 103319 (published 2024-08-28). DOI: 10.1016/j.media.2024.103319

Abstract: Optical coherence tomography (OCT) is a noninvasive technology that enables real-time imaging of tissue microanatomy. The axial resolution of OCT is intrinsically constrained by the spectral bandwidth of the employed light source while maintaining a fixed center wavelength for a specific application. Physically extending this bandwidth faces strong limitations and incurs substantial cost. We present a novel computational approach, called O-PRESS, for boosting the axial resolution of OCT with Prior guidance, a Recurrent mechanism, and Equivariant Self-Supervision. Diverging from conventional deconvolution methods that rely on physical models or data-driven techniques, our method seamlessly integrates OCT modeling and deep learning, enabling real-time axial-resolution enhancement exclusively from measurements without the need for paired images. Our approach solves the two primary tasks of resolution enhancement and noise reduction in one treatment. Both tasks are executed in a self-supervised manner, with equivariant imaging and free-space priors guiding their respective processes. Experimental evaluations, encompassing both quantitative metrics and visual assessments, consistently verify the efficacy and superiority of our approach, which exhibits performance on par with fully supervised methods. Importantly, the robustness of our model is affirmed, showcasing its dual capability to enhance axial resolution while concurrently improving the signal-to-noise ratio.
Domain adaptive noise reduction with iterative knowledge transfer and style generalization learning
Yufei Tang, Tianling Lyu, Haoyang Jin, Qiang Du, Jiping Wang, Yunxiang Li, Ming Li, Yang Chen, Jian Zheng
Medical Image Analysis, vol. 98, Article 103327 (published 2024-08-24). DOI: 10.1016/j.media.2024.103327

Abstract: Low-dose computed tomography (LDCT) denoising faces significant challenges in practical imaging scenarios. Supervised methods encounter difficulties in real-world settings because no paired data are available for training; moreover, when applied to datasets with varying noise patterns, they may suffer decreased performance owing to the domain gap. Conversely, unsupervised methods do not require paired data and can be trained directly on real-world data, but they often perform worse than supervised methods. To address this issue, it is necessary to leverage the strengths of both. In this paper, we propose a novel domain adaptive noise reduction framework (DANRF), which integrates knowledge transfer and style generalization learning to effectively tackle the domain gap problem. Specifically, an iterative knowledge transfer method with knowledge distillation is used to train the target model using unlabeled target data and a pre-trained source model trained with paired simulation data. Meanwhile, we introduce the mean teacher mechanism to update the source model, enabling it to adapt to the target domain. Furthermore, an iterative style generalization learning process is designed to enrich the style diversity of the training dataset. We evaluate the performance of our approach through experiments conducted on multi-source datasets. The results demonstrate the feasibility and effectiveness of the proposed DANRF model in multi-source LDCT image processing tasks. Given its hybrid nature, which combines the advantages of supervised and unsupervised learning, and its ability to bridge domain gaps, our approach is well suited for improving practical low-dose CT imaging in clinical settings. Code for our proposed approach is publicly available at https://github.com/tyfeiii/DANRF.
VSmTrans: A hybrid paradigm integrating self-attention and convolution for 3D medical image segmentation
Tiange Liu, Qingze Bai, Drew A. Torigian, Yubing Tong, Jayaram K. Udupa
Medical Image Analysis, vol. 98, Article 103295 (published 2024-08-24). DOI: 10.1016/j.media.2024.103295

Abstract:
Purpose: Vision Transformers recently achieved competitive performance compared with CNNs due to their excellent capability of learning global representations. However, two major challenges arise when applying them to 3D image segmentation: (i) because of the large size of 3D medical images, comprehensive global information is hard to capture owing to the enormous computational cost; (ii) insufficient local inductive bias in Transformers affects the ability to segment detailed features such as ambiguous and subtly defined boundaries. These challenges must be adequately overcome to apply the Vision Transformer mechanism to medical image segmentation.
Methods: We propose a hybrid paradigm, called Variable-Shape Mixed Transformer (VSmTrans), that integrates self-attention and convolution and thus enjoys both the free learning of complex relationships from self-attention and the local prior knowledge of convolution. Specifically, we designed a Variable-Shape self-attention mechanism, which can rapidly expand the receptive field without extra computational cost and achieves a good trade-off between global awareness and local detail. In addition, a parallel convolution paradigm introduces strong local inductive bias to facilitate the excavation of details. Meanwhile, a pair of learnable parameters automatically adjusts the importance of the two paradigms. Extensive experiments were conducted on two public medical image datasets with different modalities: the AMOS CT dataset and the BraTS2021 MRI dataset.
Results: Our method achieves the best average Dice scores of 88.3% and 89.7% on these datasets, which are superior to the previous state-of-the-art Swin Transformer-based and CNN-based architectures. A series of ablation experiments was also conducted to verify the efficiency of the proposed hybrid mechanism and its components and to explore the effectiveness of the key parameters in VSmTrans.
Conclusions: The proposed hybrid Transformer-based backbone network for 3D medical image segmentation tightly integrates self-attention and convolution to exploit the advantages of the two paradigms. The experimental results demonstrate our method's superiority over other state-of-the-art methods, suggesting that the hybrid paradigm is well suited to the medical image segmentation field. The ablation experiments also show that the proposed hybrid mechanism effectively balances large receptive fields with local inductive biases, yielding highly accurate segmentation results, especially in capturing details. Our code is available at https://github.com/qingze-bai/VSmTrans.
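The "pair of learnable parameters" balancing the two paradigms can be pictured as two trainable scalars weighting an attention branch and a convolution branch. The toy block below illustrates only that mixing idea; the Variable-Shape attention itself is not reproduced, and the layer shapes are assumptions.

```python
import torch
import torch.nn as nn

class HybridMixBlock(nn.Module):
    """Toy hybrid block: a self-attention branch and a convolution branch whose
    contributions are balanced by two learnable scalars (illustrative only)."""
    def __init__(self, channels, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.conv = nn.Conv3d(channels, channels, kernel_size=3, padding=1)
        self.alpha = nn.Parameter(torch.tensor(1.0))   # weight of the attention branch
        self.beta = nn.Parameter(torch.tensor(1.0))    # weight of the convolution branch

    def forward(self, x):                              # x: (B, C, D, H, W)
        b, c, d, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)          # (B, D*H*W, C)
        attn_out, _ = self.attn(tokens, tokens, tokens)
        attn_out = attn_out.transpose(1, 2).reshape(b, c, d, h, w)
        conv_out = self.conv(x)
        return self.alpha * attn_out + self.beta * conv_out

x = torch.randn(1, 32, 8, 16, 16)
print(HybridMixBlock(32)(x).shape)                     # torch.Size([1, 32, 8, 16, 16])
```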
Metadata-conditioned generative models to synthesize anatomically-plausible 3D brain MRIs
Wei Peng, Tomas Bosschieter, Jiahong Ouyang, Robert Paul, Edith V. Sullivan, Adolf Pfefferbaum, Ehsan Adeli, Qingyu Zhao, Kilian M. Pohl
Medical Image Analysis, vol. 98, Article 103325 (published 2024-08-24). DOI: 10.1016/j.media.2024.103325

Abstract: Recent advances in generative models have paved the way for enhanced generation of natural and medical images, including synthetic brain MRIs. However, current AI research mainly optimizes synthetic MRIs with respect to visual quality (such as signal-to-noise ratio) while lacking insight into their relevance to neuroscience. To generate high-quality T1-weighted MRIs relevant to neuroscience discovery, we present a two-stage Diffusion Probabilistic Model (called BrainSynth) that synthesizes high-resolution MRIs conditioned on metadata (such as age and sex). We then propose a novel procedure to assess the quality of BrainSynth according to how well its synthetic MRIs capture macrostructural properties of brain regions and how accurately they encode the effects of age and sex. Results indicate that more than half of the brain regions in our synthetic MRIs are anatomically plausible, i.e., the effect size between real and synthetic MRIs is small relative to biological factors such as age and sex. Moreover, anatomical plausibility varies across cortical regions according to their geometric complexity. Even in their current form, the MRIs generated by BrainSynth significantly improve the training of a predictive model to identify accelerated aging effects in an independent study. These results indicate that our model accurately captures the brain's anatomical information and could thus enrich the data of underrepresented samples in a study. The code of BrainSynth will be released as part of the MONAI project at https://github.com/Project-MONAI/GenerativeModels.
Establishing group-level brain structural connectivity incorporating anatomical knowledge under latent space modeling
Selena Wang, Yiting Wang, Frederick H. Xu, Li Shen, Yize Zhao, Alzheimer's Disease Neuroimaging Initiative
Medical Image Analysis, vol. 99, Article 103309 (published 2024-08-23). DOI: 10.1016/j.media.2024.103309

Abstract: Brain structural connectivity, capturing the white-matter fiber tracts among brain regions inferred by diffusion MRI (dMRI), provides a unique characterization of brain anatomical organization. One fundamental question to address with structural connectivity is how to properly summarize and perform statistical inference for a group-level connectivity architecture, for instance, across sex groups or disease cohorts. Existing analyses commonly summarize group-level brain connectivity by a simple entry-wise sample mean or median across individual brain connectivity matrices. However, such a heuristic approach fully ignores the associations among structural connections and the topological properties of brain networks. In this work, we propose a latent space-based generative network model to estimate group-level brain connectivity. Within our modeling framework, we incorporate the anatomical information of brain regions as node attributes to enhance the plausibility of our estimation and improve biological interpretation. We name our method the attributes-informed brain connectivity (ABC) model, which, compared with existing group-level connectivity estimations, (1) offers an interpretable latent space representation of the group-level connectivity, (2) incorporates the anatomical knowledge of nodes and tests its co-varying relationship with connectivity, and (3) quantifies the uncertainty and evaluates the likelihood of the estimated group-level effects against chance. We devise a novel Bayesian MCMC algorithm to estimate the model and evaluate its performance through extensive simulations. By applying the ABC model to study brain structural connectivity stratified by sex among Alzheimer's Disease (AD) subjects and healthy controls, incorporating the anatomical attributes (volume, thickness, and area) of nodes, our method shows superior predictive power on out-of-sample structural connectivity and identifies meaningful sex-specific network neuromarkers for AD.
Vessel-promoted OCT to OCTA image translation by heuristic contextual constraints
Shuhan Li, Dong Zhang, Xiaomeng Li, Chubin Ou, Lin An, Yanwu Xu, Weihua Yang, Yanchun Zhang, Kwang-Ting Cheng
Medical Image Analysis, vol. 98, Article 103311 (published 2024-08-23). DOI: 10.1016/j.media.2024.103311

Abstract: Optical Coherence Tomography Angiography (OCTA) is a crucial tool in the clinical screening of retinal diseases, allowing accurate 3D imaging of blood vessels through non-invasive scanning. However, the hardware-based approach to acquiring OCTA images is challenging because it requires specialized sensors and expensive devices. In this paper, we introduce a novel method called TransPro, which translates readily available 3D Optical Coherence Tomography (OCT) images into 3D OCTA images without requiring any additional hardware modifications. TransPro is primarily driven by two novel ideas that have been overlooked by prior work. The first is derived from the critical observation that the OCTA projection map is generated by averaging pixel values from its corresponding B-scans along the Z-axis. Hence, we introduce a hybrid architecture incorporating a 3D adversarial generative network and a novel Heuristic Contextual Guidance (HCG) module, which effectively maintains the consistency of the generated OCTA images between 3D volumes and projection maps. The second is to improve vessel quality in the translated OCTA projection maps. To this end, we propose a novel Vessel Promoted Guidance (VPG) module to enhance the network's attention to retinal vessels. Experimental results on two datasets demonstrate that TransPro outperforms state-of-the-art approaches, with relative improvements of around 11.4% in MAE, 2.7% in PSNR, 2% in SSIM, 40% in VDE, and 9.1% in VDC compared with the baseline method. The code is available at https://github.com/ustlsh/TransPro.
{"title":"3DSAM-adapter: Holistic adaptation of SAM from 2D to 3D for promptable tumor segmentation","authors":"Shizhan Gong, Yuan Zhong, Wenao Ma, Jinpeng Li, Zhao Wang, Jingyang Zhang, Pheng-Ann Heng, Qi Dou","doi":"10.1016/j.media.2024.103324","DOIUrl":"10.1016/j.media.2024.103324","url":null,"abstract":"<div><p>Despite that the segment anything model (SAM) achieved impressive results on general-purpose semantic segmentation with strong generalization ability on daily images, its demonstrated performance on medical image segmentation is less precise and unstable, especially when dealing with tumor segmentation tasks that involve objects of small sizes, irregular shapes, and low contrast. Notably, the original SAM architecture is designed for 2D natural images and, therefore would not be able to extract the 3D spatial information from volumetric medical data effectively. In this paper, we propose a novel adaptation method for transferring SAM from 2D to 3D for promptable medical image segmentation. Through a holistically designed scheme for architecture modification, we transfer the SAM to support volumetric inputs while retaining the majority of its pre-trained parameters for reuse. The fine-tuning process is conducted in a parameter-efficient manner, wherein most of the pre-trained parameters remain frozen, and only a few lightweight spatial adapters are introduced and tuned. Regardless of the domain gap between natural and medical data and the disparity in the spatial arrangement between 2D and 3D, the transformer trained on natural images can effectively capture the spatial patterns present in volumetric medical images with only lightweight adaptations. We conduct experiments on four open-source tumor segmentation datasets, and with a single click prompt, our model can outperform domain state-of-the-art medical image segmentation models and interactive segmentation models. We also compared our adaptation method with existing popular adapters and observed significant performance improvement on most datasets. Our code and models are available at: <span><span>https://github.com/med-air/3DSAM-adapter</span><svg><path></path></svg></span></p></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"98 ","pages":"Article 103324"},"PeriodicalIF":10.7,"publicationDate":"2024-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142097787","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Adaptive dynamic inference for few-shot left atrium segmentation
Jun Chen, Xuejiao Li, Heye Zhang, Yongwon Cho, Sung Ho Hwang, Zhifan Gao, Guang Yang
Medical Image Analysis, vol. 98, Article 103321 (published 2024-08-23). DOI: 10.1016/j.media.2024.103321

Abstract: Accurate segmentation of the left atrium (LA) from late gadolinium-enhanced cardiac magnetic resonance (LGE CMR) images is crucial for aiding the treatment of patients with atrial fibrillation. Few-shot learning holds significant potential for achieving accurate LA segmentation with low demand for high-cost labeled LGE CMR data and fast generalization across different centers. However, accurate LA segmentation with few-shot learning is challenging due to the low-intensity contrast between the LA and neighboring organs in LGE CMR images. To address this issue, we propose an Adaptive Dynamic Inference Network (ADINet) that explicitly models the differences between foreground and background. Specifically, ADINet leverages dynamic collaborative inference (DCI) and dynamic reverse inference (DRI) to adaptively allocate semantic-aware and spatial-specific convolution weights and indication information. These allocations are conditioned on the support foreground and background knowledge, utilizing pixel-wise correlations, for different spatial positions of query images. The convolution weights adapt to different visual patterns based on spatial position, enabling effective encoding of the differences between foreground and background regions. Meanwhile, the indication information adapts to the background visual pattern to reversely decode foreground LA regions, leveraging their spatial complementarity. To promote the learning of ADINet, we propose hierarchical supervision, which enforces spatial consistency and differences between background and foreground regions through pixel-wise semantic supervision and pixel-pixel correlation supervision. We demonstrate the performance of ADINet on three LGE CMR datasets from different centers. Compared with state-of-the-art methods using ten available samples, ADINet yields better segmentation performance in terms of four metrics.