{"title":"CASHNet: Context-Aware Semantics-driven Hierarchical Network for Hybrid Diffeomorphic CT-CBCT Image Registration.","authors":"Xiaoru Gao,Housheng Xie,Donghua Hang,Guoyan Zheng","doi":"10.1109/tmi.2025.3607700","DOIUrl":null,"url":null,"abstract":"Computed Tomography (CT) to Cone-Beam Computed Tomography (CBCT) image registration is crucial for image-guided radiotherapy and surgical procedures. However, achieving accurate CT-CBCT registration remains challenging due to various factors such as inconsistent intensities, low contrast resolution and imaging artifacts. In this study, we propose a Context-Aware Semantics-driven Hierarchical Network (referred to as CASHNet), which hierarchically integrates context-aware semantics-encoded features into a coarse-to-fine registration scheme, to explicitly enhance semantic structural perception during progressive alignment. Moreover, it leverages diffeomorphisms to integrate rigid and non-rigid registration within a single end-to-end trainable network, enabling anatomically plausible deformations and preserving topological consistency. CASHNet comprises a Siamese Mamba-based multi-scale feature encoder and a coarse-to-fine registration decoder, which integrates a Rigid Registration (RR) module with multiple Semantics-guided Velocity Estimation and Feature Alignment (SVEFA) modules operating at different resolutions. Each SVEFA module comprises three carefully designed components: i) a cross-resolution feature aggregation (CFA) component that synthesizes enhanced global contextual representations, ii) a semantics perception and encoding (SPE) component that captures and encodes local semantic information, and iii) an incremental velocity estimation and feature alignment (IVEFA) component that leverages contextual and semantic features to update velocity fields and to align features. These modules work synergistically to boost the overall registration performance. Extensive experiments on three typical yet challenging CT-CBCT datasets of both soft and hard tissues demonstrate the superiority of our proposed method over other state-of-the-art methods. The code will be publicly available at https://github.com/xiaorugao999/CASHNet.","PeriodicalId":13418,"journal":{"name":"IEEE Transactions on Medical Imaging","volume":"14 1","pages":""},"PeriodicalIF":9.8000,"publicationDate":"2025-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Medical Imaging","FirstCategoryId":"5","ListUrlMain":"https://doi.org/10.1109/tmi.2025.3607700","RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
Abstract
Computed Tomography (CT) to Cone-Beam Computed Tomography (CBCT) image registration is crucial for image-guided radiotherapy and surgical procedures. However, achieving accurate CT-CBCT registration remains challenging due to various factors such as inconsistent intensities, low contrast resolution and imaging artifacts. In this study, we propose a Context-Aware Semantics-driven Hierarchical Network (referred to as CASHNet), which hierarchically integrates context-aware semantics-encoded features into a coarse-to-fine registration scheme, to explicitly enhance semantic structural perception during progressive alignment. Moreover, it leverages diffeomorphisms to integrate rigid and non-rigid registration within a single end-to-end trainable network, enabling anatomically plausible deformations and preserving topological consistency. CASHNet comprises a Siamese Mamba-based multi-scale feature encoder and a coarse-to-fine registration decoder, which integrates a Rigid Registration (RR) module with multiple Semantics-guided Velocity Estimation and Feature Alignment (SVEFA) modules operating at different resolutions. Each SVEFA module comprises three carefully designed components: i) a cross-resolution feature aggregation (CFA) component that synthesizes enhanced global contextual representations, ii) a semantics perception and encoding (SPE) component that captures and encodes local semantic information, and iii) an incremental velocity estimation and feature alignment (IVEFA) component that leverages contextual and semantic features to update velocity fields and to align features. These modules work synergistically to boost the overall registration performance. Extensive experiments on three typical yet challenging CT-CBCT datasets of both soft and hard tissues demonstrate the superiority of our proposed method over other state-of-the-art methods. The code will be publicly available at https://github.com/xiaorugao999/CASHNet.
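The abstract invokes diffeomorphisms realized through velocity fields to keep deformations anatomically plausible and topology-preserving. A common way to implement this in learning-based registration is to integrate a stationary velocity field by scaling and squaring: the field is divided by 2^N and then self-composed N times, so the exponentiated map stays invertible. The PyTorch sketch below illustrates that step in 2D; it is a minimal sketch under that assumption, and the names `warp`, `integrate_velocity`, and the choice `num_steps=7` are illustrative, not taken from the paper's implementation.

```python
# Illustrative sketch of scaling-and-squaring velocity integration;
# not the paper's code.
import torch
import torch.nn.functional as F


def warp(image, flow):
    """Warp `image` with a dense displacement field `flow` (B, 2, H, W),
    given in voxel units, using bilinear grid sampling."""
    B, _, H, W = image.shape
    # Identity grid in voxel coordinates.
    ys, xs = torch.meshgrid(
        torch.arange(H, dtype=image.dtype),
        torch.arange(W, dtype=image.dtype),
        indexing="ij",
    )
    grid = torch.stack((xs, ys), dim=0).unsqueeze(0).to(image.device)  # (1, 2, H, W)
    coords = grid + flow  # displaced sampling locations, (B, 2, H, W)
    # Normalize coordinates to [-1, 1] as required by grid_sample.
    coords_x = 2.0 * coords[:, 0] / max(W - 1, 1) - 1.0
    coords_y = 2.0 * coords[:, 1] / max(H - 1, 1) - 1.0
    sample_grid = torch.stack((coords_x, coords_y), dim=-1)  # (B, H, W, 2)
    return F.grid_sample(image, sample_grid, align_corners=True)


def integrate_velocity(velocity, num_steps=7):
    """Approximate phi = exp(v) for a stationary velocity field `velocity`
    by scaling (v / 2^N) and N successive self-compositions."""
    flow = velocity / (2 ** num_steps)
    for _ in range(num_steps):
        # Composition phi = phi o phi: add the flow resampled at the
        # locations it displaces to.
        flow = flow + warp(flow, flow)
    return flow


# Usage: integrate a small random velocity field and warp a 2D image.
moving = torch.rand(1, 1, 64, 64)
velocity = 0.5 * torch.randn(1, 2, 64, 64)
flow = integrate_velocity(velocity)
warped = warp(moving, flow)
print(warped.shape)  # torch.Size([1, 1, 64, 64])
```

Because each of the `2^N` incremental steps is small, every composition stays close to the identity, which is what keeps the composed deformation smooth and invertible; this is the standard rationale for using velocity fields rather than predicting displacements directly.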
About the Journal
The IEEE Transactions on Medical Imaging (T-MI) is a journal that welcomes the submission of manuscripts focusing on various aspects of medical imaging. The journal encourages the exploration of body structure, morphology, and function through different imaging techniques, including ultrasound, X-rays, magnetic resonance, radionuclides, microwaves, and optical methods. It also promotes contributions related to cell and molecular imaging, as well as all forms of microscopy.
T-MI publishes original research papers that cover a wide range of topics, including but not limited to novel acquisition techniques, medical image processing and analysis, visualization and performance, pattern recognition, machine learning, and other related methods. The journal particularly encourages highly technical studies that offer new perspectives. By emphasizing the unification of medicine, biology, and imaging, T-MI seeks to bridge the gap between instrumentation, hardware, software, mathematics, physics, biology, and medicine by introducing new analysis methods.
While the journal welcomes strong application papers that describe novel methods, papers that focus solely on important applications using medically adopted or well-established methods, without significant methodological innovation, are directed to other journals. T-MI is indexed in PubMed® and MEDLINE®, which are products of the United States National Library of Medicine.