The registration of intraoperative x-ray images with preoperative computed tomography (CT) images is an important step in image-guided surgery. However, existing regression-based methods lack an interpretable and stable mechanism for fusing information from intraoperative x-ray images and preoperative CT volumes, and their feature extraction and fusion strategies limit the accuracy of pose regression.
The objective of this study is to develop a method that leverages both x-ray and CT images to rapidly and robustly estimate an accurate initial registration within a broad search space. The approach integrates the strengths of learning-based and traditional registration methods, yielding registration results over a wide search space at high speed.
We introduce a regression-based registration framework to address these issues. We constrain the feature fusion process by training the network to reconstruct, in the embedding space, the high-dimensional feature representation of the preoperative CT volume from the input single-view x-ray, thereby improving the interpretability of feature extraction. To promote effective fusion and better extraction of local texture features and global information, we propose a lightweight cross-attention mechanism named lite cross-attention (LCAT). Furthermore, to meet intraoperative accuracy requirements, we refine the regressed pose with the intensity-based covariance matrix adaptation evolution strategy (CMA-ES).
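To make the fusion step concrete, the following is a minimal NumPy sketch of a cross-attention operation in the spirit of LCAT: queries come from x-ray features, keys and values from CT features, and a single shared low-dimensional projection keeps the module lightweight. The projection matrices, dimensions, and function name are illustrative stand-ins for learned parameters, not the paper's actual architecture.

```python
import numpy as np

def lite_cross_attention(xray_feats, ct_feats, d_k=32, seed=0):
    """Hypothetical sketch of a lightweight cross-attention step.

    Queries are projected from x-ray features; keys and values share one
    low-dimensional projection of the CT features. Random weights stand in
    for learned parameters.
    """
    rng = np.random.default_rng(seed)
    d_x = xray_feats.shape[-1]
    d_c = ct_feats.shape[-1]
    w_q = rng.standard_normal((d_x, d_k)) / np.sqrt(d_x)
    w_kv = rng.standard_normal((d_c, d_k)) / np.sqrt(d_c)
    q = xray_feats @ w_q                       # (N_x, d_k)
    k = ct_feats @ w_kv                        # (N_c, d_k)
    v = ct_feats @ w_kv                        # values share the projection
    scores = q @ k.T / np.sqrt(d_k)            # scaled dot-product, (N_x, N_c)
    attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)   # row-wise softmax
    return attn @ v                            # fused features, (N_x, d_k)

fused = lite_cross_attention(np.ones((16, 64)), np.ones((100, 64)))
print(fused.shape)  # (16, 32)
```

Sharing one projection for keys and values halves the parameter count of a standard cross-attention block, which is one plausible reading of "lite" here.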
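The refinement stage can be illustrated with a simplified evolution-strategy loop around the regressed pose. This sketch uses an isotropic sampling distribution with geometric step-size decay rather than full covariance adaptation, so it is a stand-in for CMA-ES, not the method itself; the `similarity` callback represents a hypothetical intensity-based score (higher is better) between the CT projection at a candidate pose and the intraoperative x-ray.

```python
import numpy as np

def refine_pose(init_pose, similarity, iters=80, sigma0=1.0, popsize=16, seed=0):
    """Simplified evolution-strategy refinement around a regressed pose.

    A stand-in for CMA-ES: isotropic sampling, weighted recombination of
    the best-ranked half of the population, and geometric step-size decay.
    """
    rng = np.random.default_rng(seed)
    mean = np.asarray(init_pose, dtype=float)
    sigma = sigma0
    mu = popsize // 2
    # Log-rank recombination weights, as in standard (mu/mu_w, lambda)-ES.
    weights = np.log(mu + 0.5) - np.log(np.arange(1, mu + 1))
    weights /= weights.sum()
    for _ in range(iters):
        samples = mean + sigma * rng.standard_normal((popsize, mean.size))
        scores = np.array([similarity(s) for s in samples])
        elite = samples[np.argsort(-scores)[:mu]]  # best-ranked candidates
        mean = weights @ elite                     # weighted recombination
        sigma *= 0.97                              # simple geometric cooling
    return mean

# Toy objective: negative distance to a "true" 6-DoF pose (3 rot + 3 trans).
true_pose = np.array([1.0, -1.0, 0.5, 2.0, -1.5, 1.0])
est = refine_pose(np.zeros(6), lambda p: -np.linalg.norm(p - true_pose))
print(np.round(est, 2))
```

In practice the similarity would be computed between a digitally reconstructed radiograph rendered at each candidate pose and the acquired x-ray, which is why a derivative-free optimizer such as CMA-ES is a natural fit.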
Our approach is verified on both real and simulated x-ray data. Experimental results show that, compared with existing learning-based registration methods, our method achieves a median rotation error of 1.9° and a median translation error of 5.6 mm under a large search range. On 52 real x-ray images, the median rotation error is 1.6° and the median translation error is 3.8 mm, benefiting from the smaller search range. We also verify the contributions of the LCAT and embedding reconstruction modules in our registration framework: without them, registration performance degrades to approximately the level of random initialization.
In our experiments, the proposed method achieves higher accuracy and a larger capture range than existing methods on both simulated and real x-ray images. These encouraging results indicate the potential of our method for future clinical application.