Tonglong Li, Minheng Chen, Mingying Li, Chuanyou Li, Youyong Kong
{"title":"基于嵌入重建和低交叉关注的x线与CT自动配准。","authors":"Tonglong Li, Minheng Chen, Mingying Li, Chuanyou Li, Youyong Kong","doi":"10.1002/mp.17896","DOIUrl":null,"url":null,"abstract":"<div>\n \n \n <section>\n \n <h3> Background</h3> \n <p>The registration of intraoperative x-ray images with preoperative CT images is an important step in image-guided surgery. However, existing regression-based methods lack an interpretable and stable mechanism when fusing information from intraoperative images and preoperative CT volumes. In addition, existing feature extraction and fusion methods limit the accuracy of pose regression.</p>\n </section>\n \n <section>\n \n <h3> Purpose</h3> \n <p>The objective of this study is to develop a method that leverages both x-ray and computed tomography (CT) images to rapidly and robustly estimate an accurate initial registration within a broad search space. This approach integrates the strengths of learning-based registration with those of traditional registration methodologies, enabling the acquisition of registration outcomes across a wide search space at an accelerated pace.</p>\n </section>\n \n <section>\n \n <h3> Methods</h3> \n <p>We introduce a regression-based registration framework to address the aforementioned issues. We constrain the feature fusion process by training the network to reconstruct the high-dimensional feature representation vector of the preoperative CT volume in the embedding space from the input single-view x-ray, thereby enhancing the interpretability of feature extraction. Also, in order to promote the effective fusion and better extraction of local texture features and global information, we propose a lightweight cross-attention mechanism named lite cross-attention(LCAT). Besides, to meet the intraoperative requirements, we employ the intensity-based registration method CMA-ES to refine the result of pose regression.</p>\n </section>\n \n <section>\n \n <h3> Results</h3> \n <p>Our approach is verified on both real and simulated x-ray data. Experimental results show that compared with the existing learning-based registration methods, the median rotation error of our method can reach 1.9<span></span><math>\n <semantics>\n <msup>\n <mrow></mrow>\n <mo>∘</mo>\n </msup>\n <annotation>$^\\circ$</annotation>\n </semantics></math> and the median translation error can reach 5.6 mm in the case of a large search range. When evaluated on 52 real x-ray images, we have a median rotation error of 1.6<span></span><math>\n <semantics>\n <msup>\n <mrow></mrow>\n <mo>∘</mo>\n </msup>\n <annotation>$^\\circ$</annotation>\n </semantics></math> and a median translation error of 3.8 mm due to the smaller search range. We also verify the role of the LCAT and embedding reconstruction modules in our registration framework. If they are not used, our registration performance will be reduced to approximately random initialization results.</p>\n </section>\n \n <section>\n \n <h3> Conclusions</h3> \n <p>During the experiments, our method demonstrates higher accuracy and larger capture range on both simulated images and real x-ray images compared to existing methods. 
The inspiring experimental results indicate the potential for future clinical application of our method.</p>\n </section>\n </div>","PeriodicalId":18384,"journal":{"name":"Medical physics","volume":"52 7","pages":""},"PeriodicalIF":3.2000,"publicationDate":"2025-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Automatic x-ray to CT registration using embedding reconstruction and lite cross-attention\",\"authors\":\"Tonglong Li, Minheng Chen, Mingying Li, Chuanyou Li, Youyong Kong\",\"doi\":\"10.1002/mp.17896\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div>\\n \\n \\n <section>\\n \\n <h3> Background</h3> \\n <p>The registration of intraoperative x-ray images with preoperative CT images is an important step in image-guided surgery. However, existing regression-based methods lack an interpretable and stable mechanism when fusing information from intraoperative images and preoperative CT volumes. In addition, existing feature extraction and fusion methods limit the accuracy of pose regression.</p>\\n </section>\\n \\n <section>\\n \\n <h3> Purpose</h3> \\n <p>The objective of this study is to develop a method that leverages both x-ray and computed tomography (CT) images to rapidly and robustly estimate an accurate initial registration within a broad search space. This approach integrates the strengths of learning-based registration with those of traditional registration methodologies, enabling the acquisition of registration outcomes across a wide search space at an accelerated pace.</p>\\n </section>\\n \\n <section>\\n \\n <h3> Methods</h3> \\n <p>We introduce a regression-based registration framework to address the aforementioned issues. We constrain the feature fusion process by training the network to reconstruct the high-dimensional feature representation vector of the preoperative CT volume in the embedding space from the input single-view x-ray, thereby enhancing the interpretability of feature extraction. Also, in order to promote the effective fusion and better extraction of local texture features and global information, we propose a lightweight cross-attention mechanism named lite cross-attention(LCAT). Besides, to meet the intraoperative requirements, we employ the intensity-based registration method CMA-ES to refine the result of pose regression.</p>\\n </section>\\n \\n <section>\\n \\n <h3> Results</h3> \\n <p>Our approach is verified on both real and simulated x-ray data. Experimental results show that compared with the existing learning-based registration methods, the median rotation error of our method can reach 1.9<span></span><math>\\n <semantics>\\n <msup>\\n <mrow></mrow>\\n <mo>∘</mo>\\n </msup>\\n <annotation>$^\\\\circ$</annotation>\\n </semantics></math> and the median translation error can reach 5.6 mm in the case of a large search range. When evaluated on 52 real x-ray images, we have a median rotation error of 1.6<span></span><math>\\n <semantics>\\n <msup>\\n <mrow></mrow>\\n <mo>∘</mo>\\n </msup>\\n <annotation>$^\\\\circ$</annotation>\\n </semantics></math> and a median translation error of 3.8 mm due to the smaller search range. We also verify the role of the LCAT and embedding reconstruction modules in our registration framework. 
If they are not used, our registration performance will be reduced to approximately random initialization results.</p>\\n </section>\\n \\n <section>\\n \\n <h3> Conclusions</h3> \\n <p>During the experiments, our method demonstrates higher accuracy and larger capture range on both simulated images and real x-ray images compared to existing methods. The inspiring experimental results indicate the potential for future clinical application of our method.</p>\\n </section>\\n </div>\",\"PeriodicalId\":18384,\"journal\":{\"name\":\"Medical physics\",\"volume\":\"52 7\",\"pages\":\"\"},\"PeriodicalIF\":3.2000,\"publicationDate\":\"2025-05-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Medical physics\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://onlinelibrary.wiley.com/doi/10.1002/mp.17896\",\"RegionNum\":2,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Medical physics","FirstCategoryId":"3","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/mp.17896","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING","Score":null,"Total":0}
Automatic x-ray to CT registration using embedding reconstruction and lite cross-attention
Background
The registration of intraoperative x-ray images with preoperative CT images is an important step in image-guided surgery. However, existing regression-based methods lack an interpretable and stable mechanism when fusing information from intraoperative images and preoperative CT volumes. In addition, existing feature extraction and fusion methods limit the accuracy of pose regression.
Purpose
The objective of this study is to develop a method that leverages both x-ray and computed tomography (CT) images to rapidly and robustly estimate an accurate initial registration within a broad search space. This approach combines the strengths of learning-based registration with those of traditional registration methodologies, enabling registration results to be obtained quickly over a wide search space.
Methods
We introduce a regression-based registration framework to address these issues. We constrain the feature fusion process by training the network to reconstruct, in the embedding space, the high-dimensional feature representation of the preoperative CT volume from the input single-view x-ray, thereby improving the interpretability of feature extraction. To promote effective fusion and better extraction of local texture features and global information, we propose a lightweight cross-attention mechanism, lite cross-attention (LCAT). Finally, to meet intraoperative requirements, we refine the regressed pose with intensity-based registration driven by the covariance matrix adaptation evolution strategy (CMA-ES).
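As a rough illustration of how the LCAT block and the embedding-reconstruction constraint could be realized, the following PyTorch sketch interprets "lite" as single-head cross-attention with a reduced query/key dimension and uses a cosine-distance reconstruction loss. The module design, tensor shapes, and loss choice are assumptions made for illustration, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LiteCrossAttention(nn.Module):
    """Single-head cross-attention with a reduced projection dimension.

    X-ray feature tokens (queries) attend to CT feature tokens (keys/values);
    the reduced q/k dimension keeps the block lightweight.
    """

    def __init__(self, dim: int, reduced_dim: int = 64):
        super().__init__()
        self.q = nn.Linear(dim, reduced_dim, bias=False)  # queries from x-ray tokens
        self.k = nn.Linear(dim, reduced_dim, bias=False)  # keys from CT tokens
        self.v = nn.Linear(dim, dim, bias=False)          # values from CT tokens
        self.scale = reduced_dim ** -0.5

    def forward(self, xray_tokens: torch.Tensor, ct_tokens: torch.Tensor) -> torch.Tensor:
        # xray_tokens: (B, Nx, dim); ct_tokens: (B, Nc, dim)
        attn = torch.softmax(
            self.q(xray_tokens) @ self.k(ct_tokens).transpose(1, 2) * self.scale, dim=-1
        )
        return xray_tokens + attn @ self.v(ct_tokens)  # residual fusion of CT context


def embedding_reconstruction_loss(pred_embed: torch.Tensor,
                                  target_ct_embed: torch.Tensor) -> torch.Tensor:
    """Penalize the x-ray branch for failing to reconstruct the CT embedding."""
    return 1.0 - F.cosine_similarity(pred_embed, target_ct_embed, dim=-1).mean()
```

The CMA-ES refinement step could likewise be sketched with the open-source `cma` package; the DRR renderer and intensity similarity metric are passed in as caller-supplied callables because the abstract does not specify them.

```python
from typing import Callable

import numpy as np
import cma  # pip install cma


def refine_pose(initial_pose: np.ndarray,
                xray: np.ndarray,
                render_drr: Callable[[np.ndarray], np.ndarray],
                similarity: Callable[[np.ndarray, np.ndarray], float],
                sigma0: float = 1.0) -> np.ndarray:
    """Refine a regressed 6-DoF pose by maximizing intensity similarity with CMA-ES.

    `render_drr(pose)` produces a simulated x-ray (DRR) from the preoperative CT
    at the candidate pose; `similarity(drr, xray)` scores it against the real x-ray.
    """
    def cost(pose):
        # CMA-ES minimizes, so negate the similarity score.
        return -similarity(render_drr(np.asarray(pose)), xray)

    es = cma.CMAEvolutionStrategy(initial_pose.tolist(), sigma0)
    while not es.stop():
        candidates = es.ask()
        es.tell(candidates, [cost(c) for c in candidates])
    return np.asarray(es.result.xbest)
```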
Results
Our approach is verified on both real and simulated x-ray data. Experimental results show that, compared with existing learning-based registration methods, our method achieves a median rotation error of 1.9° and a median translation error of 5.6 mm under a large search range. When evaluated on 52 real x-ray images, it achieves a median rotation error of 1.6° and a median translation error of 3.8 mm, owing to the smaller search range. We also verify the contribution of the LCAT and embedding reconstruction modules within our registration framework: without them, performance degrades to approximately that of random initialization.
Conclusions
In our experiments, the proposed method demonstrates higher accuracy and a larger capture range on both simulated and real x-ray images than existing methods. These encouraging results indicate the potential of our method for future clinical application.
Journal introduction:
Medical Physics publishes original, high-impact physics, imaging science, and engineering research that advances patient diagnosis and therapy through contributions in: 1) basic science developments with high potential for clinical translation; 2) clinical applications of cutting-edge engineering and physics innovations; and 3) broadly applicable and innovative clinical physics developments.
Medical Physics is a journal of global scope and reach. By publishing in Medical Physics, your research will reach an international, multidisciplinary audience including practicing medical physicists as well as physics- and engineering-based translational scientists. We work closely with authors of promising articles to improve their quality.