{"title":"Adaptive sensitivity-fisher regularization for heterogeneous transfer learning of vascular segmentation in laparoscopic videos.","authors":"Xinkai Zhao, Yuichiro Hayashi, Masahiro Oda, Takayuki Kitasaka, Kazunari Misawa, Kensaku Mori","doi":"10.1007/s11548-025-03404-2","DOIUrl":"https://doi.org/10.1007/s11548-025-03404-2","url":null,"abstract":"<p><strong>Purpose: </strong>This study aims to enhance surgical safety by developing a method for vascular segmentation in laparoscopic surgery videos with limited visibility. We introduce an adaptive sensitivity-fisher regularization (ASFR) approach to adapt neural networks, initially trained on non-medical datasets, for vascular segmentation in laparoscopic videos.</p><p><strong>Methods: </strong>Our approach utilizes heterogeneous transfer learning by integrating fisher information and sensitivity analysis to mitigate catastrophic forgetting and overfitting caused by limited annotated data in laparoscopic videos. We calculate fisher information to identify and preserve critical model parameters while using sensitivity measures to guide adjustment for new task.</p><p><strong>Results: </strong>The fine-tuned models demonstrated high accuracy in vascular segmentation across various complex video sequences, including those with obscured vessels. For both invisible and visible vessels, our method achieved an average Dice score of 41.3. In addition to outperforming traditional transfer learning approaches, our method exhibited strong adaptability across multiple advanced video segmentation architectures.</p><p><strong>Conclusion: </strong>This study introduces a novel heterogeneous transfer learning approach, ASFR, which significantly enhances the precision of vascular segmentation in laparoscopic videos. ASFR effectively addresses critical challenges in surgical image analysis and paves the way for broader applications in laparoscopic surgery, promising improved patient outcomes and increased surgical efficiency.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144235858","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing surgical efficiency with an automated scrub nurse robot: a focus on automatic instrument insertion.","authors":"Kitaro Yoshimitsu, Ken Masamune, Fujio Miyawaki","doi":"10.1007/s11548-025-03433-x","DOIUrl":"https://doi.org/10.1007/s11548-025-03433-x","url":null,"abstract":"<p><strong>Purpose: </strong>To address the chronic shortage of skilled scrub nurses, we propose the development of a scrub nurse robot (SNR). This paper describes the third-generation of our SNR, which implements the automatic insertion of surgical instruments (AISI). We focused on optimizing the instrument provision part of the instrument exchange task, which is a crucial role of the scrub nurse.</p><p><strong>Methods: </strong>The third-generation SNR detects the moment when an operating surgeon withdraws an instrument after use from the trocar cannula, automatically conveys the next instrument to the cannula, and inserts only its tip into the cannula. Thereafter, the surgeon is required to grip the instrument and to push it fully into the cannula. This robotic function is designated as AISI. The following three combinations were compared: (1) third-generation SNR and surgeon stand-ins in a laboratory experiment, (2) three human scrub nurses and a skilled expert surgeon in three real surgical cases, (3) second-generation SNR and surgeon stand-ins in a laboratory experiment.</p><p><strong>Results: </strong>The third-generation SNR and surgeon stand-ins were 53% slower and 34% faster, respectively, in targeting the instruments during the instrument exchange sequence compared with the actual OR nurse-surgeon pair and the second-generation SNR-stand-in pair. The average \"eyes-off\" time of the stand-ins assisted by the third-generation SNR was 0.41 s (0 s in 92 out of 138 trials), whereas that of the real surgeon in clinical cases had a mean of 1.47 (N = 138) (range, 0.69-7.24 s) when using the second-generation SNR.</p><p><strong>Conclusion: </strong>Third-generation SNR with AISI can enhance operative efficiency by contributing to smooth instrument exchange, which enhances the surgeon's ability to concentrate on a surgical procedure without interrupting the intraoperative surgical rhythm.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144235925","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cell generation with label evolution diffusion and class mask self-attention.","authors":"Wen Jing, Zixiang Jin, Yi Zhang, Guoxia Xu, Meng Zhao","doi":"10.1007/s11548-025-03443-9","DOIUrl":"https://doi.org/10.1007/s11548-025-03443-9","url":null,"abstract":"<p><strong>Purpose: </strong>Due to the relative difficulty in acquiring histopathological images, the generated cell morphology often presents a fixed pattern and lacks diversity. To this end, we propose the first diffusion generation model based on point diffusion, which can capture the changes and diversity of cell morphology in more detail.</p><p><strong>Methods: </strong>By gradually updating the information of cell morphology during the generation process, we can effectively guide the diffusion model to generate more diverse and realistic cell images. In addition, we introduce a class mask self-attention module to constrain the cell types generated by the diffusion model.</p><p><strong>Results: </strong>We conducted experiments on the public dataset Lizard, and comparative analysis with previous image generation methods showed that our method has excellent performance. Compared with the latest NASDM network, our method achieves a 43.17% improvement in FID and a 46.24% enhancement in IS.</p><p><strong>Conclusions: </strong>We proposed a first-of-its-kind diffusion model that combines point diffusion and class mask self-attention mechanisms. The model can effectively generate diverse data while maintaining the high quality of generated images and performs well.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144235923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Correction: Multi-modal dataset creation for federated learning with DICOM-structured reports.","authors":"Malte Tölle, Lukas Burger, Halvar Kelm, Florian André, Peter Bannas, Gerhard Diller, Norbert Frey, Philipp Garthe, Stefan Groß, Anja Hennemuth, Lars Kaderali, Nina Krüger, Andreas Leha, Simon Martin, Alexander Meyer, Eike Nagel, Stefan Orwat, Clemens Scherer, Moritz Seiffert, Jan Moritz Seliger, Stefan Simm, Tim Friede, Tim Seidler, Sandy Engelhardt","doi":"10.1007/s11548-025-03409-x","DOIUrl":"https://doi.org/10.1007/s11548-025-03409-x","url":null,"abstract":"","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144217478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluating virtual reality as a tool for improving surgical planning in spinal tumors.","authors":"Tina Nomena Herimino Nantenaina, Andrey Titov, Sung-Joo Yuh, Simon Drouin","doi":"10.1007/s11548-025-03440-y","DOIUrl":"https://doi.org/10.1007/s11548-025-03440-y","url":null,"abstract":"<p><strong>Purpose: </strong>Surgical planning is essential before the surgery, especially for spinal tumors resection. During the surgical planning, medical images are analyzed by surgeon on a standardized display mode using a computer. But this display mode has its limit in term of spatial perception of the anatomical structures. Our purpose in this study is to assess the impact of using another display mode like virtual reality (VR) on the surgical planning of spinal tumors resection by comparing VR with conventional computer-based visualization.</p><p><strong>Methods: </strong>A user study was conducted with eight neurosurgeons, who planned six spinal tumor surgeries using both VR and computer visualization modalities. The evaluation focused on the perception of anatomical-functional information from medical images, the identification of anatomical structures, and the selection of surgical approaches represented by the number of anatomical structures traversed to attend the tumor. These parameters were assessed using objective questionnaires developed from a work domain analysis (WDA) already proved in brain surgery. We then adapted the WDA to spinal surgery.</p><p><strong>Results: </strong>VR made it easier to perceive a greater number of anatomical-functional information compared to computer visualization. Surgeons identified a greater anatomical structure with VR compared to computer visualization. Furthermore, surgeons selected additional anatomical structures to be traversed to reach the tumor when using VR, leading to a more precise selection of surgical approaches. These findings can predict the added value of VR in helping surgical decision-making when planning surgery.</p><p><strong>Conclusion: </strong>VR can be a promising tool for surgical planning by providing an immersive and interactive perspective that enhances understanding of anatomy. However, our finding is from an exploratory study, more clinical cases should be conducted to demonstrate its feasibility and reliability.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144227501","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SurgIPC: a convex image perspective correction method to boost surgical keypoint matching.","authors":"Rasoul Sharifian, Adrien Bartoli","doi":"10.1007/s11548-025-03411-3","DOIUrl":"https://doi.org/10.1007/s11548-025-03411-3","url":null,"abstract":"<p><strong>Purpose: </strong>Keypoint detection and matching is a fundamental step in surgical image analysis. However, existing methods are not perspective invariant and thus degrade with increasing surgical camera motion amplitude. One approach to address this problem is by warping the image before keypoint detection. However, existing warping methods are inapplicable to surgical images, as they make unrealistic assumptions such as scene planarity.</p><p><strong>Methods: </strong>We propose Surgical Image Perspective Correction (SurgIPC), a convex method, specifically a linear least-squares (LLS) one, overcoming the above limitations. Using a depth map, SurgIPC warps the image to deal with the perspective effect. The warp exploits the theory of conformal flattening: it attempts to preserve the angles measured on the depth map and after warping, while mitigating the effects of image resampling.</p><p><strong>Results: </strong>We evaluate SurgIPC under controlled conditions using a liver phantom with ground-truth camera poses and with real surgical images. The results demonstrate a significant improvement in the number of correct correspondences when SurgIPC is applied. Furthermore, experiments on downstream tasks, including keyframe matching and 3D reconstruction using structure-from-motion (SfM), highlight significant performance gains.</p><p><strong>Conclusion: </strong>SurgIPC improves keypoint matching. The use of LLS ensures efficient and reliable computations. SurgIPC can thus be easily included in existing computer-aided surgery systems.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-06-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144217479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Intelligent control of robotic X-ray devices using a language-promptable digital twin.","authors":"Benjamin D Killeen, Anushri Suresh, Catalina Gomez, Blanca Íñigo, Christopher Bailey, Mathias Unberath","doi":"10.1007/s11548-025-03351-y","DOIUrl":"10.1007/s11548-025-03351-y","url":null,"abstract":"<p><strong>Purpose: </strong>Natural language offers a convenient, flexible interface for controlling robotic C-arm X-ray systems, making advanced functionality and controls easily accessible.Please confirm if the author names are presented accurately and in the correct sequence (given name, middle name/initial, family name). Author 1 Given name: [Benjamin D.] Last name [Killeen]. Also, kindly confirm the details in the metadata are correct. However, enabling language interfaces requires specialized artificial intelligence (AI) models that interpret X-ray images to create a semantic representation for language-based reasoning. The fixed outputs of such AI models fundamentally limits the functionality of language controls that users may access. Incorporating flexible and language-aligned AI models that can be prompted through language control facilitates more flexible interfaces for a much wider variety of tasks and procedures.</p><p><strong>Methods: </strong>Using a language-aligned foundation model for X-ray image segmentation, our system continually updates a patient digital twin based on sparse reconstructions of desired anatomical structures. This allows for multiple autonomous capabilities, including visualization, patient-specific viewfinding, and automatic collimation from novel viewpoints, enabling complex language control commands like \"Focus in on the lower lumbar vertebrae.\"</p><p><strong>Results: </strong>In a cadaver study, multiple users were able to visualize, localize, and collimate around structures across the torso region using only verbal commands to control a robotic X-ray system, with 84% end-to-end success. In post hoc analysis of randomly oriented images, our patient digital twin was able to localize 35 commonly requested structures from a given image to within <math><mrow><mn>51.68</mn> <mo>±</mo> <mn>30.84</mn></mrow> </math> mm, which enables localization and isolation of the object from arbitrary orientations.</p><p><strong>Conclusion: </strong>Overall, we show how intelligent robotic X-ray systems can incorporate physicians' expressed intent directly. Existing foundation models for intra-operative X-ray image analysis exhibit certain failure modes. Nevertheless, our results suggest that as these models become more capable, they can facilitate highly flexible, intelligent robotic C-arms.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":"1125-1134"},"PeriodicalIF":2.3,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144020481","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spatiotemporally constrained 3D reconstruction from biplanar digital subtraction angiography.","authors":"Sarah Frisken, Vivek Gopalakrishnan, David Dimitris Chlorogiannis, Nazim Haouchine, Alexandre Cafaro, Alexandra J Golby, William M Wells Iii, Rose Du","doi":"10.1007/s11548-025-03427-9","DOIUrl":"https://doi.org/10.1007/s11548-025-03427-9","url":null,"abstract":"<p><strong>Purpose: </strong>Our goal is to reconstruct 3D cerebral vessels from two 2D digital subtraction angiography (DSA) images acquired using a biplane scanner. This could provide intraoperative 3D imaging with 2-5 × spatial and 20 × temporal resolution of 3D magnetic resonance angiography, computed tomography angiography (CTA), or rotational DSA. Because many interventional radiology suites have biplane scanners, our method could be easily integrated into clinical workflows.</p><p><strong>Methods: </strong>We present a constrained 3D reconstruction method that utilizes vessel centerlines, radii, and the flow of contrast agent through vessels from DSA. The reconstructed volume samples 'vesselness' at each voxel, i.e., its probability of containing a vessel. We present evaluation metrics which we used to optimize reconstruction parameters and evaluate our method on synthetic data. We provide preliminary results on clinical data. To handle clinical data, we developed a software tool for extracting vessel centerlines, radii, and contrast arrival times from clinical DSA. We provide an automated method for registering DSA to CTA which allows us to compare reconstructed vessels with vessels extracted from CTA.</p><p><strong>Result: </strong>Our method reduced reconstruction artifacts in vesselness volumes for both synthetic and clinical data. In synthetic DSA, where 3D ground-truth vessel centerlines are available, our constrained reconstruction method improved accuracy, selectivity, and Dice scores with two views compared to existing sparse reconstruction methods with up to 16 views.</p><p><strong>Conclusion: </strong>Incorporating additional constraints into 3D reconstruction can successfully reduce artifacts introduced when a complex 3D structure like the brain vasculature is reconstructed from a small number of 2D views.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144200762","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Video-based multi-target multi-camera tracking for postoperative phase recognition.","authors":"Franziska Jurosch, Janik Zeller, Lars Wagner, Ege Özsoy, Alissa Jell, Sven Kolb, Dirk Wilhelm","doi":"10.1007/s11548-025-03344-x","DOIUrl":"10.1007/s11548-025-03344-x","url":null,"abstract":"<p><strong>Purpose: </strong>Deep learning methods are commonly used to generate context understanding to support surgeons and medical professionals. By expanding the current focus beyond the operating room (OR) to postoperative workflows, new forms of assistance are possible. In this article, we propose a novel multi-target multi-camera tracking (MTMCT) architecture for postoperative phase recognition, location tracking, and automatic timestamp generation.</p><p><strong>Methods: </strong>Three RGB cameras were used to create a multi-camera data set containing 19 reenacted postoperative patient flows. Patients and beds were annotated and used to train the custom MTMCT architecture. It includes bed and patient tracking for each camera and a postoperative patient state module to provide the postoperative phase, current location of the patient, and automatically generated timestamps.</p><p><strong>Results: </strong>The architecture demonstrates robust performance for single- and multi-patient scenarios by embedding medical domain-specific knowledge. In multi-patient scenarios, the state machine representing the postoperative phases has a traversal accuracy of <math><mrow><mn>84.9</mn> <mo>±</mo> <mn>6.0</mn> <mo>%</mo></mrow> </math> , <math><mrow><mn>91.4</mn> <mo>±</mo> <mn>1.5</mn> <mo>%</mo></mrow> </math> of timestamps are generated correctly, and the patient tracking IDF1 reaches <math><mrow><mn>92.0</mn> <mo>±</mo> <mn>3.6</mn> <mo>%</mo></mrow> </math> . Comparative experiments show the effectiveness of using AFLink for matching partial trajectories in postoperative settings.</p><p><strong>Conclusion: </strong>As our approach shows promising results, it lays the foundation for real-time surgeon support, enhancing clinical documentation and ultimately improving patient care.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":"1159-1166"},"PeriodicalIF":2.3,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12167295/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144045515","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Stable distance regression via spatial-frequency state space model for robot-assisted endomicroscopy.","authors":"Mengyi Zhou, Chi Xu, Stamatia Giannarou","doi":"10.1007/s11548-025-03353-w","DOIUrl":"10.1007/s11548-025-03353-w","url":null,"abstract":"<p><strong>Purpose: </strong>Probe-based confocal laser endomicroscopy (pCLE) is a noninvasive technique that enables the direct visualization of tissue at a microscopic level in real time. One of the main challenges in using pCLE is maintaining the probe within a working range of micrometer scale. As a result, the need arises for automatically regressing the probe-tissue distance to enable precise robotic tissue scanning.</p><p><strong>Methods: </strong>In this paper, we propose the spatial frequency bidirectional structured state space model (SF-BiS4D) for pCLE probe-tissue distance regression. This model advances traditional state space models by processing image sequences bidirectionally and analyzing data in both the frequency and spatial domains. Additionally, we introduce a guided trajectory planning strategy that generates pseudo-distance labels, facilitating the training of sequential models to generate smooth and stable robotic scanning trajectories. To improve inference speed, we also implement a hierarchical guided fine-tuning (GF) approach that efficiently reduces the size of the BiS4D model while maintaining performance.</p><p><strong>Results: </strong>The performance of our proposed model has been evaluated both qualitatively and quantitatively using the pCLE regression dataset (PRD). In comparison with existing state-of-the-art (SOTA) methods, our approach demonstrated superior performance in terms of accuracy and stability.</p><p><strong>Conclusion: </strong>Our proposed deep learning-based framework effectively improves distance regression for microscopic visual servoing and demonstrates its potential for integration into surgical procedures requiring precise real-time intraoperative imaging.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":"1167-1174"},"PeriodicalIF":2.3,"publicationDate":"2025-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12167353/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144063151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}