{"title":"Super-resolution for localizing electrode grids as small, deformable objects during epilepsy surgery using augmented reality headsets.","authors":"Hizirwan S Salim, Abdullah Thabit, Sem Hoogteijling, Maryse A van 't Klooster, Theo van Walsum, Maeike Zijlmans, Mohamed Benmahdjoub","doi":"10.1007/s11548-025-03401-5","DOIUrl":"https://doi.org/10.1007/s11548-025-03401-5","url":null,"abstract":"<p><strong>Purpose: </strong>Epilepsy surgery is a potential curative treatment for people with focal epilepsy. Intraoperative electrocorticogram (ioECoG) recordings from the brain guide neurosurgeons during resection. Accurate localization of epileptic activity and thus the ioECoG grids is critical for successful outcomes. We aim to develop and evaluate the feasibility of a novel method for localizing small, deformable objects using augmented reality (AR) head-mounted displays (HMDs) and artificial intelligence (AI). AR HMDs combine cameras and patient overlay visualization in a compact design.</p><p><strong>Methods: </strong>We developed an image processing method for the HoloLens 2 to localize a 64-electrode ioECoG grid even when individual electrodes are indistinguishable due to low resolution. The method combines object detection, super-resolution, and pose estimation AI models with stereo triangulation. A synthetic dataset of 90,000 images trained the super-resolution and pose estimation models. The system was tested in a controlled environment against an optical tracker as ground truth. Accuracy was evaluated at distances between 40 and 90 cm.</p><p><strong>Results: </strong>The system achieved sub-5 mm accuracy in localizing the ioECoG grid at distances shorter than 60 cm. At 40 cm, the accuracy remained below 2 mm, with an average standard deviation of less than 0.5 mm. At 60 cm the method processed on average 24 stereo frames per second.</p><p><strong>Conclusion: </strong>This study demonstrates the feasibility of localizing small, deformable objects like ioECoG grids using AR HMDs. While results indicate clinically acceptable accuracy, further research is needed to validate the method in clinical environments and assess its impact on surgical precision and outcomes.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144327675","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
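The stereo-triangulation step named in the Methods above can be sketched as standard linear (DLT) triangulation of one electrode from its pixel coordinates in two cameras. The projection matrices and 3D point below are toy values, not the HoloLens 2 calibration:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from two pixel
    observations x1, x2 and two 3x4 projection matrices P1, P2."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)          # null space of A gives the point
    X = Vt[-1]
    return X[:3] / X[3]                  # de-homogenize

# Toy stereo rig: shared intrinsics, 100 mm baseline along x (units: mm).
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-100.0], [0], [0]])])

X_true = np.array([20.0, -10.0, 600.0])  # an electrode 60 cm away
x1 = P1 @ np.append(X_true, 1); x1 = x1[:2] / x1[2]
x2 = P2 @ np.append(X_true, 1); x2 = x2[:2] / x2[2]
X_est = triangulate(P1, P2, x1, x2)
```

With noiseless observations the point is recovered exactly; in practice the pixel coordinates come from the pose-estimation model and carry detection noise.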
{"title":"Quality control system for patient positioning and filling in meta-information for chest X-ray examinations.","authors":"A A Borisov, S S Semenov, Yu S Kirpichev, K M Arzamasov, O V Omelyanskaya, A V Vladzymyrskyy, Yu A Vasilev","doi":"10.1007/s11548-025-03468-0","DOIUrl":"https://doi.org/10.1007/s11548-025-03468-0","url":null,"abstract":"<p><strong>Purpose: </strong>During radiography, irregularities occur, leading to a decrease in the diagnostic value of the images obtained. The purpose of this work was to develop a system for automated quality assurance of patient positioning in chest radiographs, with detection of suboptimal contrast, brightness, and metadata errors.</p><p><strong>Methods: </strong>The quality assurance system was trained and tested using more than 69,000 X-rays of the chest and other anatomical areas from the Unified Radiological Information Service (URIS) and several open datasets. Our dataset included studies regardless of a patient's gender and race, with the sole exclusion criterion being age below 18 years. A training dataset of radiographs labeled by expert radiologists was used to train an ensemble of modified deep convolutional neural network architectures ResNet152V2 and VGG19 to identify various quality deficiencies. Model performance was assessed using area under the receiver operating characteristic curve (ROC-AUC), precision, recall, F1-score, and accuracy metrics.</p><p><strong>Results: </strong>Seven neural network models were trained to classify radiographs by the following quality deficiencies: failure to capture the target anatomic region, chest rotation, suboptimal brightness, incorrect anatomical area, projection errors, and improper photometric interpretation. All metrics for each model exceed 95%, indicating high predictive value. All models were combined into a unified system for evaluating radiograph quality. The processing time per image is approximately 3 s.</p><p><strong>Conclusion: </strong>The system supports multiple use cases: integration into automated radiographic workstations, external quality assurance for radiology departments, acquisition quality audits for municipal health systems, and routing of studies to diagnostic AI models.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144327674","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
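The two-backbone ensemble described above can be sketched minimally: per-deficiency probabilities from each network are averaged and thresholded. The probability values and the 0.5 threshold below are illustrative stand-ins, not outputs of the paper's trained models:

```python
import numpy as np

# Hypothetical per-deficiency probabilities from the two backbones for a
# batch of 4 radiographs and 2 deficiency classes (values are made up).
p_resnet = np.array([[0.92, 0.10], [0.40, 0.80], [0.05, 0.30], [0.70, 0.65]])
p_vgg    = np.array([[0.88, 0.20], [0.55, 0.75], [0.10, 0.20], [0.60, 0.70]])

def ensemble_predict(prob_list, threshold=0.5):
    """Average member probabilities, then flag each quality deficiency
    whose mean probability reaches the decision threshold."""
    mean_p = np.mean(prob_list, axis=0)
    return mean_p, mean_p >= threshold

mean_p, flags = ensemble_predict([p_resnet, p_vgg])
```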
{"title":"Recurrent multi-view 6DoF pose estimation for marker-less surgical tool tracking.","authors":"Niklas Agethen, Janis Rosskamp, Tom L Koller, Jan Klein, Gabriel Zachmann","doi":"10.1007/s11548-025-03436-8","DOIUrl":"https://doi.org/10.1007/s11548-025-03436-8","url":null,"abstract":"<p><strong>Purpose: </strong>Marker-based tracking of surgical instruments facilitates surgical navigation systems with high precision, but requires time-consuming preparation and is prone to stains or occluded markers. Deep learning promises marker-less tracking based solely on RGB videos to address these challenges. In this paper, object pose estimation is applied to surgical instrument tracking using a novel deep learning architecture.</p><p><strong>Methods: </strong>We combine pose estimation from multiple views with recurrent neural networks to better exploit temporal coherence for improved tracking. We also investigate the performance under conditions where the instrument is obscured. We enhance an existing pose (distribution) estimation pipeline with a spatio-temporal feature extractor that allows for feature incorporation along an entire sequence of frames.</p><p><strong>Results: </strong>On a synthetic dataset we achieve a mean tip error below 1.0 mm and an angle error below 0.2 <math><mmultiscripts><mrow></mrow> <mrow></mrow> <mo>∘</mo></mmultiscripts> </math> using a four-camera setup. On a real dataset with four cameras we achieve an error below 3.0 mm. Under limited instrument visibility our recurrent approach can predict the tip position approximately 3 mm more precisely than the non-recurrent approach.</p><p><strong>Conclusion: </strong>Our findings on a synthetic dataset of surgical instruments demonstrate that deep-learning-based tracking using multiple cameras simultaneously can be competitive with marker-based systems. Additionally, the temporal information obtained through the architecture's recurrent nature is advantageous when the instrument is occluded. The synthesis of multi-view and recurrence has thus been shown to enhance the reliability and usability of high-precision surgical pose estimation.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144318641","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
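The tip-error metric reported above can be computed by mapping a tool-frame tip offset through the estimated and ground-truth 6DoF poses and comparing the resulting world positions. The poses and the 150 mm shaft length below are assumed toy values:

```python
import numpy as np

def tip_position(R, t, tip_local):
    """Map a tool-frame tip offset into world coordinates via pose (R, t)."""
    return R @ tip_local + t

def tip_error_mm(R_pred, t_pred, R_gt, t_gt, tip_local):
    """Euclidean distance (mm) between predicted and ground-truth tip."""
    return np.linalg.norm(tip_position(R_pred, t_pred, tip_local)
                          - tip_position(R_gt, t_gt, tip_local))

# Toy example: identical rotation, 0.6 mm translation offset,
# tool tip assumed 150 mm along the shaft axis.
R = np.eye(3)
tip = np.array([0.0, 0.0, 150.0])
err = tip_error_mm(R, np.array([0.0, 0.6, 0.0]), R, np.zeros(3), tip)
```

Note that with a rotation error the tip error grows with the shaft length, which is why tip error is a stricter metric than translation error alone.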
{"title":"A shape completion model for corrective osteotomy of distal radius malunion.","authors":"Camiel J Smees, Judith Olde Heuvel, Stein van der Heide, Esmee D van Uum, Anne J H Vochteloo, Gabriëlle J M Tuijthof","doi":"10.1007/s11548-025-03454-6","DOIUrl":"https://doi.org/10.1007/s11548-025-03454-6","url":null,"abstract":"<p><strong>Purpose: </strong>When performing 3D planning for osteotomies in patients with distal radius malunion, the contralateral radius is commonly used as a template for reconstruction. However, in approximately 10% of the cases, the contralateral radius is not suitable for use. A shape completion model may provide an alternative by generating a healthy radius model based on the proximal part of the malunited bone. The aim of this study is to develop and clinically evaluate such a shape completion model.</p><p><strong>Method: </strong>A total of 100 segmented CT scans of healthy radii were used, with 80 scans used to train a statistical shape model (SSM). This SSM formed the basis for a shape completion model capable of predicting the distal 12% based on the proximal 88%. Hyperparameters were optimized using 10 segmented 3D models, and the remaining 10 models were reserved for testing the performance of the shape completion model.</p><p><strong>Results: </strong>The shape completion model consistently produced clinically viable 3D reconstructions. The mean absolute errors between the predicted and corresponding reference models in the rotational errors were 2.6 ± 1.7° for radial inclination, 3.6 ± 2.2° for volar tilt, and 2.6 ± 2.8° for axial rotation. Translational errors were 0.7 ± 0.6 mm in dorsal shift, 0.8 ± 0.5 mm in radial shift, and 1.7 ± 1.1 mm in lengthening.</p><p><strong>Conclusion: </strong>This study successfully developed a shape completion model capable of reconstructing healthy 3D radius models based on the proximal bone. The observed errors indicate that the model is viable for use in 3D planning for patients lacking a healthy contralateral radius. However, routine use in patients with a healthy contralateral radius is not yet advised, as error margins exceed bilateral differences observed in healthy populations. The most clinically relevant error found in the model, length mismatch, can be easily corrected during 3D planning if the ipsilateral ulna remains intact.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144318639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
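SSM-based shape completion of the kind described above can be sketched with a PCA shape model: mode weights are fitted by least squares to the known (proximal) entries of a shape vector, then the fitted model is evaluated at the missing (distal) entries. The toy 1D "shapes" below stand in for flattened mesh coordinates; the paper's actual model may differ in detail:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy training set: 40 shape vectors of length 10 generated from 3 latent
# modes; the last 2 entries play the role of the distal part to predict.
n_modes = 3
latent_basis = rng.normal(size=(n_modes, 10))
weights = rng.normal(size=(40, n_modes))
shapes = weights @ latent_basis + 5.0          # shared mean offset

# Build the statistical shape model (mean + principal modes via SVD/PCA).
mean = shapes.mean(axis=0)
_, _, Vt = np.linalg.svd(shapes - mean, full_matrices=False)
modes = Vt[:n_modes]                           # (n_modes, 10)

def complete(known, known_idx, missing_idx):
    """Fit mode weights to the known (proximal) entries by least squares,
    then reconstruct the missing (distal) entries from the fitted SSM."""
    b, *_ = np.linalg.lstsq(modes[:, known_idx].T,
                            known - mean[known_idx], rcond=None)
    return mean[missing_idx] + b @ modes[:, missing_idx]

# A held-out shape drawn from the same latent model.
test_shape = rng.normal(size=n_modes) @ latent_basis + 5.0
pred_distal = complete(test_shape[:8], np.arange(8), np.arange(8, 10))
```

Because the toy test shape lies exactly in the span of the model, the completion is exact here; with real bones the residual is precisely the reconstruction error the study quantifies.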
{"title":"MHAHF-UNet: a multi-scale hybrid attention hierarchy fusion network for carotid artery segmentation.","authors":"Changshuo Jiang, Lin Gao, Wei Li, Maoyang Zou, Qingxiao Zheng, Xuhua Qiao","doi":"10.1007/s11548-025-03449-3","DOIUrl":"https://doi.org/10.1007/s11548-025-03449-3","url":null,"abstract":"<p><strong>Purpose: </strong>Carotid plaque is an early manifestation of carotid atherosclerosis, and its accurate segmentation helps to assess cardiovascular disease risk. However, existing carotid artery segmentation algorithms struggle to accurately capture the structural features of morphologically diverse plaques and make little effective use of multilayer features.</p><p><strong>Methods: </strong>To address these problems, this paper proposes a multi-scale hybrid attention hierarchical fusion U-network (MHAHF-UNet) for segmenting ambiguous plaques in carotid artery images, improving segmentation accuracy for images with complex structures. The network first introduces the median-enhanced orthogonal convolution module (MEOConv), which combines a median-enhanced ternary channel mechanism with a depth-orthogonal convolution spatial mechanism to suppress noise interference in ultrasound images while preserving the ability to perceive multi-scale features. Second, it adopts a multi-fusion group convolutional gating module, which integrates shallow detail features with deep semantic features through an adaptive group-convolution control strategy and flexibly regulates the transfer weights of features at different levels.</p><p><strong>Results: </strong>Experiments show that the MHAHF-UNet model achieves a Dice coefficient of <math><mrow><mn>82.46</mn> <mo>±</mo> <mn>0.31</mn> <mo>%</mo></mrow> </math> and an IOU of <math><mrow><mn>71.45</mn> <mo>±</mo> <mn>0.37</mn> <mo>%</mo></mrow> </math> in the carotid artery segmentation task.</p><p><strong>Conclusion: </strong>The model is expected to provide strong support for the prevention and treatment of cardiovascular diseases.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144318640","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
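The Dice coefficient and IOU reported above are standard overlap metrics on binary masks; a minimal sketch with toy masks:

```python
import numpy as np

def dice_iou(pred, gt):
    """Dice coefficient and IOU for binary masks, as percentages."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    dice = 2 * inter / (pred.sum() + gt.sum())
    iou = inter / union
    return 100 * dice, 100 * iou

# Toy 2x3 masks: 2 overlapping pixels, 3 positives in each mask.
pred = np.array([[1, 1, 0], [0, 1, 0]])
gt   = np.array([[1, 0, 0], [0, 1, 1]])
d, j = dice_iou(pred, gt)
```

Dice is always at least as large as IOU for the same masks (Dice = 2·IOU/(1+IOU)), consistent with the 82.46% vs. 71.45% pairing reported.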
{"title":"Interpretable deep fuzzy network-aided detection of central lymph node metastasis status in papillary thyroid carcinoma.","authors":"Wenxu Wang, Zhenyuan Ning, Jifan Zhang, Yu Zhang, Weizhen Wang","doi":"10.1007/s11548-025-03453-7","DOIUrl":"https://doi.org/10.1007/s11548-025-03453-7","url":null,"abstract":"<p><strong>Purpose: </strong>The non-invasive assessment of central lymph node metastasis (CLNM) in patients with papillary thyroid carcinoma (PTC) plays a crucial role in supporting treatment decisions and prognosis planning. This study aims to use an interpretable deep fuzzy network guided by expert knowledge to predict the CLNM status of patients with PTC from ultrasound images.</p><p><strong>Methods: </strong>A total of 1019 PTC patients were enrolled in this study, comprising 465 CLNM patients and 554 non-CLNM patients. Pathological diagnosis served as the gold standard to determine metastasis status. Clinical and morphological features of the thyroid were collected as expert knowledge to guide the deep fuzzy network in predicting CLNM status. The network consisted of a region of interest (ROI) segmentation module, a knowledge-aware feature extraction module, and a fuzzy prediction module. The network was trained on 652 patients, validated on 163 patients and tested on 204 patients.</p><p><strong>Results: </strong>The model exhibited promising performance in predicting CLNM status, achieving the area under the receiver operating characteristic curve (AUC), accuracy, precision, sensitivity and specificity of 0.786 (95% CI 0.720-0.846), 0.745 (95% CI 0.681-0.799), 0.727 (95% CI 0.636-0.819), 0.696 (95% CI 0.594-0.789), and 0.786 (95% CI 0.712-0.864), respectively. In addition, the fuzzy rules learned by the model are easy to understand and explain, giving the system good interpretability.</p><p><strong>Conclusion: </strong>The deep fuzzy network guided by expert knowledge predicted CLNM status of PTC patients with high accuracy and good interpretability, and may be considered as an effective tool to guide preoperative clinical decision-making.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144303545","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
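The accuracy, precision, sensitivity, and specificity reported above all derive from a 2x2 confusion matrix. The counts below are illustrative only, chosen to be roughly consistent with the reported operating point on a 204-patient test split; they are not taken from the paper:

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, sensitivity, specificity from a 2x2 confusion
    matrix, taking CLNM-positive as the positive class."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),          # PPV among predicted positives
        "sensitivity": tp / (tp + fn),        # recall of true positives
        "specificity": tn / (tn + fp),        # recall of true negatives
    }

# Hypothetical counts for a 204-patient test set (illustrative values).
m = classification_metrics(tp=65, fp=24, tn=87, fn=28)
```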
{"title":"Diffusion-driven distillation and contrastive learning for class-incremental semantic segmentation of laparoscopic images.","authors":"Xinkai Zhao, Yuichiro Hayashi, Masahiro Oda, Takayuki Kitasaka, Kensaku Mori","doi":"10.1007/s11548-025-03405-1","DOIUrl":"https://doi.org/10.1007/s11548-025-03405-1","url":null,"abstract":"<p><strong>Purpose: </strong>Understanding anatomical structures in laparoscopic images is crucial for various types of laparoscopic surgery. However, creating specialized datasets for each type is both inefficient and challenging. This highlights the clinical significance of exploring class-incremental semantic segmentation (CISS) for laparoscopic images. Although CISS has been widely studied in diverse image datasets, in clinical settings, incremental data typically consists of new patient images rather than reusing previous images, necessitating a novel algorithm.</p><p><strong>Methods: </strong>We introduce a distillation approach driven by a diffusion model for CISS of laparoscopic images. Specifically, an unconditional diffusion model is trained to generate synthetic laparoscopic images, which are then incorporated into subsequent training steps. A distillation network is employed to extract and transfer knowledge from networks trained in earlier steps. Additionally, to address the challenge posed by the limited semantic information available in individual laparoscopic images, we employ cross-image contrastive learning, enhancing the model's ability to distinguish subtle variations across images.</p><p><strong>Results: </strong>Our method was trained and evaluated on all 11 anatomical structures from the Dresden Surgical Anatomy Dataset, which presents significant challenges due to its dispersed annotations. Extensive experiments demonstrate that our approach outperforms other methods, especially in difficult categories such as the ureter and vesicular glands, where it surpasses even supervised offline learning.</p><p><strong>Conclusion: </strong>This study is the first to address class-incremental semantic segmentation for laparoscopic images, significantly improving the adaptability of segmentation models to new anatomical classes in surgical procedures.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144295356","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
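The knowledge-transfer step described above can be sketched with the common temperature-scaled KL distillation loss, where the current (student) network is pulled toward the previous-step (teacher) network's class distribution; the paper's exact loss formulation may differ:

```python
import numpy as np

def softmax(z, T=1.0):
    """Numerically stable softmax with temperature T."""
    z = z / T
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Temperature-scaled KL divergence KL(teacher || student), averaged
    over the batch and rescaled by T^2 (standard Hinton-style scaling)."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float((p * (np.log(p) - np.log(q))).sum(axis=-1).mean() * T * T)

t = np.array([[2.0, 0.5, -1.0]])               # toy teacher logits
loss_same = distillation_loss(t, t)            # identical nets -> zero loss
loss_diff = distillation_loss(np.zeros((1, 3)), t)
```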
{"title":"Evaluation of augmented reality guidance for glenoid pin placement in total shoulder arthroplasty.","authors":"Taylor Frantz, Frederick van Gestel, Pieter Slagmolen, Johnny Duerinck, Thierry Scheerlinck, Jef Vandemeulebroucke","doi":"10.1007/s11548-025-03444-8","DOIUrl":"https://doi.org/10.1007/s11548-025-03444-8","url":null,"abstract":"<p><strong>Purpose: </strong>Computer-aided navigation and patient-specific 3D printed guides have demonstrated superior outcomes in total shoulder arthroplasty (TSA). Nevertheless, few TSAs are inserted using these technologies. Head-worn augmented reality (AR) devices can provide intuitive 3D computer navigation to the surgeon. This study investigates AR navigation in conjunction with adaptive spatial drift correction for TSA.</p><p><strong>Methods: </strong>A phantom study was performed to assess the performance of AR navigated pin placement in TSA. Two medical experts performed a total of 12 pin placements into phantom scapula; six were placed using an end-to-end AR-navigated technique, and six using a common freehand technique. Inside-out infrared (IR) tracking was designed and integrated into the AR headset to correct for device drift and provide tool tracking. Additionally, the impact of IR tool tracking, registration, and superposed/juxtaposed visualization techniques was investigated.</p><p><strong>Results: </strong>The AR-navigated pin placement resulted in a mean entry point error of 1.06 mm ± 0.64 mm and directional error of <math><mrow><mn>1</mn> <mo>.</mo> <msup><mn>66</mn> <mo>∘</mo></msup> <mo>±</mo> <mn>0</mn> <mo>.</mo> <msup><mn>65</mn> <mo>∘</mo></msup> </mrow> </math> . Compared with the freehand technique, AR navigation resulted in improved directional outcomes ( <math><mrow><mi>p</mi> <mo>=</mo> <mn>0.03</mn></mrow> </math> ), while entry point accuracy was not significantly different ( <math><mrow><mi>p</mi> <mo>=</mo> <mn>0.44</mn></mrow> </math> ). IR tool tracking error was 1.47 mm ± 0.69 mm and <math><mrow><mn>0</mn> <mo>.</mo> <msup><mn>92</mn> <mo>∘</mo></msup> <mo>±</mo> <mn>0</mn> <mo>.</mo> <msup><mn>50</mn> <mo>∘</mo></msup> </mrow> </math> , and registration error was 4.32 mm ± 1.75 mm and <math><mrow><mn>2</mn> <mo>.</mo> <msup><mn>56</mn> <mo>∘</mo></msup> <mo>±</mo> <mn>0</mn> <mo>.</mo> <msup><mn>82</mn> <mo>∘</mo></msup> </mrow> </math> . No statistical difference between AR visualization techniques was found in entry point ( <math><mrow><mi>p</mi> <mo>=</mo> <mn>0.22</mn></mrow> </math> ) or directional ( <math><mrow><mi>p</mi> <mo>=</mo> <mn>0.31</mn></mrow> </math> ) errors.</p><p><strong>Conclusion: </strong>AR navigation allowed for comparable pin placement outcomes with those reported in the literature for patient-specific 3D printed guides; moreover, it complements the patient-specific planning without the need for the guides themselves.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144267865","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
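The two error measures above are straightforward to compute from planned and achieved pin poses: entry-point error is a Euclidean distance, and directional error is the angle between the two pin axes. The vectors below are toy values for illustration:

```python
import numpy as np

def entry_point_error(p_pred, p_gt):
    """Euclidean distance (mm) between planned and achieved entry points."""
    return np.linalg.norm(p_pred - p_gt)

def directional_error_deg(d_pred, d_gt):
    """Angle (degrees) between the planned and achieved pin axes."""
    c = np.dot(d_pred, d_gt) / (np.linalg.norm(d_pred) * np.linalg.norm(d_gt))
    return np.degrees(np.arccos(np.clip(c, -1.0, 1.0)))  # clip guards rounding

e = entry_point_error(np.array([1.0, 0.5, 0.0]), np.array([0.0, 0.5, 0.0]))
a = directional_error_deg(
    np.array([0.0, 0.0, 1.0]),
    np.array([0.0, np.sin(np.radians(2.0)), np.cos(np.radians(2.0))]),
)
```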
{"title":"Enhancing generalization in zero-shot multi-label endoscopic instrument classification.","authors":"Raphaela Maerkl, Tobias Rueckert, David Rauber, Max Gutbrod, Danilo Weber Nunes, Christoph Palm","doi":"10.1007/s11548-025-03439-5","DOIUrl":"https://doi.org/10.1007/s11548-025-03439-5","url":null,"abstract":"<p><strong>Purpose: </strong>Recognizing previously unseen classes with neural networks is a significant challenge due to their limited generalization capabilities. This issue is particularly critical in safety-critical domains such as medical applications, where accurate classification is essential for reliability and patient safety. Zero-shot learning methods address this challenge by utilizing additional semantic data, with their performance relying heavily on the quality of the generated embeddings.</p><p><strong>Methods: </strong>This work investigates the use of full descriptive sentences, generated by a Sentence-BERT model, as class representations, compared to simpler category-based word embeddings derived from a BERT model. Additionally, the impact of z-score normalization as a post-processing step on these embeddings is explored. The proposed approach is evaluated on a multi-label generalized zero-shot learning task, focusing on the recognition of surgical instruments in endoscopic images from minimally invasive cholecystectomies.</p><p><strong>Results: </strong>The results demonstrate that combining sentence embeddings and z-score normalization significantly improves model performance. For unseen classes, the AUROC improves from 43.9 % to 64.9 %, and the multi-label accuracy from 26.1 % to 79.5 %. Overall performance measured across both seen and unseen classes improves from 49.3 % to 64.9 % in AUROC and from 37.3 % to 65.1 % in multi-label accuracy, highlighting the effectiveness of our approach.</p><p><strong>Conclusion: </strong>These findings demonstrate that sentence embeddings and z-score normalization can substantially enhance the generalization performance of zero-shot learning models. However, as the study is based on a single dataset, future work should validate the method across diverse datasets and application domains to establish its robustness and broader applicability.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144267864","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
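The z-score post-processing and similarity-based zero-shot scoring can be sketched as follows: each embedding dimension is standardized across classes, and an image feature is scored against each normalized class embedding by cosine similarity. The 4-dimensional embeddings below are toy stand-ins for Sentence-BERT vectors:

```python
import numpy as np

def zscore(E):
    """Z-score normalize each embedding dimension across classes."""
    return (E - E.mean(axis=0)) / E.std(axis=0)

def multilabel_scores(image_feat, class_emb):
    """Cosine similarity of one image feature against every class
    embedding; in a multi-label setup, each class whose score clears a
    threshold would be predicted present."""
    C = class_emb / np.linalg.norm(class_emb, axis=1, keepdims=True)
    f = image_feat / np.linalg.norm(image_feat)
    return C @ f

# Toy embeddings for three instrument classes (rows) in 4 dimensions.
E = np.array([[0.9, 0.1, 0.2, 0.1],
              [0.1, 0.8, 0.1, 0.3],
              [0.2, 0.1, 0.9, 0.2]])
scores = multilabel_scores(np.array([0.85, 0.15, 0.25, 0.1]), zscore(E))
```

The image feature here is deliberately close to class 0, so that class receives the highest score after normalization.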
{"title":"Two-stream MeshCNN for key anatomical segmentation on the liver surface.","authors":"Xukun Zhang, Sharib Ali, Minghao Han, Yanlan Kang, Xiaoying Wang, Lihua Zhang","doi":"10.1007/s11548-025-03358-5","DOIUrl":"https://doi.org/10.1007/s11548-025-03358-5","url":null,"abstract":"<p><strong>Purpose: </strong>Accurate preoperative segmentation of key anatomical regions on the liver surface is essential for enabling intraoperative navigation and position monitoring. However, current automatic segmentation methods face challenges due to the liver's drastic shape variations and limited data availability. This study aims to develop a two-stream mesh convolutional network (TSMCN) that integrates both global geometric and local topological information to achieve accurate, automatic segmentation of key anatomical regions.</p><p><strong>Methods: </strong>We propose TSMCN, which consists of two parallel streams: the E-stream focuses on extracting topological information from liver mesh edges, while the P-stream captures spatial relationships from coordinate points. These single-perspective features are adaptively fused through a fine-grained aggregation (FGA)-based attention mechanism, generating a robust pooled mesh that preserves task-relevant edges and topological structures. This fusion enhances the model's understanding of the liver mesh and facilitates discriminative feature extraction on the newly pooled mesh.</p><p><strong>Results: </strong>TSMCN was evaluated on 200 manually annotated 3D liver mesh datasets. It outperformed point-based (PointNet++) and edge feature-based (MeshCNN) methods, achieving superior segmentation results on the liver ridge and falciform ligament. The model significantly reduced the 3D Chamfer distance compared to other methods, with particularly strong performance in falciform ligament segmentation.</p><p><strong>Conclusion: </strong>TSMCN provides an effective approach to liver surface segmentation by integrating complementary geometric features. Its superior performance highlights the potential to enhance AR-guided liver surgery through automatic and precise preoperative segmentation of critical anatomical regions.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":" ","pages":""},"PeriodicalIF":2.3,"publicationDate":"2025-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144267866","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
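The 3D Chamfer distance used as the evaluation metric above is the symmetric mean nearest-neighbour distance between a predicted and a reference point set; a brute-force sketch with toy point sets:

```python
import numpy as np

def chamfer_distance(A, B):
    """Symmetric Chamfer distance between point sets A (N,3) and B (M,3):
    mean nearest-neighbour distance from A to B plus from B to A."""
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)  # (N, M)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

# Toy sets: A's points all coincide with points of B; B has one extra
# point at distance 1 from its nearest neighbour in A.
A = np.array([[0.0, 0, 0], [1, 0, 0]])
B = np.array([[0.0, 0, 0], [1, 0, 0], [1, 1, 0]])
cd = chamfer_distance(A, B)
```

The pairwise-distance matrix makes this O(N·M) in memory; for dense meshes a k-d tree nearest-neighbour query is the usual replacement.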