{"title":"Improving lung nodule segmentation in thoracic CT scans through the ensemble of 3D U-Net models.","authors":"Himanshu Rikhari, Esha Baidya Kayal, Shuvadeep Ganguly, Archana Sasi, Swetambri Sharma, Ajith Antony, Krithika Rangarajan, Sameer Bakhshi, Devasenathipathy Kandasamy, Amit Mehndiratta","doi":"10.1007/s11548-024-03222-y","DOIUrl":"10.1007/s11548-024-03222-y","url":null,"abstract":"<p><strong>Purpose: </strong>The current study explores the application of 3D U-Net architectures combined with Inception and ResNet modules for precise lung nodule detection through a deep learning-based segmentation technique. This investigation is motivated by the objective of developing a Computer-Aided Diagnosis (CAD) system for effective diagnosis and prognostication of lung nodules in clinical settings.</p><p><strong>Methods: </strong>The proposed method trained four different 3D U-Net models on the retrospective dataset obtained from AIIMS Delhi. To augment the training dataset, affine transformations and intensity transforms were utilized. Preprocessing steps included CT scan voxel resampling, intensity normalization, and lung parenchyma segmentation. Model optimization utilized a hybrid loss function that combined Dice Loss and Focal Loss. The model performance of all four 3D U-Nets was evaluated patient-wise using the Dice coefficient and Jaccard coefficient, then averaged to obtain the average volumetric Dice coefficient (DSC<sub>avg</sub>) and average Jaccard coefficient (IoU<sub>avg</sub>) on a test dataset comprising 53 CT scans. Additionally, an ensemble approach (Model-V) was utilized featuring 3D U-Net (Model-I), ResNet (Model-II), and Inception (Model-III) 3D U-Net architectures, combined with two distinct patch sizes for further investigation.</p><p><strong>Results: </strong>The ensemble of models obtained the highest DSC<sub>avg</sub> of 0.84 ± 0.05 and IoU<sub>avg</sub> of 0.74 ± 0.06 on the test dataset, compared against individual models. 
It mitigated false positives, overestimations, and underestimations observed in individual U-Net models. Moreover, the ensemble of models reduced average false positives per scan in the test dataset (1.57 nodules/scan) compared to individual models (2.69-3.39 nodules/scan).</p><p><strong>Conclusions: </strong>The suggested ensemble approach presents a strong and effective strategy for automatically detecting and delineating lung nodules, potentially aiding CAD systems in clinical settings. This approach could assist radiologists in laborious and meticulous lung nodule detection tasks in CT scans, improving lung cancer diagnosis and treatment planning.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":null,"pages":null},"PeriodicalIF":2.3,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141753311","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extra-abdominal trocar and instrument detection for enhanced surgical workflow understanding.","authors":"Franziska Jurosch, Lars Wagner, Alissa Jell, Esra Islertas, Dirk Wilhelm, Maximilian Berlet","doi":"10.1007/s11548-024-03220-0","DOIUrl":"10.1007/s11548-024-03220-0","url":null,"abstract":"<p><strong>Purpose: </strong>Video-based intra-abdominal instrument tracking for laparoscopic surgeries is a common research area. However, the tracking can only be done with instruments that are actually visible in the laparoscopic image. By using extra-abdominal cameras to detect trocars and classify their occupancy state, additional information about the instrument location, whether an instrument is still in the abdomen or not, can be obtained. This can enhance laparoscopic workflow understanding and enrich already existing intra-abdominal solutions.</p><p><strong>Methods: </strong>A data set of four laparoscopic surgeries recorded with two time-synchronized extra-abdominal 2D cameras was generated. The preprocessed and annotated data were used to train a deep learning-based network architecture consisting of a trocar detection, a centroid tracker and a temporal model to provide the occupancy state of all trocars during the surgery.</p><p><strong>Results: </strong>The trocar detection model achieves an F1 score of <math><mrow><mn>95.06</mn> <mo>±</mo> <mn>0.88</mn> <mo>%</mo></mrow> </math> . The prediction of the occupancy state yields an F1 score of <math><mrow><mn>89.29</mn> <mo>±</mo> <mn>5.29</mn> <mo>%</mo></mrow> </math> , providing a first step towards enhanced surgical workflow understanding.</p><p><strong>Conclusion: </strong>The current method shows promising results for the extra-abdominal tracking of trocars and their occupancy state. 
Future advancements include the enlargement of the data set and incorporation of intra-abdominal imaging to facilitate accurate assignment of instruments to trocars.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":null,"pages":null},"PeriodicalIF":2.3,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11442558/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141617575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A microdiscectomy surgical video annotation framework for supervised machine learning applications.","authors":"Kochai Jan Jawed, Ian Buchanan, Kevin Cleary, Elizabeth Fischer, Aaron Mun, Nishanth Gowda, Arhum Naeem, Recai Yilmaz, Daniel A Donoho","doi":"10.1007/s11548-024-03203-1","DOIUrl":"10.1007/s11548-024-03203-1","url":null,"abstract":"<p><strong>Purpose: </strong>Lumbar discectomy is among the most common spine procedures in the US, with 300,000 procedures performed each year. Like other surgical procedures, it is not free of potential complications. This paper presents a video annotation methodology for microdiscectomy including the development of a surgical workflow. In future work, this methodology could be combined with computer vision and machine learning models to predict potential adverse events. These systems would monitor the intraoperative activities and possibly anticipate the outcomes.</p><p><strong>Methods: </strong>A necessary step in supervised machine learning methods is video annotation, which involves labeling objects frame-by-frame to make them recognizable for machine learning applications. Microdiscectomy video recordings of spine surgeries were collected from a multi-center research collaborative. These videos were anonymized and stored in a cloud-based platform. Videos were uploaded to an online annotation platform. An annotation framework was developed based on literature review and surgical observations to ensure proper understanding of the instruments, anatomy, and steps.</p><p><strong>Results: </strong>An annotated video of microdiscectomy was produced by a single surgeon. Multiple iterations allowed for the creation of an annotated video complete with labeled surgical tools, anatomy, and phases. 
In addition, a workflow was developed for the training of novice annotators, which provides information about the annotation software to assist in the production of standardized annotations.</p><p><strong>Conclusions: </strong>A standardized workflow for managing surgical video data is essential for surgical video annotation and machine learning applications. We developed a standard workflow for annotating surgical videos for microdiscectomy that may facilitate the quantitative analysis of videos using supervised machine learning applications. Future work will demonstrate the clinical relevance and impact of this workflow by developing process modeling and outcome predictors.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":null,"pages":null},"PeriodicalIF":2.3,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141724987","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"3D mobile regression vision transformer for collateral imaging in acute ischemic stroke.","authors":"Sumin Jung, Hyun Yang, Hyun Jeong Kim, Hong Gee Roh, Jin Tae Kwak","doi":"10.1007/s11548-024-03229-5","DOIUrl":"10.1007/s11548-024-03229-5","url":null,"abstract":"<p><strong>Purpose: </strong>The accurate and timely assessment of the collateral perfusion status is crucial in the diagnosis and treatment of patients with acute ischemic stroke. Previous works have shown that collateral imaging, derived from CT angiography, MR perfusion, and MR angiography, aids in evaluating the collateral status. However, such methods are time-consuming and/or sub-optimal due to the nature of manual processing and heuristics. Recently, deep learning approaches have shown promise for generating collateral imaging. These, however, suffer from high computational complexity and cost.</p><p><strong>Methods: </strong>In this study, we propose a mobile, lightweight deep regression neural network for collateral imaging in acute ischemic stroke, leveraging dynamic susceptibility contrast MR perfusion (DSC-MRP). Built based upon lightweight convolution and Transformer architectures, the proposed model manages the balance between the model complexity and performance.</p><p><strong>Results: </strong>We evaluated the performance of the proposed model in generating the five-phase collateral maps, including arterial, capillary, early venous, late venous, and delayed phases, using DSC-MRP from 952 patients. 
In comparison with various deep learning models, the proposed method was superior to the competitors with similar complexity and was comparable to the competitors of high complexity.</p><p><strong>Conclusion: </strong>The results suggest that the proposed model is able to facilitate rapid and precise assessment of the collateral status of patients with acute ischemic stroke, leading to improved patient care and outcome.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":null,"pages":null},"PeriodicalIF":2.3,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11442547/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141604435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detection of pulmonary nodules in chest radiographs: novel cost function for effective network training with purely synthesized datasets.","authors":"Shouhei Hanaoka, Yukihiro Nomura, Takeharu Yoshikawa, Takahiro Nakao, Tomomi Takenaga, Hirotaka Matsuzaki, Nobutake Yamamichi, Osamu Abe","doi":"10.1007/s11548-024-03227-7","DOIUrl":"10.1007/s11548-024-03227-7","url":null,"abstract":"<p><strong>Purpose: </strong>Many large radiographic datasets of lung nodules are available, but the small and hard-to-detect nodules are rarely validated by computed tomography. Such difficult nodules are crucial for training nodule detection methods. This lack of difficult nodules for training can be addressed by artificial nodule synthesis algorithms, which can create artificially embedded nodules. This study aimed to develop and evaluate a novel cost function for training networks to detect such lesions. Embedding artificial lesions in healthy medical images is effective when positive cases are insufficient for network training. Although this approach provides both positive (lesion-embedded) images and the corresponding negative (lesion-free) images, no known methods effectively use these pairs for training. This paper presents a novel cost function for segmentation-based detection networks when positive-negative pairs are available.</p><p><strong>Methods: </strong>Based on the classic U-Net, new terms were added to the original Dice loss for reducing false positives and the contrastive learning of diseased regions in the image pairs. The experimental network was trained on 131,072 fully synthesized pairs of images simulating lung cancer and evaluated on real chest X-ray images from the Japanese Society of Radiological Technology dataset.</p><p><strong>Results: </strong>The proposed method outperformed RetinaNet and a single-shot multibox detector. 
At 0.2 false positives per image, the sensitivity was 0.688 with fine-tuning and 0.507 without, under the leave-one-case-out setting.</p><p><strong>Conclusion: </strong>To our knowledge, this is the first study in which a method for detecting pulmonary nodules in chest X-ray images was evaluated on a real clinical dataset after being trained on fully synthesized images. The synthesized dataset is available at https://zenodo.org/records/10648433 .</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":null,"pages":null},"PeriodicalIF":2.3,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11442563/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141604461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Real-time prediction of postoperative spinal shape with machine learning models trained on finite element biomechanical simulations.","authors":"Renzo Phellan Aro, Bahe Hachem, Julien Clin, Jean-Marc Mac-Thiong, Luc Duong","doi":"10.1007/s11548-024-03237-5","DOIUrl":"10.1007/s11548-024-03237-5","url":null,"abstract":"<p><strong>Purpose: </strong>Adolescent idiopathic scoliosis is a chronic disease that may require correction surgery. The finite element method (FEM) is a popular option to plan the outcome of surgery on a patient-based model. However, it requires considerable computing power and time, which may discourage its use. Machine learning (ML) models can be a helpful surrogate to the FEM, providing accurate real-time responses. This work implements ML algorithms to estimate post-operative spinal shapes.</p><p><strong>Methods: </strong>The algorithms are trained using features from 6400 simulations generated using the FEM from spine geometries of 64 patients. The features are selected using an autoencoder and principal component analysis. The accuracy of the results is evaluated by calculating the root-mean-squared error and the angle between the reference and predicted position of each vertebra. The processing times are also reported.</p><p><strong>Results: </strong>A combination of principal component analysis for dimensionality reduction, followed by the linear regression model, generated accurate results in real-time, with an average position error of 3.75 mm and orientation angle error below 2.74 degrees in all main 3D axes, within 3 ms. The prediction time is considerably faster than simulations based on the FEM alone, which require seconds to minutes.</p><p><strong>Conclusion: </strong>It is possible to predict post-operative spinal shapes of patients with AIS in real-time by using ML algorithms as a surrogate to the FEM. 
Clinicians can compare the response of the initial spine shape of a patient with AIS to various target shapes, which can be modified interactively. These benefits can encourage clinicians to use software tools for surgical planning of scoliosis.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":null,"pages":null},"PeriodicalIF":2.3,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141753312","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel contact optimization algorithm for endomicroscopic surface scanning.","authors":"Xingfeng Xu, Shengzhe Zhao, Lun Gong, Siyang Zuo","doi":"10.1007/s11548-024-03223-x","DOIUrl":"10.1007/s11548-024-03223-x","url":null,"abstract":"<p><strong>Purpose: </strong>Probe-based confocal laser endomicroscopy (pCLE) offers real-time, cell-level imaging and holds promise for early cancer diagnosis. However, a large area surface scanning for image acquisition is needed to overcome the limitation of field-of-view. Obtaining high-quality images during scanning requires maintaining a stable contact distance between the tissue and probe. This work presents a novel contact optimization algorithm to acquire high-quality pCLE images.</p><p><strong>Methods: </strong>The contact optimization algorithm, based on swarm intelligence of whale optimization algorithm, is designed to optimize the probe position, according to the quality of the image acquired by probe. An accurate image quality assessment of total co-occurrence entropy is introduced to evaluate the pCLE image quality. The algorithm aims to maintain a consistent probe-tissue contact, resulting in high-quality images acquisition.</p><p><strong>Results: </strong>Scanning experiments on sponge, ex vivo swine skin tissue and stomach tissue demonstrate the effectiveness of the contact optimization algorithm. Scanning results of the sponge with three different trajectories (spiral trajectory, circle trajectory, and raster trajectory) reveal high-quality mosaics with clear details in every part of the image and no blurred sections.</p><p><strong>Conclusion: </strong>The contact optimization algorithm successfully identifies the optimal distance between probe and tissue, improving the quality of pCLE images. 
Experimental results confirm the high potential of this method in endomicroscopic surface scanning.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":null,"pages":null},"PeriodicalIF":2.3,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141545439","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Domain adaptation using AdaBN and AdaIN for high-resolution IVD mesh reconstruction from clinical MRI.","authors":"Sai Natarajan, Ludovic Humbert, Miguel A González Ballester","doi":"10.1007/s11548-024-03233-9","DOIUrl":"10.1007/s11548-024-03233-9","url":null,"abstract":"<p><strong>Purpose: </strong>Deep learning has firmly established its dominance in medical imaging applications. However, careful consideration must be exercised when transitioning a trained source model to adapt to an entirely distinct environment that deviates significantly from the training set. The majority of the efforts to mitigate this issue have predominantly focused on classification and segmentation tasks. In this work, we perform a domain adaptation of a trained source model to reconstruct high-resolution intervertebral disc meshes from low-resolution MRI.</p><p><strong>Methods: </strong>To address the outlined challenges, we use MRI2Mesh as the shape reconstruction network. It incorporates three major modules: image encoder, mesh deformation, and cross-level feature fusion. This feature fusion module is used to encapsulate local and global disc features. We evaluate two major domain adaptation techniques: adaptive batch normalization (AdaBN) and adaptive instance normalization (AdaIN) for the task of shape reconstruction.</p><p><strong>Results: </strong>Experiments conducted on distinct datasets, including data from different populations, machines, and test sites demonstrate the effectiveness of MRI2Mesh for domain adaptation. MRI2Mesh achieved up to a 14% decrease in Hausdorff distance (HD) and a 19% decrease in the point-to-surface (P2S) metric for both AdaBN and AdaIN experiments, indicating improved performance.</p><p><strong>Conclusion: </strong>MRI2Mesh has demonstrated consistent superiority to the state-of-the-art Voxel2Mesh network across a diverse range of datasets, populations, and scanning protocols, highlighting its versatility. 
Additionally, AdaBN has emerged as a robust method compared to the AdaIN technique. Further experiments show that MRI2Mesh, when combined with AdaBN, holds immense promise for enhancing the precision of anatomical shape reconstruction in domain adaptation.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":null,"pages":null},"PeriodicalIF":2.3,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141604462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PolypNextLSTM: a lightweight and fast polyp video segmentation network using ConvNext and ConvLSTM.","authors":"Debayan Bhattacharya, Konrad Reuter, Finn Behrendt, Lennart Maack, Sarah Grube, Alexander Schlaefer","doi":"10.1007/s11548-024-03244-6","DOIUrl":"10.1007/s11548-024-03244-6","url":null,"abstract":"<p><strong>Purpose: </strong>Commonly employed in polyp segmentation, single-image UNet architectures lack the temporal insight clinicians gain from video data in diagnosing polyps. To mirror clinical practices more faithfully, our proposed solution, PolypNextLSTM, leverages video-based deep learning, harnessing temporal information for superior segmentation performance with the least parameter overhead, making it potentially suitable for edge devices.</p><p><strong>Methods: </strong>PolypNextLSTM employs a UNet-like structure with ConvNext-Tiny as its backbone, strategically omitting the last two layers to reduce parameter overhead. Our temporal fusion module, a Convolutional Long Short Term Memory (ConvLSTM), effectively exploits temporal features. Our primary novelty lies in PolypNextLSTM, which stands out as the leanest in parameters and the fastest model, surpassing the performance of five state-of-the-art image and video-based deep learning models. The evaluation on the SUN-SEG dataset spans easy-to-detect and hard-to-detect polyp scenarios, along with videos containing challenging artefacts like fast motion and occlusion.</p><p><strong>Results: </strong>Comparison against 5 image-based and 5 video-based models demonstrates PolypNextLSTM's superiority, achieving a Dice score of 0.7898 on the hard-to-detect polyp test set, surpassing image-based PraNet (0.7519) and video-based PNS+ (0.7486). 
Notably, our model excels in videos featuring complex artefacts such as ghosting and occlusion.</p><p><strong>Conclusion: </strong>PolypNextLSTM, integrating pruned ConvNext-Tiny with ConvLSTM for temporal fusion, not only exhibits superior segmentation performance but also maintains the highest frames per second among evaluated models. Code can be found here: https://github.com/mtec-tuhh/PolypNextLSTM .</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":null,"pages":null},"PeriodicalIF":2.3,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11442634/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141903519","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards multimodal graph neural networks for surgical instrument anticipation.","authors":"Lars Wagner, Dennis N Schneider, Leon Mayer, Alissa Jell, Carolin Müller, Alexander Lenz, Alois Knoll, Dirk Wilhelm","doi":"10.1007/s11548-024-03226-8","DOIUrl":"10.1007/s11548-024-03226-8","url":null,"abstract":"<p><strong>Purpose: </strong>Decision support systems and context-aware assistance in the operating room have emerged as the key clinical applications supporting surgeons in their daily work and are generally based on single modalities. The model- and knowledge-based integration of multimodal data as a basis for decision support systems that can dynamically adapt to the surgical workflow has not yet been established. Therefore, we propose a knowledge-enhanced method for fusing multimodal data for anticipation tasks.</p><p><strong>Methods: </strong>We developed a holistic, multimodal graph-based approach combining imaging and non-imaging information in a knowledge graph representing the intraoperative scene of a surgery. Node and edge features of the knowledge graph are extracted from suitable data sources in the operating room using machine learning. A spatiotemporal graph neural network architecture subsequently allows for interpretation of relational and temporal patterns within the knowledge graph. We apply our approach to the downstream task of instrument anticipation while presenting a suitable modeling and evaluation strategy for this task.</p><p><strong>Results: </strong>Our approach achieves an F1 score of 66.86% in terms of instrument anticipation, allowing for a seamless surgical workflow and adding a valuable impact for surgical decision support systems. A resting recall of 63.33% indicates the non-prematurity of the anticipations.</p><p><strong>Conclusion: </strong>This work shows how multimodal data can be combined with the topological properties of an operating room in a graph-based approach. 
Our multimodal graph architecture serves as a basis for context-sensitive decision support systems in laparoscopic surgery considering a comprehensive intraoperative operating scene.</p>","PeriodicalId":51251,"journal":{"name":"International Journal of Computer Assisted Radiology and Surgery","volume":null,"pages":null},"PeriodicalIF":2.3,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11442600/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141565062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}