{"title":"Introduction. Artificial intelligence in neurosurgery: transforming a data-intensive specialty.","authors":"Benjamin S Hopkins, Garnette R Sutherland, Samuel R Browd, Daniel A Donoho, Eric K Oermann, Clemens M Schirmer, Brenton Pennicooke, Wael F Asaad","doi":"10.3171/2025.4.FOCUS24674","DOIUrl":"https://doi.org/10.3171/2025.4.FOCUS24674","url":null,"abstract":"","PeriodicalId":19187,"journal":{"name":"Neurosurgical focus","volume":"59 1","pages":"E1"},"PeriodicalIF":3.3,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144541564","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Phenotype-driven risk stratification of cerebral aneurysms using Shapley Additive Explanations-based supervised clustering: a novel approach to rupture prediction.","authors":"Shrinit Babel, Syed R H Peeran","doi":"10.3171/2025.4.FOCUS241024","DOIUrl":"https://doi.org/10.3171/2025.4.FOCUS241024","url":null,"abstract":"<p><strong>Objective: </strong>The aim of this study was to address the limitations of traditional aneurysm risk scoring systems and computational fluid dynamics (CFD) analyses by applying a supervised clustering framework to identify distinct aneurysm phenotypes and improve rupture risk prediction.</p><p><strong>Methods: </strong>Geometric and morphological data for 103 cerebral aneurysms were obtained from the AneuriskWeb dataset. To segment the cerebral aneurysm data into information-dense clusters that relate to aneurysm rupture risk, the authors trained an Extreme Gradient Boosting model for Shapley Additive Explanations (SHAP)-based feature attribution followed by nonlinear dimensionality reduction. Hierarchical Density-based Spatial Clustering of Applications with Noise (HDBSCAN) was then used on the SHAP-transformed feature space to identify clusters that were, subsequently, interpreted directly using rule-based machine learning and indirectly with phenotype visualization.</p><p><strong>Results: </strong>The initial SHAP analysis identified the parent vessel diameter, neck vessel angle, and the cross-sectional area along the centerline of the sac as the most significant predictors of rupture risk. Clustering revealed three distinct aneurysm phenotypes with a high degree of separation (Silhouette score = 0.915). Cluster α, characterized by parent vessel diameters > 3.08 mm and elongated geometries, was a low-risk phenotype with a 4.16% rupture rate. Cluster β only included ruptured aneurysms, with vessel diameters ≤ 1.65 mm and nonspherical structures. 
Cluster γ represented a mixed-risk aneurysm phenotype (rupture rate of 45.45%), with intermediate vessel diameters (range 1.65-3.08 mm); acute neck angles (< 90°) increased the rupture rate within this cluster.</p><p><strong>Conclusions: </strong>The supervised clustering identified distinct cerebral aneurysm phenotypes, balancing granularity with interpretability in CFD data analysis. Future studies should build on these phenotype-driven insights with temporal analyses and larger datasets for validation, as well as an end-to-end framework to enhance scalability.</p>","PeriodicalId":19187,"journal":{"name":"Neurosurgical focus","volume":"59 1","pages":"E3"},"PeriodicalIF":3.3,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144541494","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
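The supervise-attribute-cluster pipeline described in this abstract can be sketched end to end. The snippet below is a minimal stand-in, not the authors' code: it uses synthetic data in place of the AneuriskWeb features, scikit-learn's RandomForestClassifier with global feature_importances_ in place of XGBoost with per-sample SHAP values, and KMeans in place of nonlinear dimensionality reduction plus HDBSCAN, keeping only the overall shape (fit a supervised model, weight the feature space by attribution, cluster, score cluster separation with a silhouette score):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Toy stand-in for aneurysm morphology features (n = 103 in the study).
X, y = make_classification(n_samples=103, n_features=8, n_informative=4,
                           random_state=0)

# 1) Supervise: fit a tree ensemble on rupture labels (the study used XGBoost).
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# 2) Attribute: weight each feature by its learned importance
#    (a crude global stand-in for per-sample SHAP values).
X_weighted = X * model.feature_importances_

# 3) Cluster the attribution-weighted space (the study used HDBSCAN after
#    dimensionality reduction; KMeans keeps this sketch light) and
# 4) score the separation of the resulting phenotypes.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_weighted)
score = silhouette_score(X_weighted, labels)
print(len(set(labels)), round(score, 3))
```

The silhouette score computed in the last step is the same separation measure the abstract reports (0.915 for the real phenotypes).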
{"title":"Streamlining microsurgical procedures: a phantom trial of an artificial intelligence-driven robotic microscope assistant.","authors":"Michael Murek, Markus Philipp, Marielena Gutt-Will, Andrea Maria Mathis, David Bervini, Stefan Saur, Franziska Mathis-Ullrich, Andreas Raabe","doi":"10.3171/2025.4.FOCUS25142","DOIUrl":"https://doi.org/10.3171/2025.4.FOCUS25142","url":null,"abstract":"<p><strong>Objective: </strong>Surgical microscopes are essential in microsurgery for magnification, focus, and illumination. However, surgeons must frequently adjust the microscope manually (typically via a handgrip or mouth switch) to maintain a well-centered view that ensures clear visibility of the operative field and surrounding anatomy. These frequent adjustments can disrupt surgical workflow, increase cognitive load, and divert surgeons' focus from their surgical task. To address these challenges, the authors introduced and evaluated a novel robotic assistance system that leverages AI to automatically detect the surgical area of interest by localizing surgical instrument tips and robotically recentering the microscope's field of view.</p><p><strong>Methods: </strong>This preclinical user study with 19 neurosurgeons compared the robotic assistance system with state-of-the-art microscope controls, i.e., a handgrip and mouth switch. Participants engaged in a custom-designed microsurgical scenario involving a phantom-based anastomosis requiring frequent microscope adjustments. Task load related to microscope handling was assessed using the National Aeronautics and Space Administration Task Load Index questionnaire, and efficiency and workflow compatibility were analyzed based on suturing time and interruption frequency.
To evaluate the effectiveness of the robotic assistance system in maintaining a centered view, heat maps that visualize the areas where surgeons operated with their instrument tips were computed.</p><p><strong>Results: </strong>The robotic assistance system significantly reduced microscope-associated task load compared to the handgrip, decreasing physical (r = 0.59, p < 0.001) and temporal (r = 0.49, p = 0.022) demand while enhancing microscope handling performance (r = 0.40, p = 0.003). In comparison to the mouth switch, reductions in physical (r = 0.45, p = 0.002) and mental (r = 0.32, p = 0.031) demand were observed, alongside performance improvements (r = 0.41, p = 0.008). Furthermore, robotic assistance increased effective suturing time by approximately 10% (r = 0.90, p < 0.001), reduced interruptions (r = 0.52, p = 0.035), and enabled faster reaction times when readjusting the microscope (r = 0.68, p = 0.005) in contrast to the handgrip. According to the heat map analysis, the robotic assistance system consistently promoted a more centered microscope view compared with manual controls.</p><p><strong>Conclusions: </strong>The novel robotic assistance system enhances microsurgical efficiency by AI-assisted microscope adjustments, thereby reducing task load and streamlining workflow. Compared to manual microscope control, automating microscope adjustments minimizes distractions and task switching, allowing surgeons to maintain a consistently centered view of the operative field. 
Future studies should focus on clinical validation in live surgeries.</p>","PeriodicalId":19187,"journal":{"name":"Neurosurgical focus","volume":"59 1","pages":"E2"},"PeriodicalIF":3.3,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144541495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
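The heat-map analysis of instrument-tip positions described above amounts to a 2D occupancy histogram over the field of view. The sketch below is purely illustrative: the tip coordinates are simulated, and the 10 × 10 grid and central "centeredness" score are assumptions, not the study's protocol:

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulated instrument-tip positions in the microscope's field of view,
# normalized to [0, 1] x [0, 1]; assisted runs should cluster near the center.
tips = rng.normal(loc=0.5, scale=0.1, size=(500, 2)).clip(0, 1)

# 2D occupancy histogram = the "heat map" of where the surgeon worked.
heat, xedges, yedges = np.histogram2d(tips[:, 0], tips[:, 1],
                                      bins=10, range=[[0, 1], [0, 1]])

# Fraction of activity inside the central 2x2 block of bins, a crude
# stand-in for how well centered the view was kept.
center = heat[4:6, 4:6].sum() / heat.sum()
print(round(center, 2))
```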
{"title":"Generation of synthetic CT-like imaging of the spine from biplanar radiographs: comparison of different deep learning architectures.","authors":"Massimo Bottini, Olivier Zanier, Raffaele Da Mutten, Maria L Gandia-Gonzalez, Erik Edström, Adrian Elmi-Terander, Luca Regli, Carlo Serra, Victor E Staartjes","doi":"10.3171/2025.4.FOCUS25170","DOIUrl":"https://doi.org/10.3171/2025.4.FOCUS25170","url":null,"abstract":"<p><strong>Objective: </strong>This study compared two deep learning architectures, generative adversarial networks (GANs) and convolutional neural networks combined with implicit neural representations (CNN-INRs), for generating synthetic CT (sCT) images of the spine from biplanar radiographs. The aim of the study was to identify the most robust and clinically viable approach for this potential intraoperative imaging technique.</p><p><strong>Methods: </strong>A spine CT dataset of 216 training and 54 validation cases was used. Digitally reconstructed radiographs (DRRs) served as 2D inputs for training both models under identical conditions for 170 epochs. Evaluation metrics included the Structural Similarity Index Measure (SSIM), peak signal-to-noise ratio (PSNR), and cosine similarity (CS), complemented by qualitative assessments of anatomical fidelity.</p><p><strong>Results: </strong>The GAN model achieved a mean SSIM of 0.932 ± 0.015, PSNR of 19.85 ± 1.40 dB, and CS of 0.671 ± 0.177. The CNN-INR model demonstrated a mean SSIM of 0.921 ± 0.015, PSNR of 21.96 ± 1.20 dB, and CS of 0.707 ± 0.114. Statistical analysis revealed significant differences for SSIM (p = 0.001) and PSNR (p < 0.001), while CS differences were not statistically significant (p = 0.667).
Qualitative evaluations consistently favored the GAN model, which produced more anatomically detailed and visually realistic sCT images.</p><p><strong>Conclusions: </strong>This study demonstrated the feasibility of generating spine sCT images from biplanar radiographs using GAN and CNN-INR models. While neither model achieved clinical-grade outputs, the GAN architecture showed greater potential for generating anatomically accurate and visually realistic images. These findings highlight the promise of sCT image generation from biplanar radiographs as an innovative approach to reducing radiation exposure and improving imaging accessibility, with GANs emerging as the more promising avenue for further research and clinical integration.</p>","PeriodicalId":19187,"journal":{"name":"Neurosurgical focus","volume":"59 1","pages":"E13"},"PeriodicalIF":3.3,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144541562","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
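Of the three metrics reported above, PSNR has the simplest closed form: 10·log10(data_range² / MSE). A minimal NumPy version follows; the test images are synthetic, and data_range=1.0 assumes intensities normalized to [0, 1]:

```python
import numpy as np

def psnr(ref, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return np.inf  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)

rng = np.random.default_rng(0)
ref = rng.random((64, 64))                                   # toy "CT" slice
noisy = np.clip(ref + rng.normal(0, 0.05, ref.shape), 0, 1)  # toy "sCT"
print(round(psnr(ref, noisy), 1))
```

Higher is better; the ~20 dB values reported for both models indicate substantial residual error relative to ground-truth CT.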
{"title":"Open-source AI-assisted rapid 3D color multimodal image fusion and preoperative augmented reality planning of extracerebral tumors.","authors":"Xiaolin Hou, Xiaoling Liao, Ruxiang Xu, Fan Fei, Bo Wu","doi":"10.3171/2025.4.FOCUS24557","DOIUrl":"https://doi.org/10.3171/2025.4.FOCUS24557","url":null,"abstract":"<p><strong>Objective: </strong>This study aimed to develop an advanced method for preoperative planning and surgical guidance using open-source artificial intelligence (AI)-assisted rapid 3D color multimodal image fusion (MIF) and augmented reality (AR) in extracerebral tumor surgical procedures.</p><p><strong>Methods: </strong>In this prospective trial of 130 patients with extracerebral tumors, the authors implemented a novel workflow combining FastSurfer (AI-based brain parcellation), Raidionics-Slicer (deep learning tumor segmentation), and Sina AR projection. Comparative analysis between AI-assisted 3D-color MIF (group A) and manual-3D-monochrome MIF (group B) was conducted, evaluating surgical parameters (operative time, blood loss, resection completeness), clinical outcomes (complications, hospital stay, modified Rankin Scale [mRS] scores), and technical performance metrics (processing time, Dice similarity coefficient [DSC], 95% Hausdorff distance [HD]).</p><p><strong>Results: </strong>The AI-3D-color MIF system achieved superior technical performance with brain segmentation in 1.21 ± 0.13 minutes (vs 4.51 ± 0.15 minutes for manual segmentation), demonstrating exceptional accuracy (DSC 0.978 ± 0.012 vs 0.932 ± 0.029; 95% HD 1.51 ± 0.23 mm vs 3.52 ± 0.35 mm). 
Clinically, group A demonstrated significant advantages with shorter operative duration, reduced intraoperative blood loss, higher rate of gross-total resection, lower complication incidence, and better postoperative mRS scores (all p < 0.05).</p><p><strong>Conclusions: </strong>The integration of open-source AI tools (FastSurfer/Raidionics) with AR visualization creates an efficient 3D-color MIF workflow that enhances anatomical understanding through color-coded functional mapping and vascular relationship visualization. This system significantly improves surgical precision while reducing perioperative risks, representing a cost-effective solution for advanced neurosurgical planning in resource-constrained settings.</p>","PeriodicalId":19187,"journal":{"name":"Neurosurgical focus","volume":"59 1","pages":"E12"},"PeriodicalIF":3.3,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144541493","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
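The 95% Hausdorff distance (HD) used above to score segmentation accuracy can be computed brute-force from two contours' point sets: take nearest-neighbor distances in both directions and report the 95th percentile. This NumPy sketch uses toy circular contours, not the study's data:

```python
import numpy as np

def hd95(a_pts, b_pts):
    """95th-percentile symmetric Hausdorff distance between two point sets
    of shape (n, d). Brute-force O(n^2) pairwise distances."""
    d = np.sqrt(((a_pts[:, None, :] - b_pts[None, :, :]) ** 2).sum(-1))
    return max(np.percentile(d.min(axis=1), 95),   # a -> b
               np.percentile(d.min(axis=0), 95))   # b -> a

# Two nearly identical contours: a unit circle and the same circle
# shifted 0.1 units along x.
theta = np.linspace(0, 2 * np.pi, 200, endpoint=False)
a = np.c_[np.cos(theta), np.sin(theta)]
b = a + np.array([0.1, 0.0])
print(round(hd95(a, b), 3))
```

Using the 95th percentile instead of the maximum makes the metric robust to a few outlier boundary points, which is why it is preferred over the plain Hausdorff distance in segmentation papers like this one.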
{"title":"Zero-shot segmentation of spinal vertebrae with metastatic lesions: an analysis of Meta's Segment Anything Model 2 and factors affecting learning-free segmentation.","authors":"Rushmin Khazanchi, Sachin Govind, Rishi Jain, Rebecca Du, Nader S Dahdaleh, Christopher S Ahuja, Najib El Tecle","doi":"10.3171/2025.4.FOCUS25234","DOIUrl":"https://doi.org/10.3171/2025.4.FOCUS25234","url":null,"abstract":"<p><strong>Objective: </strong>Accurate vertebral segmentation is an important step in imaging analysis pipelines for diagnosis and subsequent treatment of spinal metastases. Segmenting these metastases is especially challenging given their radiological heterogeneity. Conventional approaches for segmenting vertebrae have included manual review or deep learning; however, manual review is time-intensive with interrater reliability issues, while deep learning requires large datasets to build. The rise of generative AI, notably tools such as Meta's Segment Anything Model 2 (SAM 2), holds promise in its ability to rapidly generate segmentations of any image without pretraining (zero-shot). The authors of this study aimed to assess the ability of SAM 2 to segment vertebrae with metastases.</p><p><strong>Methods: </strong>A publicly available set of spinal CT scans from The Cancer Imaging Archive was used, which included patient sex, BMI, vertebral locations, types of metastatic lesion (lytic, blastic, or mixed), and primary cancer type. Ground-truth segmentations for each vertebra, derived by neuroradiologists, were further extracted from the dataset. SAM 2 then produced segmentations for each vertebral slice without any training data, all of which were compared to gold standard segmentations using the Dice similarity coefficient (DSC).
Relative performance differences were assessed across clinical subgroups using standard statistical techniques.</p><p><strong>Results: </strong>Imaging data were extracted for 55 patients and 779 unique thoracolumbar vertebrae, 167 of which had metastatic tumor involvement. Across these vertebrae, SAM 2 had a mean volumetric DSC of 0.833 ± 0.053. SAM 2 performed significantly worse on thoracic vertebrae relative to lumbar vertebrae, female patients relative to male patients, and obese patients relative to non-obese patients.</p><p><strong>Conclusions: </strong>These results demonstrate that general-purpose segmentation models like SAM 2 can provide reasonable vertebral segmentation accuracy with no pretraining, with efficacy comparable to previously published trained models. Future research should include optimizations of spine segmentation models for vertebral location and patient body habitus, as well as for variations in imaging quality.</p>","PeriodicalId":19187,"journal":{"name":"Neurosurgical focus","volume":"59 1","pages":"E18"},"PeriodicalIF":3.3,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144541498","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
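The Dice similarity coefficient used above to compare SAM 2 output against neuroradiologist ground truth is straightforward to implement. A small NumPy version with toy binary masks (the 8 × 8 squares are illustrative, not CT data):

```python
import numpy as np

def dice(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks:
    2 * |A ∩ B| / (|A| + |B|)."""
    a, b = mask_a.astype(bool), mask_b.astype(bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

a = np.zeros((8, 8), dtype=bool); a[2:6, 2:6] = True  # 16 "voxels"
b = np.zeros((8, 8), dtype=bool); b[3:7, 3:7] = True  # 16 "voxels", shifted
print(dice(a, b))  # overlap is 3x3 = 9 -> 2*9/32 = 0.5625
```

On this scale the study's mean volumetric DSC of 0.833 corresponds to substantial but imperfect overlap with the ground truth.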
{"title":"Defining cervical spondylotic myelopathy surgical endotypes using comorbidity clustering: a Quality Outcomes Database cervical spondylotic myelopathy study.","authors":"Eunice Yang, Harrison Howell, Praveen V Mummaneni, Dean Chou, Mohamad Bydon, Erica F Bisson, Christopher I Shaffrey, Oren N Gottfried, Anthony L Asher, Domagoj Coric, Eric A Potts, Kevin T Foley, Michael Y Wang, Kai-Ming Fu, Michael S Virk, John J Knightly, Scott Meyer, Paul Park, Cheerag D Upadhyaya, Chun-Po Yen, Juan S Uribe, Luis M Tumialán, Jay D Turner, Regis W Haid, Andrew K Chan","doi":"10.3171/2025.4.FOCUS25207","DOIUrl":"https://doi.org/10.3171/2025.4.FOCUS25207","url":null,"abstract":"<p><strong>Objective: </strong>Coexisting medical conditions are increasingly prevalent in surgical populations. The impact of multiple comorbidities on patient-reported outcomes (PROs) and endotypes of frequently co-occurring conditions for cervical spondylotic myelopathy (CSM) remain unclear. This study explores whether CSM patients with multimorbidity have worse baseline and postoperative PROs and less functional improvement after surgery compared to those with few or no comorbidities. The authors also investigated whether distinct comorbidity endotypes exist among CSM surgical patients and whether they influence postoperative outcomes.</p><p><strong>Methods: </strong>The prospective Quality Outcomes Database (QOD) was used to assess patients undergoing surgery for CSM. Multimorbidity was defined as ≥ 2 chronic conditions, including diabetes, coronary artery disease, peripheral vascular disease, arthritis, chronic renal disease, chronic obstructive pulmonary disease, Parkinson's disease, multiple sclerosis, depression, and anxiety. Baseline characteristics and 24-month PROs were assessed across multiple-comorbidity status, including modified Japanese Orthopaedic Association (mJOA), Neck Disability Index (NDI), visual analog scale for neck and arm pain, EQ-5D, and patient satisfaction scores. 
Clusters were identified from the full cohort using k-medoids, revealing subgroups with similar comorbidity endotypes.</p><p><strong>Results: </strong>The final cohort included 1141 CSM patients (83.1% reaching 24-month follow-up), with 761 (66.7%) having 0 or 1 comorbidity and 380 (33.3%) ≥ 2 comorbidities. The multimorbidity cohort was older (mean age 62.6 ± 11.2 vs 59.5 ± 12.0 years, p < 0.001), more likely to be female (52.9% vs 44.7%, p = 0.011), and had a higher BMI (mean 31.1 ± 6.7 vs 29.7 ± 6.2 kg/m2, p < 0.001). Multimorbidity patients exhibited worse mJOA, NDI, and EQ-5D scores at baseline and 24 months (p < 0.05). On multivariable analysis, the total number of comorbidities was not significantly associated with any PRO measures. Four comorbidity clusters were identified: low burden, arthritis, diabetes, and high burden. On one-way ANOVA, the baseline mJOA score was significantly different across clusters (p = 0.003). At 24 months, the mJOA score was significantly lower in the diabetes and high-burden endotypes. Twenty-four-month score change and minimal clinically important difference (MCID) achievement of all PROs remained similar across clusters (p > 0.05).</p><p><strong>Conclusions: </strong>While patients with multimorbidity have worse baseline and postoperative PROs, they achieve similar functional and pain-related improvements following CSM surgery. 
Similarly, the comorbidity endotypes identified in this QOD cohort suggest that certain patterns of coexisting chronic conditions, such as overlapping diabetes and arthritis, are associated with different levels of disability but may not diminish the effectiveness of surgical intervention.</p>","PeriodicalId":19187,"journal":{"name":"Neurosurgical focus","volume":"59 1","pages":"E4"},"PeriodicalIF":3.3,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144541559","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
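The k-medoids clustering used above to derive comorbidity endotypes can be sketched in a few lines. This is a simplified two-cluster version with a deterministic farthest-pair initialization and toy binary comorbidity vectors; it is not the QOD analysis (which clustered ten comorbidity indicators into four endotypes):

```python
import numpy as np

def two_medoids(D, n_iter=50):
    """Bare-bones k-medoids with k = 2 on a precomputed distance matrix,
    seeded with the two mutually farthest points (deterministic toy init)."""
    medoids = np.array(np.unravel_index(D.argmax(), D.shape))
    for _ in range(n_iter):
        labels = D[:, medoids].argmin(axis=1)        # assign to nearest medoid
        new = medoids.copy()
        for c in range(2):                           # swap step: pick the point
            members = np.flatnonzero(labels == c)    # minimizing within-cluster
            if members.size:                         # distance (guard empties)
                new[c] = members[D[np.ix_(members, members)].sum(1).argmin()]
        if np.array_equal(new, medoids):
            break
        medoids = new
    return D[:, medoids].argmin(axis=1), medoids

# Toy cohort: rows are patients, columns are comorbidity indicators
# (e.g., diabetes, arthritis); two planted endotypes.
X = np.array([[1, 1, 0, 0]] * 6 + [[0, 0, 1, 1]] * 6, dtype=float)
D = np.abs(X[:, None, :] - X[None, :, :]).sum(-1)  # Hamming-style distances
labels, medoids = two_medoids(D)
print(labels)  # → [0 0 0 0 0 0 1 1 1 1 1 1]
```

k-medoids is a natural fit for binary comorbidity profiles because, unlike k-means, it needs only a distance matrix and its cluster centers are actual patients, which keeps each endotype interpretable.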
{"title":"A supervised machine learning approach for predicting the need for postsurgical intervention in acromegaly.","authors":"Yuki Shinya, Abdul Karim Ghaith, Sukwoo Hong, Justine S Herndon, Sandhya R Palit, Dana Erickson, Irina Bancos, Miguel Saez-Alegre, Ramin A Morshed, Carlos Pinheiro Neto, Fredric B Meyer, John L D Atkinson, Jamie J Van Gompel","doi":"10.3171/2025.4.FOCUS2597","DOIUrl":"10.3171/2025.4.FOCUS2597","url":null,"abstract":"<p><strong>Objective: </strong>Patients with growth hormone (GH)-secreting pituitary adenomas (PAs) experience various symptoms and comorbidities, which can ultimately lead to increased mortality. This study aimed to develop and validate a machine learning (ML) model for predicting long-term outcomes in patients with GH-secreting PAs following endonasal transsphenoidal surgery (ETS).</p><p><strong>Methods: </strong>The authors conducted a retrospective three-institution cohort study that included patients with GH-secreting PAs treated with ETS between 2013 and 2023. Clinical, radiological, and biochemical data were collected. The main outcome of interest was the intervention-free rate (IFR) after primary ETS. Supervised ML algorithms, including decision trees and random forests, were developed to predict the IFR. Model performance was evaluated using area under the receiver operating characteristic curve (AUROC) and Shapley Additive Explanations (SHAP) values.</p><p><strong>Results: </strong>The median follow-up for 100 patients with GH-secreting PAs (53% female) was 64 months (range 1-130 months). Additional intervention for persistent or recurrent acromegaly was required in 32% of patients. Following primary ETS alone, the 3-year IFR was 70% and the 5-year IFR was 67%. Multiple ML models were developed and evaluated using AUROCs. The decision tree analysis achieved an accuracy of 81% and emphasized the importance of both gross-total resection (GTR) and patient age in determining the long-term IFR. 
To better understand the factors that contributed to model performance, SHAP analysis was applied to the best-performing model. The SHAP dependence plots showed that key factors associated with a longer IFR included tumor size < 9 mm, GTR, patient age > 65 years, and Knosp grade 0.</p><p><strong>Conclusions: </strong>This ML model offers a more nuanced and potentially more accurate approach to identify patients more likely to develop recurrent or persistent acromegaly following primary ETS and require additional treatment. Following external validation, this ML model could improve personalized treatment planning and follow-up strategies and enhance patient care and resource allocation in clinical practice.</p>","PeriodicalId":19187,"journal":{"name":"Neurosurgical focus","volume":"59 1","pages":"E10"},"PeriodicalIF":3.3,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144541557","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
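The supervised pipeline above (an interpretable decision tree scored by AUROC) can be sketched with scikit-learn. Everything below is a stand-in on synthetic data: the feature semantics (tumor size, age, Knosp grade, resection status) are only nominal, and no SHAP step is included:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import roc_auc_score

# Toy stand-in for the cohort: binary label ~ "needed postsurgical
# intervention"; features nominally tumor size, age, Knosp grade, etc.
X, y = make_classification(n_samples=300, n_features=6, n_informative=3,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# A shallow tree keeps the decision rules human-readable, which is what
# made GTR and age visible as key splits in the study's analysis.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, tree.predict_proba(X_te)[:, 1])
print(round(auc, 2))
```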
{"title":"Synthetic neurosurgical data generation with generative adversarial networks and large language models: an investigation on fidelity, utility, and privacy.","authors":"Austin A Barr, Eddie Guo, Brij S Karmur, Emre Sezgin","doi":"10.3171/2025.4.FOCUS25225","DOIUrl":"https://doi.org/10.3171/2025.4.FOCUS25225","url":null,"abstract":"<p><strong>Objective: </strong>Use of neurosurgical data for clinical research and machine learning (ML) model development is often limited by data availability, sample sizes, and regulatory constraints. Synthetic data offer a potential solution to challenges associated with accessing, sharing, and using real-world data (RWD). The aim of this study was to evaluate the capability of generating synthetic neurosurgical data with a generative adversarial network and large language model (LLM) to augment RWD, perform secondary analyses in place of RWD, and train an ML model to predict postoperative outcomes.</p><p><strong>Methods: </strong>Synthetic data were generated with a conditional tabular generative adversarial network (CTGAN) and the LLM GPT-4o based on a real-world neurosurgical dataset of 140 older adults who underwent neurosurgical interventions. Each model was used to generate datasets at equivalent (n = 140) and amplified (n = 1000) sample sizes. Data fidelity was evaluated by comparing univariate and bivariate statistics to the RWD. Privacy evaluation involved measuring the uniqueness of generated synthetic records. Utility was assessed by: 1) reproducing and extending clinical analyses on predictors of Karnofsky Performance Status (KPS) deterioration at discharge and a prolonged postoperative intensive care unit (ICU) stay, and 2) training a binary ML classifier on amplified synthetic datasets to predict KPS deterioration on RWD.</p><p><strong>Results: </strong>Both the CTGAN and GPT-4o generated complete, high-fidelity synthetic tabular datasets.
GPT-4o matched or exceeded CTGAN across all measured fidelity, utility, and privacy metrics. All significant clinical predictors of KPS deterioration and prolonged ICU stay were retained in the GPT-4o-generated synthetic data, with some differences observed in effect sizes. Preoperative KPS was not preserved as a significant predictor in the CTGAN-generated data. The ML classifier trained on GPT-4o data outperformed the model trained on CTGAN data, achieving a higher F1 score (0.725 vs 0.688) for predicting KPS deterioration.</p><p><strong>Conclusions: </strong>This study demonstrated a promising ability to produce high-fidelity synthetic neurosurgical data using generative models. Synthetic neurosurgical data present a potential solution to critical limitations in data availability for neurosurgical research. Further investigation is necessary to enhance synthetic data utility for secondary analyses and ML model training, and to evaluate synthetic data generation methods across other datasets, including clinical trial data.</p>","PeriodicalId":19187,"journal":{"name":"Neurosurgical focus","volume":"59 1","pages":"E17"},"PeriodicalIF":3.3,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144541496","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
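A univariate fidelity check of the kind described above (comparing synthetic column statistics against the real-world table) can be sketched directly. The "fidelity gap" score and all three datasets below are illustrative assumptions, not the study's actual metrics or data:

```python
import numpy as np

def univariate_fidelity_gap(real, synth):
    """Mean absolute gap in per-column means and standard deviations.
    0 = identical first-order statistics; larger = worse fidelity."""
    return (np.abs(real.mean(0) - synth.mean(0)).mean()
            + np.abs(real.std(0) - synth.std(0)).mean())

rng = np.random.default_rng(0)
real = rng.normal(0, 1, (140, 5))    # stand-in for the n = 140 RWD table
good = rng.normal(0, 1, (1000, 5))   # amplified output of a faithful generator
bad = rng.normal(0.5, 2, (1000, 5))  # distribution-shifted generator

# A faithful generator should show a smaller gap than a shifted one.
print(univariate_fidelity_gap(real, good) < univariate_fidelity_gap(real, bad))
```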
{"title":"The use of generative artificial intelligence-based dictation in a neurosurgical practice: a pilot study.","authors":"Benjamin S Hopkins, Jonathan Dallas, James Yu, Robert G Briggs, Lawrance K Chung, David J Cote, David Gomez, Ishan Shah, John D Carmichael, John C Liu, William J Mack, Gabriel Zada","doi":"10.3171/2025.4.FOCUS24834","DOIUrl":"https://doi.org/10.3171/2025.4.FOCUS24834","url":null,"abstract":"<p><strong>Objective: </strong>Document dictation remains a significant clinical burden and generative artificial intelligence (AI) systems utilizing transformer-based technology offer efficient speech processing methods that could streamline clinical documentation. This study aimed to evaluate the potential of generative AI in enhancing dictation efficiency and workflow within a targeted neurosurgical practice.</p><p><strong>Methods: </strong>Ten operative reports from both cranial and spinal neurosurgical procedures were dictated and recorded by three independent physicians. The audio files were processed by 1) a modified speech-to-text model implemented based on a backbone architecture created by OpenAI's Whisper model and 2) Nuance's Dragon Medical One as a comparative commercial standard. Word error rate (WER) was manually reviewed.</p><p><strong>Results: </strong>The mean WER was 1.75% for Whisper and 1.54% for Dragon (p = 0.080). When excluding linguistic errors, Whisper outperformed Dragon with a mean WER of 0.50% versus 1.34% (p < 0.001), including the mean number of total errors (Whisper: 6.1, Dragon: 9.7; p = 0.002). For all unstratified dictations, a positive correlation was seen between total errors and word count (p < 0.001, R2 = 0.37), as well as total errors and recording length (p < 0.001, R2 = 0.22). A positive correlation was noted between words spoken per second and total errors for Dragon (p = 0.020, R2 = 0.18), but not for Whisper (p = 0.205, R2 = 0.06). 
Similarly, when analyzing linguistic errors only, this trend held for Dragon (p = 0.014, R2 = 0.20), but not for Whisper (p = 0.331, R2 = 0.03).</p><p><strong>Conclusions: </strong>An AI-based model performed at a noninferior rate compared to a commercially available speech-to-text dictation program. Generative models provide potential benefits such as contextual inference that show promise in limiting errors with increased dictation speed or adjustment for impure input data.</p>","PeriodicalId":19187,"journal":{"name":"Neurosurgical focus","volume":"59 1","pages":"E8"},"PeriodicalIF":3.3,"publicationDate":"2025-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144541497","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
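The word error rate (WER) that this study reports is word-level edit distance (substitutions + insertions + deletions) divided by the length of the reference transcript. A compact NumPy implementation with a hypothetical dictation fragment:

```python
import numpy as np

def wer(reference: str, hypothesis: str) -> float:
    """Word error rate via word-level Levenshtein (edit) distance."""
    ref, hyp = reference.split(), hypothesis.split()
    d = np.zeros((len(ref) + 1, len(hyp) + 1), dtype=int)
    d[:, 0] = np.arange(len(ref) + 1)   # deletions from the reference
    d[0, :] = np.arange(len(hyp) + 1)   # insertions into the hypothesis
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1, j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i, j] = min(sub, d[i - 1, j] + 1, d[i, j - 1] + 1)
    return d[-1, -1] / len(ref)

# Hypothetical example: one substitution in a five-word reference.
print(wer("the dura was opened sharply", "the dura was open sharply"))  # 0.2
```

On this scale, the mean WERs reported above (1.75% and 1.54%) correspond to roughly one or two errors per hundred dictated words.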