Leveraging artificial intelligence in the peer review of neurosurgical research articles
Ali A Mohamed, Daniel Colome, Jack Yang, Emma C Sargent, Gabriel Flores-Milan, Zachary Sorrentino, Akshay Sharma, Owoicho Adogwa, Stephen Pirris, Brandon Lucke-Wold
Neurosurgical Review 48(1):631. Published 2025-09-03. DOI: 10.1007/s10143-025-03783-9 (https://doi.org/10.1007/s10143-025-03783-9)
Impact Factor: 2.5 (JCR Q2, Clinical Neurology)
Citations: 0
Abstract
The traditional peer review process is time-consuming and can delay the dissemination of critical research. This study evaluates the effectiveness of artificial intelligence (AI) in predicting the acceptance or rejection of neurosurgical manuscripts, offering a possible way to optimize the process. Neurosurgical preprints from Preprints.org and medRxiv.org were analyzed. Published preprints were compared to those presumed not accepted after remaining on preprint servers for over 12 months. Each article was uploaded to ChatGPT 4o, Gemini, and Copilot with the prompt: "Based on the literature up to the date this article was posted, will it be accepted or rejected for publication following peer review? Please provide a yes or no answer." AI predictive accuracy and journal metrics were compared between preprints that were accepted and those presumed not accepted. A total of 51 preprints (31 skull base, 20 spine) were included, of which 28 were published and 23 were presumed not accepted. For accepted preprints, the average impact factor and CiteScore were 4.36 ± 2.07 and 6.38 ± 3.67, respectively, for skull base topics, and 3.48 ± 1.08 and 4.83 ± 1.37 for spine topics. Across all AI models, there were no significant differences in journal metrics between preprints predicted to be accepted and those predicted not to be accepted (p > 0.05). Overall, the AI models performed poorly, with accuracy ranging from 40% to 66.67% (p < 0.001). Current AI models therefore exhibit, at best, moderate accuracy in predicting peer review outcomes. Future AI models, developed in collaboration with journals and with authors' consent, could draw on a more balanced dataset, enhancing accuracy and streamlining the peer review process.
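A minimal sketch of how the study's prediction procedure could be reproduced programmatically, assuming the manuscripts are available as plain text and that the OpenAI API's "gpt-4o" model stands in for the ChatGPT 4o web interface the authors actually used (Gemini and Copilot queries would follow the same pattern with their respective APIs):

# Illustrative approximation only: the study uploaded PDFs through chat
# interfaces; here the manuscript text and the study's verbatim prompt
# are sent through the OpenAI API instead (an assumption, not the
# authors' pipeline).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = (
    "Based on the literature up to the date this article was posted, "
    "will it be accepted or rejected for publication following peer review? "
    "Please provide a yes or no answer."
)

def predict_acceptance(manuscript_text: str) -> bool:
    """Return True if the model answers 'yes' (predicted acceptance)."""
    response = client.chat.completions.create(
        model="gpt-4o",  # assumed API stand-in for ChatGPT 4o
        messages=[{"role": "user", "content": f"{manuscript_text}\n\n{PROMPT}"}],
    )
    answer = response.choices[0].message.content.strip().lower()
    return answer.startswith("yes")

def accuracy(preprints: list[tuple[str, bool]]) -> float:
    """Fraction of (manuscript_text, was_published) pairs predicted correctly."""
    correct = sum(
        predict_acceptance(text) == published for text, published in preprints
    )
    return correct / len(preprints)

Accuracy here is simply the fraction of preprints whose predicted outcome matches whether they were ultimately published, which is the headline metric reported above; the study additionally compared journal metrics between the predicted groups.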
About the journal:
The goal of Neurosurgical Review is to provide a forum for comprehensive reviews on current issues in neurosurgery. Each issue contains up to three reviews, reflecting all important aspects of one topic (a disease or a surgical approach). Comments by a panel of experts within the same issue complete the topic. By providing comprehensive coverage of one topic per issue, Neurosurgical Review combines the topicality of professional journals with the in-depth treatment of a monograph. Original papers of high quality are also welcome.