Prediction of recurrence after surgery for pituitary adenoma using machine learning-based models: systematic review and meta-analysis
Ibrahim Mohammadzadeh, Bardia Hajikarimloo, Behnaz Niroomand, Nasira Faizi, Pooya Eini, Mohammad Amin Habibi, Alireza Mohseni, Mohammadmahdi Sabahi, Abdulrahman Albakr, Michael Karsy, Hamid Borghei-Razavi
BMC Endocrine Disorders 25(1):158, published 2025-07-01. DOI: 10.1186/s12902-025-01955-8 (https://doi.org/10.1186/s12902-025-01955-8). Impact factor 2.8; JCR Q3 (Endocrinology & Metabolism).
Abstract
Background: Predicting pituitary adenoma (PA) recurrence after surgical resection is critical for guiding clinical decision-making, and machine learning (ML)-based models show great promise in improving the accuracy of these predictions. These models can provide valuable insights to surgeons and oncologists, helping them tailor personalized treatment plans, enhance patient prognostication, and optimize follow-up strategies.
Methods: We systematically searched PubMed, Scopus, Embase, Cochrane Library, and Web of Science databases until November 2024, applying PRISMA guidelines.
Results: Out of 1240 studies screened, six met our eligibility criteria involving ML-based approaches to predict PA recurrence. The studies employed 12 different ML algorithms. Meta-analysis showed a pooled sensitivity of 0.87 [95% CI: 0.78-0.92], specificity of 0.86 [95% CI: 0.67-0.95], positive diagnostic likelihood ratio (DLR) of 6.32 [95% CI: 2.46-16.26], and negative DLR of 0.16 [95% CI: 0.1-0.25]. The diagnostic odds ratio (DOR) was 40.52 [95% CI: 13-126.27], and the diagnostic score was 3.7 [95% CI: 2.57-4.84]. The pooled AUC was 0.89 [95% CI: 0.86-0.92], indicating a high overall diagnostic performance. For the comparison between Logistic Regression (LR) and non-LR algorithms, LR-based algorithms exhibited numerically higher AUC and sensitivity; however, these differences were not statistically significant. Additionally, LR-based algorithms showed lower specificity, positive likelihood ratio, and diagnostic odds ratios, but the statistical tests did not provide strong evidence for meaningful differences.
Conclusion: AI-based models show strong predictive power for recurrence in both functional and non-functional pituitary adenomas, with an average accuracy above 80%. However, the lack of external validation and the complexity of input data pose challenges, highlighting the need for rigorous validation with multi-center datasets and standardized imaging techniques to enhance clinical applicability.
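The summary measures reported in the Results are linked by standard definitions: the positive likelihood ratio is sensitivity / (1 - specificity), the negative likelihood ratio is (1 - sensitivity) / specificity, the diagnostic odds ratio (DOR) is their quotient, and the diagnostic score is the natural logarithm of the DOR. The sketch below is an illustration only, not the authors' analysis code; the function names are hypothetical. It plugs the pooled sensitivity and specificity into these textbook formulas. The results land close to, but not exactly on, the reported pooled values, presumably because each measure in a meta-analysis is pooled across studies rather than derived from the pooled point estimates.

```python
import math


def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (LR+, LR-) from sensitivity and specificity (textbook definitions)."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


def diagnostic_odds_ratio(sensitivity: float, specificity: float) -> float:
    """Return the diagnostic odds ratio, DOR = LR+ / LR-."""
    lr_pos, lr_neg = likelihood_ratios(sensitivity, specificity)
    return lr_pos / lr_neg


# Pooled point estimates quoted in the abstract.
sens, spec = 0.87, 0.86

lr_pos, lr_neg = likelihood_ratios(sens, spec)
dor = diagnostic_odds_ratio(sens, spec)

print(f"LR+ from point estimates: {lr_pos:.2f} (reported pooled value: 6.32)")
print(f"LR- from point estimates: {lr_neg:.2f} (reported pooled value: 0.16)")
print(f"DOR from point estimates: {dor:.2f} (reported pooled value: 40.52)")

# The diagnostic score is ln(DOR); applied to the reported DOR it gives about 3.70.
reported_dor = 40.52
diagnostic_score = math.log(reported_dor)
print(f"ln(reported DOR): {diagnostic_score:.2f} (reported diagnostic score: 3.7)")
```

The last line shows that the reported diagnostic score of 3.7 is consistent with the reported DOR of 40.52 on the log scale.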
About the journal:
BMC Endocrine Disorders is an open access, peer-reviewed journal that considers articles on all aspects of the prevention, diagnosis and management of endocrine disorders, as well as related molecular genetics, pathophysiology, and epidemiology.