A Population-Specific Ensemble Machine Learning Model for Predicting Borderline or Malignancy Risk of Ovarian Masses in Macao: A Multicenter Retrospective Study.
Chan-Fong Chio, Lai-Fong Sin, Hoi-Sun Loi, Hou-Kong Cheang, I-San Chan, Shunjia Hong, Wai-Ieng Fong, Kin-Iong Chan, Sio-In Wong
Clinical Medicine Insights: Oncology, vol. 19, article 11795549251388312. Published 2025-11-03. DOI: 10.1177/11795549251388312 (https://doi.org/10.1177/11795549251388312)
Abstract
Background: Preoperative discrimination between benign and malignant ovarian tumors is important. The applicability of published prediction tools may be limited across different health systems. We aimed to develop a machine learning model specifically for Macao's population to predict the borderline or malignancy risk of ovarian masses using clinical data routinely available in Macao's health system.
Methods: The study cohorts were drawn from 2 major hospitals in Macao: 496 patients who underwent oophorectomy or cystectomy for ovarian masses at CHCSJ between January 2014 and December 2023; a simulated prospective cohort of 95 patients from CHCSJ between January 2024 and November 2024; and an external validation cohort of 61 patients from KWH between January 2020 and September 2024. Patients' clinical information, ultrasound features, and laboratory test results obtained before initial treatment were collected. Least absolute shrinkage and selection operator (LASSO) regression was used for feature selection, and classifiers were developed using several machine learning algorithms. Predictions were compared with postoperative pathological diagnoses, and predictive performance was also compared with that of the Risk of Malignancy Index 4 (RMI-4).
Results: Age, menopausal status, 5 ultrasound features, and 7 laboratory tests were identified as predictors of borderline and malignant ovarian tumors. An ensemble learning model based on a voting classifier was selected as the final model. It outperformed RMI-4 in the internal test set, the simulated prospective cohort, and the external validation cohort, with an area under the curve (AUC) of 0.923-0.951 versus 0.810-0.868 for RMI-4 (P < .05). Decision curve analysis demonstrated superior clinical utility, and SHAP (SHapley Additive exPlanations) analysis supported the model's interpretability.
Conclusions: We propose a machine learning model targeting Macao's population for predicting the borderline or malignancy risk of ovarian masses. Our model is accurate, low-cost, easily accessible, and interpretable. Without requiring any change to the existing clinical workflow, machine learning techniques can maximize the predictive potential of routinely available clinical data in a specific health system.
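The abstract names three quantities without showing how they are computed: a voting ensemble, the AUC, and the net benefit used in decision curve analysis. The following is a minimal sketch of each, not the authors' implementation. It assumes soft voting (averaging predicted risks), which the abstract does not specify, and the function names, the two base models, and all numbers are made up for illustration.

```python
from statistics import mean


def soft_vote(prob_lists):
    """Average each patient's predicted risk across base classifiers.

    Assumption: soft voting. The abstract only says "voting classifier".
    """
    return [mean(per_patient) for per_patient in zip(*prob_lists)]


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(labels, scores, threshold):
    """Net benefit at one risk threshold, as plotted in a decision curve.

    NB = TP/n - FP/n * threshold / (1 - threshold)
    """
    n = len(labels)
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


# Made-up toy data: 1 = borderline or malignant, 0 = benign.
labels = [0, 0, 0, 1, 0, 1, 1, 0]
# Hypothetical predicted risks from two base classifiers.
model_a = [0.10, 0.30, 0.20, 0.80, 0.50, 0.70, 0.35, 0.05]
model_b = [0.20, 0.10, 0.30, 0.90, 0.60, 0.60, 0.55, 0.15]

ensemble = soft_vote([model_a, model_b])
print("ensemble AUC:", auc(labels, ensemble))
print("net benefit at threshold 0.5:", net_benefit(labels, ensemble, 0.5))
```

A decision curve repeats the `net_benefit` calculation over a range of thresholds and compares the result with the "treat all" and "treat none" strategies. The pairwise AUC here is quadratic in cohort size, which is acceptable for cohorts of a few hundred patients such as those described above.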
About the journal:
Clinical Medicine Insights: Oncology is an international, peer-reviewed, open access journal that focuses on all aspects of cancer research and treatment, in addition to related genetic, pathophysiological and epidemiological topics. Of particular but not exclusive importance are molecular biology, clinical interventions, controlled trials, therapeutics, pharmacology and drug delivery, and techniques of cancer surgery. The journal welcomes unsolicited article proposals.