A Population-Specific Ensemble Machine Learning Model for Predicting Borderline or Malignancy Risk of Ovarian Masses in Macao: A Multicenter Retrospective Study.

IF 1.9 | Zone 4, Medicine | JCR Q3, ONCOLOGY
Clinical Medicine Insights-Oncology Pub Date : 2025-11-03 eCollection Date: 2025-01-01 DOI:10.1177/11795549251388312
Chan-Fong Chio, Lai-Fong Sin, Hoi-Sun Loi, Hou-Kong Cheang, I-San Chan, Shunjia Hong, Wai-Ieng Fong, Kin-Iong Chan, Sio-In Wong

Abstract

Background: Preoperative discrimination between benign and malignant ovarian tumors is important. The applicability of published prediction tools may be limited across different health systems. We aim to develop a machine learning model specifically for Macao's population to predict the borderline or malignancy risk of ovarian masses using routinely available clinical data in Macao's health system.

Methods: The study cohorts were derived from 2 major hospitals in Macao, including 496 patients who underwent oophorectomy or cystectomy for ovarian masses at CHCSJ between January 2014 and December 2023, along with a simulated prospective cohort of 95 patients from CHCSJ between January 2024 and November 2024, and an external validation cohort of 61 patients from KWH between January 2020 and September 2024. Patients' clinical information, ultrasound features, and laboratory test results before initial treatment were collected. LASSO regression was used for feature selection, and classifiers were developed using various machine learning algorithms. The predictions were compared with postoperative pathological diagnoses. The predictive performance was also compared with the RMI-4.
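The feature-selection step can be illustrated with a minimal sketch. The abstract does not say which LASSO variant, penalty strength, or software the authors used, so the code below is an assumption throughout: it fits a squared-error LASSO by cyclic coordinate descent on hypothetical, centred toy data, and the names `soft_threshold`, `lasso_coordinate_descent`, and `lam` are illustrative only. The point it shows is how the L1 penalty drives uninformative coefficients to exactly zero, which is what makes LASSO usable as a feature selector.

```python
from typing import List


def soft_threshold(value: float, lam: float) -> float:
    """Shrink value toward zero by lam; anything within [-lam, lam] becomes 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(
    X: List[List[float]], y: List[float], lam: float, n_iter: int = 100
) -> List[float]:
    """Minimise (1/2n) * ||y - Xb||^2 + lam * ||b||_1 by cyclic coordinate descent.

    Assumes the columns of X and the outcome y are centred, so no intercept is fitted.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # sum of squares of feature j
            for i in range(n):
                # Prediction from every feature except j.
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (z / n)
    return beta


# Hypothetical toy data: the outcome depends on feature 0 only.
X = [[-2.0, 1.0], [-1.0, -1.0], [0.0, 0.0], [1.0, -1.0], [2.0, 1.0]]
y = [-6.0, -3.0, 0.0, 3.0, 6.0]

beta = lasso_coordinate_descent(X, y, lam=0.5)
selected = [j for j, b in enumerate(beta) if b != 0.0]  # indices of retained features
```

For a binary outcome such as benign versus borderline or malignant, an L1-penalised logistic model would be the more natural choice; the squared-error version is used here only because it keeps the update rule short.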

Results: Age, menopausal status, 5 ultrasound features, and 7 laboratory tests were identified as predictors of borderline and malignant ovarian tumors. An ensemble learning model based on a voting classifier was selected as the final model. Our model outperformed RMI-4 in the internal test set, simulated prospective cohort, and external validation cohort, achieving an area under the curve (AUC) of 0.923-0.951 (vs 0.810-0.868, P < .05). Decision curve analysis demonstrated superior clinical utility, and SHAP analysis confirmed its interpretability.
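The three evaluation ideas named in the results (a voting ensemble, AUC, and decision curve analysis) can be sketched on toy numbers. Everything below is an assumption made for illustration: the abstract does not state whether voting was hard or soft, which base classifiers were combined, or which threshold probabilities were examined. The sketch uses soft voting (averaging predicted probabilities), the rank-based definition of AUC, and the standard net-benefit formula used in decision curve analysis. The data and the names `model_a`, `model_b`, and `ensemble` are hypothetical and are not results from the study.

```python
from typing import List, Sequence


def soft_vote(prob_lists: Sequence[Sequence[float]]) -> List[float]:
    """Average each patient's predicted probability across base classifiers."""
    n_models = len(prob_lists)
    return [sum(per_patient) / n_models for per_patient in zip(*prob_lists)]


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive case scores above a random negative case.

    Ties count as half a win, which matches the area under the ROC curve.
    """
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(labels: Sequence[int], scores: Sequence[float], threshold: float) -> float:
    """Net benefit at threshold pt: TP/n - (FP/n) * pt / (1 - pt)."""
    n = len(labels)
    tp = sum(1 for lab, s in zip(labels, scores) if s >= threshold and lab == 1)
    fp = sum(1 for lab, s in zip(labels, scores) if s >= threshold and lab == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Hypothetical labels (1 = borderline or malignant) and base-model probabilities.
labels = [0, 0, 1, 1, 0, 1]
model_a = [0.1, 0.4, 0.8, 0.7, 0.2, 0.6]
model_b = [0.3, 0.2, 0.6, 0.9, 0.6, 0.4]

ensemble = soft_vote([model_a, model_b])
auc_b = auc(labels, model_b)
auc_ensemble = auc(labels, ensemble)
nb_ensemble = net_benefit(labels, ensemble, threshold=0.5)
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and plotting it against the treat-all and treat-none strategies; a model shows clinical utility where its curve lies above both.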

Conclusions: We propose a machine learning model targeting Macao's population for predicting the borderline or malignancy risk of ovarian masses. Our model is accurate, low-cost, easily accessible, and interpretable. Without requiring any workflow changes, machine learning techniques can maximize the predictive potential of routinely available clinical data in a specific health system.

Source journal: Clinical Medicine Insights: Oncology
CiteScore: 2.40
Self-citation rate: 4.50%
Articles published: 57
Review time: 8 weeks
Journal description: Clinical Medicine Insights: Oncology is an international, peer-reviewed, open access journal that focuses on all aspects of cancer research and treatment, in addition to related genetic, pathophysiological and epidemiological topics. Of particular but not exclusive importance are molecular biology, clinical interventions, controlled trials, therapeutics, pharmacology and drug delivery, and techniques of cancer surgery. The journal welcomes unsolicited article proposals.