Development of a Machine Learning Algorithm to Predict Abnormalities in Serum Phosphate in a Large Oncology Cohort.

Impact factor: 2.8 (JCR Q2, Oncology)

JCO Clinical Cancer Informatics. Pub Date: 2025-04-01. Epub Date: 2025-04-11. DOI: 10.1200/CCI-24-00312

Lauren A Scanlon, Phillip J Monaghan, Safwaan Adam

JCO Clinical Cancer Informatics, volume 9, article e2400312. Publication type: Journal Article.

Citations: 0

Abstract

Purpose: Serum phosphate is commonly measured in oncology patients because both oncologic conditions and their treatments are associated with abnormal phosphate. All patients attending our institution, a large specialist oncology center, have a standardized order set (SOS) measured, consisting of 15 biochemical tests, including serum phosphate. Our aim was to determine whether abnormalities in serum phosphate could be predicted by a machine learning algorithm (MLA) from the other interrelated variables in the SOS.

Methods: We trained an XGBoost MLA, implemented in Python, to predict the occurrence of abnormal phosphate (<0.5 or >1.78 mmol/L) from the other results in the SOS. To train and test the algorithm, we used 481,150 test results from blood tests on 45,174 patients between January 2019 and December 2021, of which 5,897 were abnormal.
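
The labelling step described in Methods can be illustrated with a minimal sketch. It assumes each SOS result is a dictionary with a hypothetical `phosphate` key in mmol/L; the key names and sample values are illustrative and do not come from the study. Only the thresholds are taken from the abstract, and model training (XGBoost in the study) is not shown.

```python
# Thresholds from the abstract: abnormal if < 0.5 or > 1.78 mmol/L.
LOW_MMOL_L = 0.5
HIGH_MMOL_L = 1.78


def is_abnormal_phosphate(phosphate_mmol_l: float) -> bool:
    """Return True when a serum phosphate result lies outside 0.5-1.78 mmol/L."""
    return phosphate_mmol_l < LOW_MMOL_L or phosphate_mmol_l > HIGH_MMOL_L


def split_features_and_label(result: dict) -> tuple:
    """Separate the phosphate-derived label from the other SOS results.

    The key name "phosphate" is a hypothetical choice for this sketch.
    """
    features = {name: value for name, value in result.items() if name != "phosphate"}
    label = int(is_abnormal_phosphate(result["phosphate"]))
    return features, label


# Illustrative record; analyte names and values are made up.
example = {"phosphate": 0.42, "calcium": 2.31, "creatinine": 88.0}
features, label = split_features_and_label(example)
print(features, label)
```

Removing phosphate from the feature dictionary matters here: the model is meant to predict the phosphate label from the other results in the order set, so the measured phosphate value must not appear among the inputs.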

Results: The model was trained and tested on a 70%/30% split (train/test result cohort), achieving an area under the receiver operating characteristic curve of 0.866 (95% CI, 0.857 to 0.875) on the test set. With a prediction threshold set so that the model achieves a sensitivity of 0.924 and a specificity of 0.530, and with a phosphate test performed only for results above this threshold, the number of phosphate tests in this test set would be reduced from 142,647 to 67,873, capturing 1,586 of the 1,716 abnormal results with a small risk (<0.1%) of missing an abnormal result. The model was further validated on a separate cohort from January 2022 to December 2023, achieving similar levels of performance.
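
The counts reported in Results are enough to rebuild the confusion matrix and check the stated sensitivity and specificity. The sketch below uses only the four counts from the abstract; the variable names are illustrative. Reading the "<0.1%" risk as missed abnormal results divided by all test-set results is an assumption of this sketch.

```python
# Counts reported in the abstract for the held-out test set.
total_tests = 142_647        # phosphate results in the test set
tests_performed = 67_873     # results above the prediction threshold
abnormal_total = 1_716       # truly abnormal results
abnormal_captured = 1_586    # abnormal results above the threshold

# Derive the four confusion-matrix cells from those counts.
true_pos = abnormal_captured
false_neg = abnormal_total - abnormal_captured
false_pos = tests_performed - true_pos
true_neg = (total_tests - abnormal_total) - false_pos

sensitivity = true_pos / (true_pos + false_neg)
specificity = true_neg / (true_neg + false_pos)
tests_avoided_fraction = 1 - tests_performed / total_tests
# Assumed interpretation of the "<0.1%" risk: missed abnormal results
# as a share of all results in the test set.
missed_fraction = false_neg / total_tests

print(f"sensitivity     = {sensitivity:.3f}")
print(f"specificity     = {specificity:.3f}")
print(f"tests avoided   = {tests_avoided_fraction:.1%}")
print(f"missed abnormal = {false_neg} ({missed_fraction:.3%} of all results)")
```

The derived values round to the reported sensitivity of 0.924 and specificity of 0.530. Under this reading, 130 abnormal results would be missed and roughly half of the phosphate tests in the test set would be avoided.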

Conclusion: An MLA with high sensitivity has been developed to optimize phosphate testing. Its application in routine care might result in cost savings and health care efficiencies. The methodology used to develop our MLA can be applied to other settings where interrelated variables are measured in an SOS.
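
The thresholding idea is not specific to phosphate, so it can be sketched generically. The example below assumes a model has already produced a score per result; it picks the highest score threshold that still reaches a target sensitivity, then flags results at or above it for testing. The function names and toy data are hypothetical, and this is not presented as the procedure the authors used.

```python
import math


def threshold_for_sensitivity(scores, labels, target_sensitivity):
    """Return the highest observed score threshold that meets the target sensitivity.

    A result counts as flagged when its score is >= the threshold.
    """
    positive_scores = sorted(
        (score for score, label in zip(scores, labels) if label == 1),
        reverse=True,
    )
    if not positive_scores:
        raise ValueError("need at least one positive label")
    # Number of positives that must be flagged to reach the target.
    needed = max(1, math.ceil(target_sensitivity * len(positive_scores)))
    return positive_scores[needed - 1]


def flag_for_testing(scores, threshold):
    """Return one boolean per result: True means 'perform the test'."""
    return [score >= threshold for score in scores]


# Toy scores and labels, made up for illustration only.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 0, 1, 1, 0, 0, 1, 0]

threshold = threshold_for_sensitivity(scores, labels, target_sensitivity=0.75)
flags = flag_for_testing(scores, threshold)
captured = sum(1 for flag, label in zip(flags, labels) if flag and label == 1)
print(threshold, sum(flags), captured)
```

In practice the threshold would be chosen on training or validation data and then applied unchanged to new results, since a threshold tuned on the evaluation set overstates the sensitivity actually achieved.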

Source journal: JCO Clinical Cancer Informatics (Oncology)

CiteScore: 6.20

Self-citation rate: 4.80%

Articles published: 190