Machine Learning Model Predicts Abnormal Lymphocytosis Associated With Chronic Lymphocytic Leukemia.

IF 3.3 Q2 ONCOLOGY
JCO Clinical Cancer Informatics. Pub Date: 2025-06-01. Epub Date: 2025-06-24. DOI: 10.1200/CCI-24-00197
Joseph Aoki, Omar Khalid, Cihan Kaya, Mohamed E Salama
Citations: 0

Abstract

Purpose: The diagnosis of chronic lymphocytic leukemia (CLL) is often delayed by several years. Addressing this care gap would help identify at-risk patients who may benefit from targeted evaluation to prevent adverse outcomes. To our knowledge, however, no widely used machine learning (ML) model currently predicts the development of CLL. The objective of this study was therefore to use readily available laboratory data to train and test ML-based risk models for abnormal lymphocytosis associated with CLL.

Methods: This observational study used deidentified laboratory data obtained from a large US outpatient network. The 7-year longitudinal data set included 1,090,707 adult patients who met the following inclusion criteria: age 50 to 75 years and initial absolute lymphocyte count (ALC) <5 × 10^9/L. The data set was split into a training set (80% of the data) and a held-out test set (20%) used for independent testing. ML models were developed using random forest survival methods. The ground truth outcome was abnormal lymphocytosis associated with CLL and monoclonal B-cell lymphocytosis diagnosis, defined as ALC ≥5 × 10^9/L with ≥40% relative lymphocytosis.
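
The inclusion criteria, outcome definition, and 80/20 split described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, relative lymphocytosis is assumed here to be ALC as a percentage of the WBC count, and the split is assumed to be random at the patient level. The abstract specifies none of these details.

```python
import random

ALC_THRESHOLD = 5.0          # absolute lymphocyte count, in units of 10^9/L
REL_LYMPH_THRESHOLD = 40.0   # relative lymphocytosis, percent of WBC (assumed definition)


def is_eligible(age, initial_alc):
    """Inclusion criteria from the abstract: age 50 to 75 and initial ALC < 5 x 10^9/L."""
    return 50 <= age <= 75 and initial_alc < ALC_THRESHOLD


def meets_outcome(alc, wbc):
    """Outcome: ALC >= 5 x 10^9/L with >= 40% relative lymphocytosis.

    Assumption: relative lymphocytosis is computed as 100 * ALC / WBC.
    """
    if wbc <= 0:
        return False
    relative_pct = 100.0 * alc / wbc
    return alc >= ALC_THRESHOLD and relative_pct >= REL_LYMPH_THRESHOLD


def split_patients(patient_ids, test_fraction=0.2, seed=0):
    """Shuffle patient IDs and hold out a fraction for independent testing.

    Assumption: the split is random and made per patient, so that no
    patient contributes data to both sets.
    """
    ids = sorted(patient_ids)
    random.Random(seed).shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    return ids[n_test:], ids[:n_test]


if __name__ == "__main__":
    train_ids, test_ids = split_patients(range(10))
    print(len(train_ids), len(test_ids))
    print(meets_outcome(alc=6.0, wbc=10.0))
```

Splitting by patient rather than by individual laboratory result is a design choice in this sketch: it keeps a patient's later results from leaking into the test set.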

Results: The 12-variable risk classifier predicted ALC ≥5 × 10^9/L within 5 years with an area under the receiver operating characteristic curve (AUROC) of 0.92. The most important predictors were ALC (initial, slope), WBC count (last, maximum, slope, initial), platelet count (last, slope, maximum, initial), age, and sex.
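
The per-analyte summaries named among the predictors (initial, last, maximum, slope) and the reported discrimination metric can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the study's pipeline: the slope is assumed to be an ordinary least-squares slope over time, the helper names are hypothetical, and the random forest survival model itself is not reproduced.

```python
def slope(times, values):
    """Ordinary least-squares slope of values against times (assumed definition)."""
    n = len(times)
    if n < 2:
        return 0.0
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    denom = sum((t - mean_t) ** 2 for t in times)
    if denom == 0:
        return 0.0
    numer = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return numer / denom


def summarize(times, values):
    """Initial, last, maximum, and slope for one analyte's time-ordered series."""
    return {
        "initial": values[0],
        "last": values[-1],
        "max": max(values),
        "slope": slope(times, values),
    }


def auroc(scores, labels):
    """AUROC as the probability that a random positive case scores higher
    than a random negative case, with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical ALC series (10^9/L) measured at years 0, 1, and 2.
    print(summarize([0, 1, 2], [2.0, 3.0, 4.0]))
    # Hypothetical risk scores and outcomes for four patients.
    print(auroc([0.9, 0.2, 0.3, 0.1], [1, 1, 0, 0]))
```

Read this way, an AUROC of 0.92 means that a randomly chosen patient who developed the outcome received a higher risk score than a randomly chosen patient who did not in about 92% of such pairs.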

Conclusion: Our ML risk classifier accurately predicts abnormal lymphocytosis associated with CLL using routine laboratory data. Although prospective studies are warranted, the results support the clinical utility of the model for improving timely recognition of patients at risk of CLL.

Source journal metrics: CiteScore 6.20; self-citation rate 4.80%; articles published: 190