Construction of the prediction model for multiple myeloma based on machine learning

Impact factor 2.3; zone 4, Medicine; JCR Q3, Hematology

International Journal of Laboratory Hematology Pub Date : 2024-05-31 DOI:10.1111/ijlh.14324

Jiangying Cai, Zhenhua Liu, Yingying Wang, Wanxia Yang, Zhipeng Sun, Chongge You

Volume 46, issue 5, pages 918-926. Journal article.

Abstract

Introduction

The global burden of multiple myeloma (MM) is increasing every year. Here, we developed machine learning models to provide a reference for the early detection of MM.

Methods

A total of 465 patients and 150 healthy controls were enrolled in this retrospective study. Using the least absolute shrinkage and selection operator (LASSO) for variable screening, three prediction models, logistic regression (LR), support vector machine (SVM), and random forest (RF), were built from complete blood count (CBC) and cell population data (CPD) parameters in the training set (210 cases) and then verified in the validation set (90 cases) and the test set (165 cases). Model performance was analyzed using receiver operating characteristic (ROC) curves, calibration curves, and decision curve analysis (DCA). Accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and area under the ROC curve (AUC) were used to evaluate the models. The DeLong test was used to compare the AUCs of the models.
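The evaluation metrics named above follow directly from a confusion matrix, and the AUC can be read as the probability that a randomly chosen patient scores higher than a randomly chosen control. The sketch below illustrates those standard definitions only; it is not the authors' code, and the function names, the toy labels and scores, and the 0.5 cutoff are all assumptions made for illustration.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Count (tp, fp, tn, fn); label 1 = MM patient, 0 = healthy control."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def _ratio(num: int, den: int) -> float:
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, PPV, and NPV from binary predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": _ratio(tp, tp + fn),  # true positive rate
        "specificity": _ratio(tn, tn + fp),  # true negative rate
        "ppv": _ratio(tp, tp + fp),          # positive predictive value
        "npv": _ratio(tn, tn + fn),          # negative predictive value
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not values from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # the 0.5 cutoff is an assumption

print(classification_metrics(y_true, y_pred))
print(auc(y_true, scores))
```

The pairwise AUC is quadratic in the number of cases, which is acceptable for cohorts of a few hundred; a rank-based formula would be the usual choice for larger data.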

Results

Six parameters, RBC (10¹²/L), RDW-CV (%), IG (%), NE-WZ, LY-WX, and LY-WZ, were selected by LASSO to construct the models. Among the three models, the AUCs of the RF model in the training, validation, and test sets were 0.956, 0.892, and 0.875, respectively, higher than those of the LR model (0.901, 0.849, and 0.858) and the SVM model (0.929, 0.868, and 0.846). The DeLong test showed significant differences among the models in the training set, no significant differences in the validation set, and a significant difference only between the SVM and RF models in the test set. The calibration curves and DCA showed that all three models had good validity and feasibility, with the RF model performing best.
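Decision curve analysis, mentioned above, compares models by their net benefit across threshold probabilities. As a minimal sketch of the standard net-benefit definition (true positives per case minus false positives per case, weighted by the odds of the threshold), and not the authors' implementation, the quantity can be computed as follows. The function names and toy data are hypothetical.

```python
from typing import Sequence


def _check_threshold(threshold: float) -> None:
    """Net benefit is only defined for threshold probabilities strictly in (0, 1)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")


def net_benefit(y_true: Sequence[int], probs: Sequence[float], threshold: float) -> float:
    """Net benefit of calling a case positive when its predicted probability >= threshold."""
    _check_threshold(threshold)
    n = len(y_true)
    tp = sum(1 for t, p in zip(y_true, probs) if p >= threshold and t == 1)
    fp = sum(1 for t, p in zip(y_true, probs) if p >= threshold and t == 0)
    odds = threshold / (1.0 - threshold)  # weight placed on each false positive
    return tp / n - (fp / n) * odds


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Reference strategy: every case is called positive."""
    _check_threshold(threshold)
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy data, not values from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
probs = [0.9, 0.8, 0.4, 0.7, 0.6, 0.3, 0.2, 0.1]

for pt in (0.2, 0.5, 0.8):
    print(pt, net_benefit(y_true, probs, pt), net_benefit_treat_all(y_true, pt))
```

A decision curve plots these values over a range of thresholds; a model is useful at a given threshold when its net benefit exceeds both the treat-all line and zero (the treat-none strategy).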

Conclusion

The proposed RF model may be a useful auxiliary tool for rapid screening of MM patients.

Source journal

International Journal of Laboratory Hematology (Medicine, Hematology)

CiteScore: 4.50
Self-citation rate: 6.70%
Articles published: 211
Review time: 6-12 weeks

About the journal: The International Journal of Laboratory Hematology provides a forum for the communication of new developments, research topics and the practice of laboratory haematology. The journal publishes invited reviews, full length original articles, and correspondence. The International Journal of Laboratory Hematology is the official journal of the International Society for Laboratory Hematology, which addresses the following sub-disciplines: cellular analysis, flow cytometry, haemostasis and thrombosis, molecular diagnostics, haematology informatics, haemoglobinopathies, point of care testing, standards and guidelines. The journal was launched in 2006 as the successor to Clinical and Laboratory Hematology, which was first published in 1979. An active and positive editorial policy ensures that work of a high scientific standard is reported, in order to bridge the gap between practical and academic aspects of laboratory haematology.