Construction of the prediction model for multiple myeloma based on machine learning
Authors: Jiangying Cai, Zhenhua Liu, Yingying Wang, Wanxia Yang, Zhipeng Sun, Chongge You
Journal: International Journal of Laboratory Hematology, 46(5): 918-926
Published: 2024-05-31
DOI: 10.1111/ijlh.14324
URL: https://onlinelibrary.wiley.com/doi/10.1111/ijlh.14324
Impact factor: 2.2; JCR: Q3 (Hematology)
Introduction
The global burden of multiple myeloma (MM) is increasing every year. Here, we have developed machine learning models to provide a reference for the early detection of MM.
Methods
A total of 465 patients and 150 healthy controls were enrolled in this retrospective study. Variables were screened with the least absolute shrinkage and selection operator (LASSO), and three prediction models, logistic regression (LR), support vector machine (SVM), and random forest (RF), were built from complete blood count (CBC) and cell population data (CPD) parameters in the training set (210 cases) and then verified in the validation set (90 cases) and test set (165 cases). The performance of each model was analyzed using receiver operating characteristic (ROC) curves, calibration curves, and decision curve analysis (DCA). Accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and area under the ROC curve (AUC) were used to evaluate the models. The DeLong test was used to compare the AUCs of the models.
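The evaluation metrics listed above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the 0.5 cut-off, and the toy labels and scores are assumptions for illustration only. AUC is computed here through its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half).

```python
import math
from typing import Dict, Sequence


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute threshold-based metrics from binary labels (1 = MM, 0 = control)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": _ratio(tp, tp + fn),   # true positive rate
        "specificity": _ratio(tn, tn + fp),   # true negative rate
        "ppv": _ratio(tp, tp + fp),           # positive predictive value
        "npv": _ratio(tn, tn + fn),           # negative predictive value
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data, invented for illustration; not taken from the study.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.6, 0.2, 0.1]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off
    print(classification_metrics(y_true, y_pred))
    print("AUC:", roc_auc(y_true, scores))
```

The pairwise AUC is quadratic in the number of cases, which is acceptable at the sample sizes reported here; a rank-based implementation would be preferable for much larger data sets.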
Results
Six parameters, RBC (10^12/L), RDW-CV (%), IG (%), NE-WZ, LY-WX, and LY-WZ, were selected by LASSO to construct the models. The AUCs of the RF model in the training, validation, and test sets were 0.956, 0.892, and 0.875, respectively, higher than those of the LR model (0.901, 0.849, and 0.858) and the SVM model (0.929, 0.868, and 0.846). The DeLong test showed significant differences among the models in the training set, no significant differences in the validation set, and a significant difference only between the SVM and RF models in the test set. The calibration curves and DCA showed that all three models had good validity and feasibility, with the RF model performing best.
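Decision curve analysis compares models by their net benefit across threshold probabilities. The sketch below uses the standard net benefit definition, TP/n - (FP/n) x pt/(1 - pt), where pt is the threshold probability; it is an illustration only, not the authors' implementation, and the function names and toy data are assumptions.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], probs: Sequence[float], threshold: float) -> float:
    """Net benefit of calling a case positive when its predicted probability >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    if n == 0:
        raise ValueError("y_true must not be empty")
    tp = sum(1 for t, p in zip(y_true, probs) if p >= threshold and t == 1)
    fp = sum(1 for t, p in zip(y_true, probs) if p >= threshold and t == 0)
    odds = threshold / (1.0 - threshold)  # weight given to a false positive
    return tp / n - (fp / n) * odds


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Reference strategy: treat every case as positive."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    if n == 0:
        raise ValueError("y_true must not be empty")
    prevalence = sum(y_true) / n
    odds = threshold / (1.0 - threshold)
    return prevalence - (1.0 - prevalence) * odds


if __name__ == "__main__":
    # Toy data, invented for illustration; not taken from the study.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    probs = [0.9, 0.8, 0.4, 0.7, 0.3, 0.6, 0.2, 0.1]
    for pt in (0.2, 0.5):
        print(pt, net_benefit(y_true, probs, pt), net_benefit_treat_all(y_true, pt))
```

A model is useful at a given threshold when its net benefit exceeds both the treat-all reference and the treat-none reference, whose net benefit is zero by definition.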
Conclusion
The proposed RF model may be a useful auxiliary tool for rapid screening of MM patients.
About the journal:
The International Journal of Laboratory Hematology provides a forum for the communication of new developments, research topics and the practice of laboratory haematology.
The journal publishes invited reviews, full length original articles, and correspondence.
The International Journal of Laboratory Hematology is the official journal of the International Society for Laboratory Hematology, which addresses the following sub-disciplines: cellular analysis, flow cytometry, haemostasis and thrombosis, molecular diagnostics, haematology informatics, haemoglobinopathies, point of care testing, standards and guidelines.
The journal was launched in 2006 as the successor to Clinical and Laboratory Hematology, which was first published in 1979. An active and positive editorial policy ensures that work of a high scientific standard is reported, in order to bridge the gap between practical and academic aspects of laboratory haematology.