Predicting abnormal C-reactive protein level for improving utilization by deep neural network model

IF 3.7 | Zone 2, Medicine | Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

International Journal of Medical Informatics Pub Date : 2024-11-26 DOI:10.1016/j.ijmedinf.2024.105726

Donghua Mo , Shilong Xiong , Tianxing Ji , Qiang Zhou , Qian Zheng

International Journal of Medical Informatics, volume 195, Article 105726. Full text: https://www.sciencedirect.com/science/article/pii/S1386505624003897

Citations: 0

Abstract

Background

C-reactive protein (CRP) is an inflammatory biomarker frequently used in clinical practice. However, ordering that is not sufficiently evidence-based inevitably results in its overuse or underuse. This study aims to predict normal and abnormal CRP levels using deep neural network (DNN) models, helping clinicians order this test more appropriately and intelligently.

Methods

We used complete blood count (CBC) parameters as feature vectors and 10 mg/L as the cutoff value for CRP. Several models, including linear support vector classification, logistic regression, decision trees, random forests, and DNN, were developed on a dataset of 53,834 medical records to predict a binary output. We externally validated the DNN models on 20,723 independent samples through discrimination, calibration curve, and decision curve analysis.
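The labeling rule and the decision curve analysis mentioned above can be illustrated with a minimal sketch. It is not the authors' code: the function names `label_crp` and `net_benefit` are hypothetical, the sample values are made up, and treating a CRP value exactly at 10 mg/L as normal is an assumption, since the abstract only gives the cutoff. The net benefit formula is the standard one used in decision curve analysis.

```python
from typing import Sequence

CRP_CUTOFF_MG_L = 10.0  # cutoff stated in the abstract


def label_crp(crp_mg_l: float, cutoff: float = CRP_CUTOFF_MG_L) -> int:
    """Return 1 (abnormal) if CRP exceeds the cutoff, else 0 (normal).

    Assumption: a value exactly at the cutoff counts as normal.
    """
    return 1 if crp_mg_l > cutoff else 0


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float],
                threshold: float) -> float:
    """Net benefit at one threshold probability (decision curve analysis).

    net benefit = TP/n - FP/n * threshold / (1 - threshold)
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    # A case is flagged positive when its predicted probability reaches the threshold.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * (threshold / (1.0 - threshold))


# Made-up CRP results (mg/L) and made-up model probabilities, for illustration only.
crp_values = [2.5, 48.0, 10.0, 15.2, 0.8, 7.9]
probs = [0.10, 0.90, 0.40, 0.70, 0.05, 0.60]
labels = [label_crp(v) for v in crp_values]
print(labels, net_benefit(labels, probs, threshold=0.5))
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing the result with the "treat all" and "treat none" strategies.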

Results

DNN models had the best area under the receiver operating characteristic curve (AUC). Learning curves revealed that the models' AUC, balanced accuracy, and F1 score did not improve significantly or continuously as data volume increased. In internal validation, the AUC, balanced accuracy, and F1 score of the 10 models were 0.818 (95% CI: 0.812-0.824), 0.741 (95% CI: 0.736-0.747), and 0.649 (95% CI: 0.643-0.656), respectively. In external validation, these metrics were 0.817 (95% CI: 0.816-0.817), 0.741 (95% CI: 0.740-0.742), and 0.641 (95% CI: 0.640-0.642), respectively. AUC and balanced accuracy showed no significant difference between internal and external validation (P-values were 0.106 and 0.339). The CRP10-C2 model had the lowest Brier score (0.154), an AUC of 0.818, and a calibration curve of y=1.001x-0.010, and was identified as the target model to deploy in the app.
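For readers unfamiliar with the metrics reported above, the following is a minimal pure-Python sketch of their textbook definitions. The function names and the toy data are hypothetical, the 0.5 decision threshold is an assumption, and the code assumes both classes are present in the labels. It does not reproduce the study's evaluation pipeline.

```python
from typing import Sequence, Tuple


def confusion(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, tn, fp, fn) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of sensitivity and specificity."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    return (tp / (tp + fn) + tn / (tn + fp)) / 2.0


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    tp, _, fp, fn = confusion(y_true, y_pred)
    return 2 * tp / (2 * tp + fp + fn)


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the chance that a random positive outranks a random negative."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if pp > pn else 0.5 if pp == pn else 0.0
               for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))


# Toy data for illustration only.
y_true = [0, 1, 0, 1, 0, 0]
y_prob = [0.10, 0.90, 0.40, 0.70, 0.05, 0.60]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # assumed 0.5 threshold
print(auc(y_true, y_prob), balanced_accuracy(y_true, y_pred),
      f1_score(y_true, y_pred), brier_score(y_true, y_prob))
```

A lower Brier score indicates probabilities that sit closer to the observed outcomes, which is why it is used above to select among models with similar AUC.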

Conclusions

DNN models achieved moderate performance, surpassing baseline indices in distinguishing binary CRP levels. They generalize well and are well calibrated. The CRP-C2 model can improve CRP utilization by informing ordering decisions and can contribute to inflammatory diagnostics in primary health care settings where CBC is available but the CRP test is not.
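A calibration line such as the reported y=1.001x-0.010 can be read as observed event frequency (y) against predicted probability (x): a slope near 1 and an intercept near 0 indicate good calibration. The sketch below shows one common way to obtain such a line, assuming equal-width probability bins and an ordinary least-squares fit. The abstract does not state how the authors fitted theirs, so the binning scheme, function names, and data here are all assumptions.

```python
from typing import List, Sequence, Tuple


def calibration_points(y_true: Sequence[int], y_prob: Sequence[float],
                       n_bins: int = 10) -> List[Tuple[float, float]]:
    """Return (mean predicted probability, observed frequency) per non-empty bin."""
    bins = [[0.0, 0.0, 0] for _ in range(n_bins)]  # [sum of probs, sum of labels, count]
    for t, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls into the last bin
        bins[idx][0] += p
        bins[idx][1] += t
        bins[idx][2] += 1
    return [(sp / c, st / c) for sp, st, c in bins if c > 0]


def fit_line(points: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy data built to be perfectly calibrated: 1 of 4 events at p=0.25, 3 of 4 at p=0.75.
toy_prob = [0.25] * 4 + [0.75] * 4
toy_true = [1, 0, 0, 0, 1, 1, 1, 0]
slope, intercept = fit_line(calibration_points(toy_true, toy_prob, n_bins=2))
print(slope, intercept)
```

At least two non-empty bins with different mean predictions are needed for the fit; otherwise the slope is undefined.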


Source journal

International Journal of Medical Informatics (Medicine - Computer Science: Information Systems)

CiteScore

8.90

Self-citation rate

4.10%

Articles published

217

Review time

42 days

About the journal: International Journal of Medical Informatics provides an international medium for dissemination of original results and interpretative reviews concerning the field of medical informatics. The Journal emphasizes the evaluation of systems in healthcare settings. The scope of the journal covers: information systems, including national or international registration systems, hospital information systems, departmental and/or physician's office systems, document handling systems, electronic medical record systems, standardization, systems integration, etc.; computer-aided medical decision support systems using heuristic, algorithmic and/or statistical methods as exemplified in decision theory, protocol development, artificial intelligence, etc.; educational computer-based programs pertaining to medical informatics or medicine in general; and organizational, economic, social, clinical impact, ethical and cost-benefit aspects of IT applications in health care.