Performance and efficiency of machine learning models in analyzing capillary serum protein electrophoresis
Xia Wang, Mei Zhang, Chuan Li, Chengyao Jia, Xijie Yu, He He
Clinica Chimica Acta, article 120165. Published 2025-01-26. DOI: https://doi.org/10.1016/j.cca.2025.120165
Abstract
Background and objective: Serum protein electrophoresis (SPEP) plays a critical role in diagnosing diseases associated with M-proteins. However, its clinical application is limited by a heavy reliance on experienced experts.
Methods: A dataset of 85,026 SPEP results was used to develop artificial intelligence diagnostic models for the classification and localization of M-proteins. The models were trained and validated using three data features, and their performance was evaluated using sensitivity, positive predictive value (PPV), specificity, negative predictive value (NPV), F1 score, accuracy, area under the receiver operating characteristic curve (AUC), Matthews correlation coefficient (MCC), and Intersection over Union (IoU). The best-performing machine learning (ML) and deep learning (DL) models were further tested on a separate dataset of 1,079 samples. The localization ability of the DL model was compared against that of three clinical experts.
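As an illustration of the evaluation metrics named above, the following is a minimal sketch of how they can be computed from confusion-matrix counts, using the standard textbook definitions. The function names `binary_metrics` and `interval_iou`, the example counts, and the treatment of a localized M-protein peak as a one-dimensional interval are assumptions made for illustration; the abstract does not state how the authors computed these values, and AUC is omitted because it requires per-sample scores rather than counts.

```python
from math import sqrt


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Standard binary-classification metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (both classes present and
    at least one positive and one negative prediction).
    """
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    ppv = tp / (tp + fp)                  # positive predictive value (precision)
    npv = tn / (tn + fn)                  # negative predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "accuracy": accuracy,
        "f1": f1,
        "mcc": mcc,
    }


def interval_iou(pred: tuple[float, float], truth: tuple[float, float]) -> float:
    """Intersection over Union of two 1-D intervals given as (start, end).

    Hypothetical simplification: a localized peak is modelled as a single
    interval along the electrophoresis axis.
    """
    intersection = max(0.0, min(pred[1], truth[1]) - max(pred[0], truth[0]))
    union = (pred[1] - pred[0]) + (truth[1] - truth[0]) - intersection
    return intersection / union if union > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical counts, not taken from the study.
    for name, value in binary_metrics(tp=90, fp=10, tn=880, fn=20).items():
        print(f"{name}: {value:.3f}")
    print(f"iou: {interval_iou((10.0, 20.0), (15.0, 25.0)):.3f}")
```

MCC is useful alongside accuracy here because it stays informative when one class is much rarer than the other, which is a common situation in screening data.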
Results: Among the four ML models, the extreme gradient boosting (XGB) model achieved the best performance, with MCC, AUC, F1 score, sensitivity, specificity, accuracy, PPV, and NPV of 0.847, 0.903, 0.875, 0.822, 0.985, 0.951, 0.934, and 0.955, respectively. Different feature extraction methods significantly influenced model performance. The DL models outperformed the ML models in overall performance. The combined U-Net and Transformer model demonstrated localization ability comparable to that of clinical experts, achieving sensitivity, specificity, accuracy, PPV, NPV, F1 score, AUC, MCC, and IoU of 0.947, 0.984, 0.976, 0.938, 0.986, 0.942, 0.966, 0.927, and 0.877, respectively.
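The reported figures can be checked for internal consistency, since the F1 score is the harmonic mean of PPV and sensitivity. The sketch below recomputes F1 from the rounded sensitivity and PPV values quoted above. It assumes the reported F1 uses this standard definition and was computed on the same data as the other two metrics; because the inputs are rounded to three decimals, a difference of about 0.001 is expected. The helper name `f1_from` is hypothetical.

```python
def f1_from(ppv: float, sensitivity: float) -> float:
    """F1 score as the harmonic mean of PPV (precision) and sensitivity (recall)."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Values as quoted in the Results paragraph (rounded to three decimals).
reported = {
    "XGB": {"sensitivity": 0.822, "ppv": 0.934, "f1": 0.875},
    "U-Net + Transformer": {"sensitivity": 0.947, "ppv": 0.938, "f1": 0.942},
}

# Recomputed F1 per model, keyed by model name.
recomputed = {
    name: f1_from(m["ppv"], m["sensitivity"]) for name, m in reported.items()
}

if __name__ == "__main__":
    for name, m in reported.items():
        diff = abs(recomputed[name] - m["f1"])
        print(f"{name}: reported F1 = {m['f1']:.3f}, "
              f"recomputed = {recomputed[name]:.4f}, difference = {diff:.4f}")
```

Both recomputed values agree with the reported F1 scores to within rounding of the inputs.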
Conclusion: The combined U-Net and Transformer model demonstrated expert-level performance in M-protein classification and localization, achieving an accuracy of 0.976 and an IoU of 0.877. This performance highlights the potential of the combined model for automating clinical SPEP workflows.
About the journal:
The Official Journal of the International Federation of Clinical Chemistry and Laboratory Medicine (IFCC)
Clinica Chimica Acta is a high-quality journal which publishes original Research Communications in the field of clinical chemistry and laboratory medicine, defined as the diagnostic application of chemistry, biochemistry, immunochemistry, biochemical aspects of hematology, toxicology, and molecular biology to the study of human disease in body fluids and cells.
The objective of the journal is to publish novel information leading to a better understanding of biological mechanisms of human diseases, their prevention, diagnosis, and patient management. Reports of an applied clinical character are also welcome. Papers concerned with normal metabolic processes or with constituents of normal cells or body fluids, such as reports of experimental or clinical studies in animals, are only considered when they are clearly and directly relevant to human disease. Evaluations of commercial products have a low priority for publication, unless they are novel or represent a technological breakthrough. Studies dealing with effects of drugs and natural products and studies dealing with the redox status in various diseases are not within the journal's scope. Development and evaluation of novel analytical methodologies where applicable to diagnostic clinical chemistry and laboratory medicine, including point-of-care testing, and topics on laboratory management and informatics will also be considered. Studies focused on emerging diagnostic technologies and (big) data analysis procedures including digitalization, mobile health, and artificial intelligence applied to laboratory medicine are also of interest.