Diagnostic accuracy of pleural effusion biomarkers for malignant pleural mesothelioma: a machine learning analysis

IF 1.4

Journal of Laboratory and Precision Medicine. Pub Date: 2021-01-01. DOI: 10.21037/JLPM-20-90

Y. Niu, Zhi-De Hu

Citations: 1

Abstract

Background: Several studies have investigated the diagnostic accuracy of pleural effusion (PE) soluble mesothelin-related peptide (SMRP), cytokeratin 19 fragment (CYFRA 21-1), and carcinoembryonic antigen (CEA) for malignant pleural mesothelioma (MPM). However, whether combining these markers improves diagnostic accuracy for MPM remains unclear.

Methods: In this post hoc analysis, 188 subjects, 27 of whom were diagnosed with MPM, were randomly assigned to a training cohort (n=90) and a test cohort (n=98). We evaluated the diagnostic accuracy of the combined use of PE CEA, SMRP, and CYFRA 21-1 with machine learning approaches: logistic regression, linear discriminant analysis (LDA), multivariate adaptive regression splines (MARS), k-nearest neighbor (KNN), gradient boosting machine (GBM), and random forest. Sensitivity, specificity, and the area under the receiver operating characteristic (ROC) curve (AUC) were used to measure the diagnostic accuracy of each index test.

Results: The AUC of the logistic regression model (0.97) was significantly higher than that of CEA (0.75), SMRP (0.86), and CYFRA 21-1 (0.78). The AUCs of MARS, KNN, GBM, and random forest were comparable to that of a single biomarker.

Conclusions: Logistic regression is a useful machine learning approach for improving the diagnostic accuracy of CEA, SMRP, and CYFRA 21-1, whereas the other machine learning algorithms (MARS, KNN, GBM, and random forest) did not improve these biomarkers' diagnostic accuracy.
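The abstract's three accuracy measures (sensitivity, specificity, and AUC) can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy scores, and the 0.5 threshold are hypothetical and chosen only for illustration. The score passed in could be a single biomarker concentration or a combined score such as a logistic regression predicted probability.

```python
from typing import Sequence, Tuple


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via the Mann-Whitney U statistic.

    Equals the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Call a case positive when its score is >= threshold."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data, NOT from the study: 1 = MPM, 0 = non-MPM.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

toy_auc = auc(toy_scores, toy_labels)
toy_sens, toy_spec = sensitivity_specificity(toy_scores, toy_labels, threshold=0.5)
print(f"AUC={toy_auc:.3f} sensitivity={toy_sens:.3f} specificity={toy_spec:.3f}")
```

The pairwise definition of AUC is used here because it needs no external library and makes the interpretation explicit; for cohorts of realistic size, a rank-based implementation from a statistics package would be the more usual choice.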

Source journal

Journal of laboratory and precision medicine

CiteScore

1.70

Self-citation rate

0.00%

Number of publications