Diagnostic accuracy of pleural effusion biomarkers for malignant pleural mesothelioma: a machine learning analysis
Authors: Y. Niu, Zhi-De Hu
Journal: Journal of Laboratory and Precision Medicine
Publication date: 2021-01-01
DOI: 10.21037/JLPM-20-90 (https://doi.org/10.21037/JLPM-20-90)
Citations: 1
Abstract
Background: Several studies have investigated the diagnostic accuracy of pleural effusion (PE) soluble mesothelin-related peptide (SMRP), cytokeratin 19 fragment (CYFRA 21-1), and carcinoembryonic antigen (CEA) for malignant pleural mesothelioma (MPM). However, whether combining these biomarkers improves the diagnostic accuracy for MPM remains unclear.
Methods: In this post hoc analysis, 188 subjects, 27 of whom were diagnosed with MPM, were randomly assigned to a training cohort (n=90) and a test cohort (n=98). We evaluated the diagnostic accuracy of the combined use of PE CEA, SMRP, and CYFRA 21-1 with machine learning approaches, including logistic regression, linear discriminant analysis (LDA), multivariate adaptive regression splines (MARS), k-nearest neighbor (KNN), gradient boosting machine (GBM), and random forest. Sensitivity, specificity, and the area under the receiver operating characteristic (ROC) curve (AUC) were used to measure each index test’s diagnostic accuracy.
Results: The AUC of the logistic regression model (0.97) was significantly higher than that of CEA (0.75), SMRP (0.86), and CYFRA 21-1 (0.78). The AUCs of MARS, KNN, GBM, and random forest were comparable to that of a single biomarker.
Conclusions: The logistic regression model is a useful machine learning approach for improving the diagnostic accuracy of CEA, SMRP, and CYFRA 21-1. In contrast, the other machine learning strategies (MARS, KNN, GBM, and random forest) do not improve these biomarkers’ diagnostic accuracy.
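
The evaluation design described in the Methods (fit a combined model on a training cohort, then compare its test-cohort AUC with that of each single marker) can be illustrated with a minimal sketch. This is not the authors' code or data. The marker values are synthetic Gaussian draws that do not model real biomarker distributions, the logistic regression is a plain gradient-descent fit, and the function names (`auc`, `sens_spec`, `fit_logistic`, `run_demo`) are hypothetical. Only the cohort sizes (188 subjects, 27 with MPM, 90/98 split) are taken from the abstract. The sketch makes no attempt to reproduce the reported AUCs.

```python
import math
import random

MARKERS = ["CEA", "SMRP", "CYFRA 21-1"]  # used as labels only; values below are synthetic


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sens_spec(scores, labels, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Batch gradient descent on the mean log-loss; returns (weights, bias)."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b):
    """Predicted probability of the positive class for each row of X."""
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]


def run_demo(seed=0):
    rng = random.Random(seed)
    # Assumption: 188 synthetic subjects, 27 positives; only the sizes come from the abstract.
    labels = [1] * 27 + [0] * 161
    rows = [([rng.gauss(0.8 * y, 1.0) for _ in MARKERS], y) for y in labels]
    rng.shuffle(rows)
    train, test = rows[:90], rows[90:]  # 90 training subjects, 98 test subjects
    w, b = fit_logistic([x for x, _ in train], [y for _, y in train])
    X_test = [x for x, _ in test]
    y_test = [y for _, y in test]
    # Test-cohort AUC of each single marker, then of the combined model's predicted probability.
    results = {name: auc([x[j] for x in X_test], y_test) for j, name in enumerate(MARKERS)}
    results["combined"] = auc(predict(X_test, w, b), y_test)
    return results


if __name__ == "__main__":
    for name, value in run_demo().items():
        print(f"{name}: AUC = {value:.2f}")
```

The pairwise definition of AUC used here is equivalent to the area under the ROC curve and keeps the sketch dependency-free. In practice a statistics library would be the more likely choice, and real biomarker concentrations would probably need scaling or transformation before fitting.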