Yuan Zhou, Thomas K Swoboda, Zehao Ye, Michael Barbaro, Jake Blalock, Danny Zheng, Hao Wang
Journal of Clinical Medicine Research, vol. 15, no. 3, pp. 133-138. Published 2023-03-01. doi: 10.14740/jocmr4862 (https://doi.org/10.14740/jocmr4862). PDF: https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/bf/85/jocmr-15-133.PMC10079369.pdf
Using Machine Learning Algorithms to Predict Patient Portal Use Among Emergency Department Patients With Diabetes Mellitus.
Background: Machine learning (ML) techniques have been applied to a wide range of problems in healthcare systems. We aimed to determine the feasibility and accuracy of predicting patient portal use among diabetic patients using six different ML algorithms. We also compared model performance when only essential variables were used.
Methods: This was a single-center retrospective observational study. We included all diabetic patients seen at the study emergency department (ED) from March 1, 2019 to February 28, 2020. The primary outcome was patient portal use status. A total of 18 variables, covering patient sociodemographic characteristics, ED and clinic information, and patient medical conditions, were used to predict patient portal use. Six ML algorithms (logistic regression, random forest (RF), deep forest, decision tree, multilayer perceptron, and support vector machine) were used for these predictions. In the initial step, predictions were made with all variables. The essential variables were then chosen via feature selection, and the predictions were repeated with only those variables. The performance measures (overall accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC)) of the patient portal predictions were compared.
Results: A total of 77,977 unique patients were included in our final analysis. Of these, 18,223 (23.4%) had diabetes mellitus (DM). Patient portal use was found in 26.9% of DM patients. Overall, the accuracy of predicting patient portal use was above 80% for five of the six ML algorithms. RF outperformed the others when all variables were used (accuracy 0.9876, sensitivity 0.9454, specificity 0.9969, and AUC 0.9712). When only eight essential variables were used, RF still outperformed the others (accuracy 0.9876, sensitivity 0.9374, specificity 0.9932, and AUC 0.9769).
Conclusion: Patient portal use can be predicted with fair accuracy using different ML algorithms. Because prediction accuracy was similar with fewer variables, feature selection can improve the interpretability of the model by focusing on the most relevant features.
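For readers less familiar with the four performance measures compared in the Methods, the following is a minimal sketch of how they can be computed for a binary portal-use label (1 = portal user, 0 = non-user). It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and none of the numbers are study data.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + fp + tn + fn)


def sensitivity(tp: int, fn: int) -> float:
    """Share of true positives (portal users) that the model identifies."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of true negatives (non-users) that the model identifies."""
    return tn / (tn + fp)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive receives a
    higher score than a randomly chosen negative (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 portal users, 5 non-users, with model scores.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed 0.5 threshold

tp, fp, tn, fn = confusion_counts(toy_labels, toy_preds)
toy_metrics = {
    "accuracy": accuracy(tp, fp, tn, fn),
    "sensitivity": sensitivity(tp, fn),
    "specificity": specificity(tn, fp),
    "auc": auc(toy_labels, toy_scores),
}
```

Reporting sensitivity and specificity alongside accuracy matters for an outcome like this one: with portal use present in 26.9% of DM patients, a model that always predicts "non-user" would already be about 73% accurate while having zero sensitivity.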