Text Mining on Hospital Stay Durations and Management of Sickle Cell Disease Patients
Mohammed Gollapalli, Latifa Alabdullatif, Farah Alsuwayeh, Moodhi Aljouali, Alhanoof Alhunief, Zaina Batook
2022 14th International Conference on Computational Intelligence and Communication Networks (CICN), published 2022-12-04
DOI: 10.1109/CICN56167.2022.10008265 (https://doi.org/10.1109/CICN56167.2022.10008265)
Citations: 2
Abstract
Sickle cell disease (SCD) is a genetic blood disorder characterized by clumping of red blood cells, which prevents blood and oxygen from reaching all parts of the body. SCD is very common in Sub-Saharan Africa, the Mediterranean basin, and the eastern regions of Saudi Arabia, owing to the high rate of consanguineous marriage. SCD patients are frequently admitted because repeated vascular occlusion often leads to multiple organ damage, and doctors and nurses record a large volume of medical notes during each clinical trial. In this study, 12 years of de-identified SCD patient data (2018–2020) were obtained officially from the hospital and used in experiments on SCD patient medical notes. We used a text mining framework to analyze and predict the length of stay (LoS) of SCD patients with three machine learning (ML) models: XGBoost, Decision Tree, and k-nearest neighbors (KNN). The most frequently occurring words were extracted from 62,847 SCD medical screening records using text mining. Furthermore, feature models were created to investigate how increasing or decreasing the number of terms affects model performance. XGBoost produced the best results, with 94.3% accuracy, compared with 93.5% for Decision Tree and 90.7% for KNN. The findings suggest that predicting the LoS of SCD patients is highly feasible, allowing for better utilization of medical personnel and resources.
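The general idea of building feature models from the most frequent terms and varying how many terms are kept can be illustrated with a minimal sketch. This is not the authors' pipeline: the toy notes, the "short"/"long" LoS classes, and every function name below are hypothetical, and a plain-Python KNN stands in for the three models named in the abstract.

```python
from collections import Counter
import math
import re

# Hypothetical toy notes and LoS classes, invented for illustration only;
# they are not drawn from the study's data.
NOTES = [
    ("severe pain crisis chest pain fever", "long"),
    ("acute chest syndrome fever transfusion", "long"),
    ("pain crisis transfusion oxygen", "long"),
    ("routine follow up mild pain", "short"),
    ("mild pain hydration discharged", "short"),
    ("routine check hydration stable", "short"),
]


def tokenize(text):
    """Lowercase a note and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def top_terms(notes, k):
    """Return the k most frequent terms across all notes."""
    counts = Counter(tok for text, _ in notes for tok in tokenize(text))
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def vectorize(text, vocab):
    """Map a note to a vector of term counts over the chosen vocabulary."""
    counts = Counter(tokenize(text))
    return [counts[term] for term in vocab]


def knn_predict(train, query, n_neighbors=3):
    """Majority vote among the nearest training vectors (Euclidean distance)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in nearest[:n_neighbors])
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(notes, k_terms, n_neighbors=3):
    """Hold out each note in turn, rebuilding the vocabulary without it."""
    correct = 0
    for i, (text, label) in enumerate(notes):
        rest = notes[:i] + notes[i + 1:]
        vocab = top_terms(rest, k_terms)
        train = [(vectorize(t, vocab), lab) for t, lab in rest]
        if knn_predict(train, vectorize(text, vocab), n_neighbors) == label:
            correct += 1
    return correct / len(notes)


if __name__ == "__main__":
    # Compare feature models of different sizes, as the abstract describes.
    for k in (3, 6, 12):
        print(k, leave_one_out_accuracy(NOTES, k))
```

Rebuilding the vocabulary inside each held-out fold keeps the held-out note from influencing term selection. With real data, a library vectorizer and the classifiers named in the abstract would replace these hand-written helpers.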