Carolina Gonzalez-Canas , Gustavo A. Valencia-Zapata , Ana Maria Estrada Gomez , Zachary Hass
{"title":"Assessing the impact on quality of prediction and inference from balancing in multilevel logistic regression","authors":"Carolina Gonzalez-Canas , Gustavo A. Valencia-Zapata , Ana Maria Estrada Gomez , Zachary Hass","doi":"10.1016/j.health.2024.100359","DOIUrl":"10.1016/j.health.2024.100359","url":null,"abstract":"<div><p>The primary goal of this research is to examine the impact of balancing data on the prediction quality and inference in multilevel logistic regression models. Logistic regression is a valuable approach for modeling binary outcomes expected in health applications. The class imbalance problem, where one of the two outcome categories occurs much more often than the other, is common in healthcare data, such as when modeling the risk factors for rare diseases. The issue is particularly relevant for medical data that contains individual measurements and other data sources measured at a geographic region level, such as environmental risk factors. For this work, both prediction and model interpretation are of interest. A simulation model is proposed to test the impact of balancing strategies on the logistic multilevel model's parameter estimation, inference, and predictive performance. The simulated information emulates characteristics of a Gestational Diabetes Mellitus (GDM) dataset from Indiana's Medicaid program. Several datasets were simulated with varying levels of complexity, involving the balance of the outcome variable and predictors. These datasets exhibited high- or low-frequency occurrences in specific intersections of variables, often called ‘cells.’ The impact of the balancing strategies on prediction and inference was assessed using different techniques, such as the Equivalence (TOST) Test, power analysis, and predictive measures. 
To the best of our knowledge, this is the first research that explores the impact of using balanced samples on coefficient estimation and prediction measures when using logistic multilevel modeling, finding evidence about the benefits of using balanced samples in this context.</p></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"6 ","pages":"Article 100359"},"PeriodicalIF":0.0,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772442524000613/pdfft?md5=61d70749e6aeada54ee254cabcd3c429&pid=1-s2.0-S2772442524000613-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142117349","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A comparative analysis of machine learning algorithms with tree-structured parzen estimator for liver disease prediction","authors":"Rakibul Islam, Azrin Sultana, MD. Nuruzzaman Tuhin","doi":"10.1016/j.health.2024.100358","DOIUrl":"10.1016/j.health.2024.100358","url":null,"abstract":"<div><p>The liver is one of the most essential organs in the body; it supports metabolism and helps keep the body healthy. Successful treatments and better patient outcomes depend on early and correct Liver Disease (LD) diagnosis and identification. This study proposes a system for predicting LD by combining Machine Learning (ML) algorithms, including the Decision Tree, Random Forest, Extra Tree Classifier (ETC), LightGBM, and AdaBoost, with the Tree-Structured Parzen Estimator (TPE) method for hyperparameter tuning. No previous study in the literature has used ML algorithms with TPE to predict LD. For this research, the Indian Liver Patients’ Dataset with 583 instances and 11 attributes was used. In the pre-processing of the data, techniques such as upsampling have been utilized to address the class imbalance problem. Normalization has been employed to scale the dataset, and feature selection has been applied to choose important features. The proposed model has been analyzed and compared using a 10-fold cross-validation process, with various evaluation metrics including accuracy, precision, recall, and F1-score. 
The model proposed in this study achieved the best level of accuracy while employing the ETC with the TPE approach, with a recorded accuracy of 95.8%.</p></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"6 ","pages":"Article 100358"},"PeriodicalIF":0.0,"publicationDate":"2024-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772442524000601/pdfft?md5=3aa72f3755c5377eba838fab77bd6aa3&pid=1-s2.0-S2772442524000601-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142006485","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Malmquist fuzzy data envelopment analysis model for performance evaluation of rural healthcare systems","authors":"Vishal Chaubey , Deena Sunil Sharanappa , Kshitish Kumar Mohanta , Rajkumar Verma","doi":"10.1016/j.health.2024.100357","DOIUrl":"10.1016/j.health.2024.100357","url":null,"abstract":"<div><p>The primary purpose of this article is to measure the relative efficiency and productivity change over time in rural healthcare systems in the presence of fuzzy data. First, a novel ranking function based on the lower and upper bounds of the alpha-cut of trapezoidal fuzzy numbers (TrFNs) is proposed to compare the TrFNs. The suggested ranking technique is used to construct the fuzzy data envelopment analysis (FDEA), Malmquist fuzzy DEA (Mal-FDEA), and undesirable Malmquist fuzzy DEA (UN-Mal-FDEA) models. The proposed models evaluate the efficiency and productivity of decision-making units (DMUs) when the input and output data are given in the form of TrFNs. In addition, a case study of the rural healthcare system in a developing country has been considered to demonstrate the applicability of the developed models. The work considers the number of sub-centers (SCs), the number of primary health centers (PHCs), the number of community health centers (CHCs), nursing staff at PHCs, auxiliary nurse midwives (ANMs) at SCs, doctors at PHCs, pharmacists at PHCs, laboratory technicians at PHCs, radiographers at CHCs, and specialists at CHCs as input parameters and the average population covered by CHCs, the average number of villages covered by CHCs, the number of patients, and infant mortality rates as output parameters to analyze the performance of the rural healthcare systems. We show that the UN-Mal-FDEA model has a higher production value than the Mal-FDEA model. 
The results of our proposed models enable us to recognize inefficiencies that states may rectify without compromising healthcare quality.</p></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"6 ","pages":"Article 100357"},"PeriodicalIF":0.0,"publicationDate":"2024-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772442524000595/pdfft?md5=c60ff4997d73b3069e87498e704b3717&pid=1-s2.0-S2772442524000595-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141964071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An optimal control model for monkeypox transmission dynamics with vaccination and immunity loss following recovery","authors":"O.A. Adepoju, H.O. Ibrahim","doi":"10.1016/j.health.2024.100355","DOIUrl":"10.1016/j.health.2024.100355","url":null,"abstract":"<div><p>The viral illness known as monkeypox causes symptoms such as a rash that can appear on the hands, feet, chest, face, and lips or near the genitalia. This study presents a mathematical model for the kinetics of monkeypox transmission with vaccination and immunity loss following recovery. The theories of positivity and boundedness are used to analyze the model’s well-posedness. The next generation matrix is used to determine the model’s basic reproduction number. The model’s equilibrium points are determined. We demonstrate that the disease-free equilibrium is locally asymptotically stable. The center manifold theory is used to establish the bifurcation analysis. The impact of the parameters related to the basic reproduction number <span><math><msub><mrow><mi>R</mi></mrow><mrow><mn>0</mn></mrow></msub></math></span> is investigated using the normalized forward sensitivity index. In addition, the model is expanded to incorporate time-dependent controls: preventing interaction with contaminated rodents, avoiding contact with contaminated people, wearing personal protective equipment, and reducing rodent populations through an integrated pest management strategy. 
The model’s qualitative analysis is supported by numerical simulation.</p></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"6 ","pages":"Article 100355"},"PeriodicalIF":0.0,"publicationDate":"2024-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772442524000571/pdfft?md5=c42d8831c6521c0f5e1b1f9045af04e9&pid=1-s2.0-S2772442524000571-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141630149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Najmeh Nabavizadeh , Vahid Kayvanfar , Majid Rafiee
{"title":"A mixed integer linear programming model for quarantine-based home healthcare scheduling under uncertainty","authors":"Najmeh Nabavizadeh , Vahid Kayvanfar , Majid Rafiee","doi":"10.1016/j.health.2024.100356","DOIUrl":"https://doi.org/10.1016/j.health.2024.100356","url":null,"abstract":"<div><p>Home healthcare companies (HHC) have emerged as vital alternatives to traditional hospitals, particularly in meeting the healthcare needs of individuals within the comfort of their homes. The COVID-19 pandemic has amplified the significance of HHC services, offering a crucial alternative for patients and the elderly to follow quarantine protocols while receiving essential healthcare at home. Consequently, HHC companies must align their planning strategies with the World Health Organization (WHO) health guidelines. This research introduces a Mixed Integer Linear Programming (MILP) model tailored for home healthcare services during COVID-19, aiming to ensure strict adherence to quarantine protocols while enhancing service efficiency and quality. The proposed vehicle routing problem with pickup/delivery and time window formulation incorporates critical elements such as patient and caregiver classification, work and break regulations adherence, workload balancing, and multi-depot capabilities. The model addresses uncertain demand and service times through a stochastic programming approach to enhance practicality. K-means clustering is applied to streamline scenarios, with a sensitivity analysis determining the optimal number of clusters. 
Additionally, measures intrinsic to stochastic programming, such as the Expected Value of Perfect Information (EVPI) and Value of Stochastic Solution (VSS), are computed for comprehensive analysis.</p></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"6 ","pages":"Article 100356"},"PeriodicalIF":0.0,"publicationDate":"2024-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772442524000583/pdfft?md5=6ba4d6452530c556eb5e1f481a1aa965&pid=1-s2.0-S2772442524000583-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141594444","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predictive interpretable analytics models for forecasting healthcare costs using open healthcare data","authors":"A. Ravishankar Rao , Raunak Jain , Mrityunjai Singh , Rahul Garg","doi":"10.1016/j.health.2024.100351","DOIUrl":"https://doi.org/10.1016/j.health.2024.100351","url":null,"abstract":"<div><p>Healthcare expenditure, a considerable proportion of national budgets, has risen rapidly. Consequently, considerable research is devoted to controlling healthcare costs. Many efforts are underway to improve medical price transparency. Price transparency will help patients become better informed, allowing them to shop for care they can afford, eventually leading to efficiency in healthcare markets. This first requires medical pricing data to be made available publicly. Since the raw pricing data can be large and cover multiple conditions, it is necessary to provide an engine to process the data to facilitate its usage and understanding. We recommend creating computational models that predict healthcare costs for various patient conditions and demographics. Patients and providers can interrogate the underlying data to understand the variation of healthcare costs concerning medical conditions and demographic variables of interest, including age. We demonstrate our approach by creating predictive models using recent machine learning techniques. We analyzed anonymous patient data from the New York State Statewide Planning and Research Cooperative System, consisting of 2.34 million records from 2019. We built models to predict costs from over two dozen patient variables, including diagnosis codes, severity of illness, age, and other demographic variables. We investigated three models: regression, decision trees, and random forests. These models are explainable. We analyzed features to determine those that were predictive of total costs. 
We determined that the diagnosis code, severity of illness, and length of stay were good predictors of total costs, whereas race and gender were not useful in predicting total costs. We obtained the best performance using a CatBoost regressor, which yielded an R<sup>2</sup> score of 0.85, better than the values reported in the literature.</p></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"6 ","pages":"Article 100351"},"PeriodicalIF":0.0,"publicationDate":"2024-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772442524000534/pdfft?md5=627ca7cad502b1be2f4f25cc21192d35&pid=1-s2.0-S2772442524000534-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141541941","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A comparative study of machine learning models with LASSO and SHAP feature selection for breast cancer prediction","authors":"Md. Shazzad Hossain Shaon, Tasmin Karim, Md. Shahriar Shakil, Md. Zahid Hasan","doi":"10.1016/j.health.2024.100353","DOIUrl":"https://doi.org/10.1016/j.health.2024.100353","url":null,"abstract":"<div><p>In recent decades, breast cancer has become the most prevalent type of cancer among women worldwide, posing a significant mortality risk. Early identification of breast cancer might drastically decrease patient mortality and greatly improve the chance of an effective treatment. In modern times, machine learning models have become crucial for classifying cancer and strengthening both the accuracy and efficiency of diagnostic and medical treatment strategies. Therefore, this study focuses on early detection of breast cancer using a variety of machine learning algorithms and seeks to identify the most effective feature selection process with an amalgamated dataset. Initially, we evaluated five traditional models and two meta-models on separate datasets. To find the most valuable features, the study used the Least Absolute Shrinkage and Selection Operator (LASSO) as well as SHapley Additive exPlanations (SHAP) selection methods and analyzed them through a wide range of performance metrics. Additionally, we applied these models to the combined dataset and observed that the merged dataset was significantly beneficial for breast cancer diagnosis. After analyzing the feature selection strategies, it was demonstrated that the majority of models performed more accurately when utilizing SHAP methodologies. Notably, three traditional models and two meta-classifiers obtained an accuracy of 99.82%, demonstrating superior performance compared to state-of-the-art methods. 
This advancement holds a crucial role as it lays the foundation for refining diagnostic tools and enhancing the progression of medical science in this field.</p></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"6 ","pages":"Article 100353"},"PeriodicalIF":0.0,"publicationDate":"2024-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772442524000558/pdfft?md5=86753ff6e5dca7c27f447a4a08fa5813&pid=1-s2.0-S2772442524000558-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141484808","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
João Flávio de Freitas Almeida , Lásara Fabrícia Rodrigues , Luiz Ricardo Pinto , Francisco Carlos Cardoso de Campos
{"title":"An integrated location–allocation model for reducing disparities and increasing accessibility to public health screening centers","authors":"João Flávio de Freitas Almeida , Lásara Fabrícia Rodrigues , Luiz Ricardo Pinto , Francisco Carlos Cardoso de Campos","doi":"10.1016/j.health.2024.100349","DOIUrl":"https://doi.org/10.1016/j.health.2024.100349","url":null,"abstract":"<div><p>The tests for tracking diseases in newborns available through the National Neonatal Screening Program of the Brazilian Unified Health Care System cover six diseases. Mass spectrometer equipment is needed to expand and more efficiently and effectively detect new diseases. However, only four neonatal screening centers have the equipment capable of carrying out the extended test, and the expansion of health service capacity should consider both the rationalization of costs and the comprehensiveness and accessibility of care to the population. This study uses analytics to analyze and estimate the cost of centralized or distributed logistics networks and the level of service to perform the expanded test for newborns throughout Brazil. 
We evaluate the accessibility of the current infrastructure for the neonatal screening program and propose a novel location–allocation model to create a more integrated infrastructure for reducing disparities and increasing accessibility to neonatal screening services.</p></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"6 ","pages":"Article 100349"},"PeriodicalIF":0.0,"publicationDate":"2024-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772442524000510/pdfft?md5=8d14260b36fde15e3bb57df49d356689&pid=1-s2.0-S2772442524000510-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141439169","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Md. Monirul Islam , Shahriar Hassan , Sharmin Akter , Ferdaus Anam Jibon , Md. Sahidullah
{"title":"A comprehensive review of predictive analytics models for mental illness using machine learning algorithms","authors":"Md. Monirul Islam , Shahriar Hassan , Sharmin Akter , Ferdaus Anam Jibon , Md. Sahidullah","doi":"10.1016/j.health.2024.100350","DOIUrl":"https://doi.org/10.1016/j.health.2024.100350","url":null,"abstract":"<div><p>Our emotional, psychological, and social well-being are all parts of our mental health, influencing our thoughts, emotions, and behaviors. Mental health also influences how we respond to stress, interact with others, and make good or bad decisions. There has been growing interest in the use of machine learning for the early detection of mental illness. This study reviews the machine learning models, algorithms, and applications for the early detection of mental disease, particularly emphasizing the data modalities. We further propose a comprehensive methodology for assessing mental health that synergistically combines social media monitoring, data analytics from wearable devices, verbal polls, and individualized support. We provide an overview of the field’s current state, highlight the potential benefits and challenges of using machine learning in mental health care, and present a new taxonomy of mental disorder issues based on five domains of data types. We review existing research on using machine learning to detect and treat mental illness and discuss the implications for future research. 
Finally, the value of this work lies in its potential to provide a fast and accurate method for predicting the mental health status of a person, which may assist in the diagnosis and treatment of mental illness.</p></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"6 ","pages":"Article 100350"},"PeriodicalIF":0.0,"publicationDate":"2024-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772442524000522/pdfft?md5=bc1cb3cc91aa0634c506d50a66bd2d34&pid=1-s2.0-S2772442524000522-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141435122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Qin Shao , Mounika Polavarapu , Lafleur Small , Shipra Singh , Quoc Nguyen , Kevin Shao
{"title":"A longitudinal mixed effects model for assessing mortality trends during vaccine rollout","authors":"Qin Shao , Mounika Polavarapu , Lafleur Small , Shipra Singh , Quoc Nguyen , Kevin Shao","doi":"10.1016/j.health.2024.100347","DOIUrl":"10.1016/j.health.2024.100347","url":null,"abstract":"<div><p>The rapid spread of coronavirus disease 2019 (COVID-19) initially presented unprecedented challenges for clinicians, policymakers, and healthcare systems, as there was limited evidence on the efficacy of various control measures. This study endeavors to provide a detailed and comprehensive overview of the global progression of COVID-19 mortality in the context of vaccine rollout, utilizing public surveillance data from 145 countries sourced from the World Health Organization and the World Bank. The primary focus is to analyze shifts in the trend of new COVID-19 mortality worldwide before and after the introduction of COVID-19 vaccines. To achieve this, we propose a longitudinal mixed effects model aimed at elucidating the relationship between the mortality trend and vaccination rollout, alongside other pertinent covariates. Our modeling approach seeks to accommodate variations in the timing of COVID-19 vaccine rollout among countries, as well as the correlation of observations from within the same country. Our findings highlight the significant impact of new cases, cardiovascular death rate, senior population, stringency index, and reproduction rate on mortality. However, we find that the impact of vaccination is not statistically significant, as evidenced by a relatively large <span><math><mi>p</mi></math></span>-value. 
Furthermore, the study reveals substantial disparities in mortality rates among countries across four income groups.</p></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"6 ","pages":"Article 100347"},"PeriodicalIF":0.0,"publicationDate":"2024-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772442524000492/pdfft?md5=ae79d48a8a53e7a4841d3c82370b0bf0&pid=1-s2.0-S2772442524000492-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141401659","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}