{"title":"An exploration of machine learning approaches for early Autism Spectrum Disorder detection","authors":"Nawshin Haque, Tania Islam, Md Erfan","doi":"10.1016/j.health.2024.100379","DOIUrl":"10.1016/j.health.2024.100379","url":null,"abstract":"<div><div>Autism Spectrum Disorder is a neurodevelopmental condition impacting an individual’s repetitive behaviours, social skills, verbal and nonverbal communication abilities, and capacity for acquiring new knowledge. Manifesting typically in early childhood, specifically between 6 months and 5 years, the symptoms of autism exhibit a progressive nature over time. This study explores the application of Logistic Regression, Support Vector Classifier, K-Nearest Neighbour, Decision Tree, and Random Forest for predicting Autism in children and toddlers by leveraging advancements in machine learning. The efficacy of these techniques is evaluated using publicly accessible datasets specific to both age groups. The findings indicate remarkable performance, with the toddler dataset achieving a mean Intersection over Union (mIoU) of 100% for Support Vector Classifier and 99.80% for Logistic Regression. Similarly, the children dataset demonstrates outstanding results, achieving an mIoU of 100% for Support Vector Classifier and 99.96% for Logistic Regression. Furthermore, all algorithms achieved 100% accuracy on the children (age 4–11) dataset collected from real-world sources. Logistic Regression, Random Forest, Support Vector Classifier, and Decision Tree attained 100% accuracy and mIoU with the real-world dataset.
These results underscore the potential of machine learning in aiding the early detection of ASD in children and toddlers, offering promising avenues for future research and clinical applications.</div></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"7 ","pages":"Article 100379"},"PeriodicalIF":0.0,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143169862","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A large-scale risk assessment and classification model for pneumococcus using Finnish national health data","authors":"Viljami Männikkö, Juha Turunen, Heidi Åhman, Esa Harju","doi":"10.1016/j.health.2025.100382","DOIUrl":"10.1016/j.health.2025.100382","url":null,"abstract":"<div><div><em>Streptococcus pneumoniae</em>, or pneumococcus, poses a significant health risk, particularly to infants, the elderly, and individuals with underlying medical conditions. In Finland, pneumococcal vaccination is part of the national immunization program, which covers young children and only selected at-risk adult populations. This study aims to leverage the Finnish national electronic health record system, Kanta, to analyze treatment histories and identify individuals at increased risk for disease to improve vaccination strategies. Kanta provides a comprehensive, nationwide database of patient treatment histories, which can be utilized to track individual risk factors and disease episodes. We analyzed health data from 96,200 Finnish residents with risk factors for pneumococcal disease following guidelines from the Finnish Institute for Health and Welfare and the World Health Organization. By categorizing individuals based on their identified risk factors, we prioritize vaccination for those at the greatest risk. This study demonstrates the potential for using national health record data to conduct large-scale risk analyses, allowing for more targeted and efficient vaccination strategies.
The novelty of our approach lies in the automatic identification of high-risk individuals, which can inform public health initiatives and enhance the monitoring of pneumococcal disease risk at a population level.</div></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"7 ","pages":"Article 100382"},"PeriodicalIF":0.0,"publicationDate":"2025-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143171083","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A comparative assessment of machine learning models and algorithms for osteosarcoma cancer detection and classification","authors":"Amoakoh Gyasi-Agyei","doi":"10.1016/j.health.2024.100380","DOIUrl":"10.1016/j.health.2024.100380","url":null,"abstract":"<div><div>Osteosarcoma is a bone-forming tumor that is more common in children and young adults than in older adults. Timely detection and classification of its type are crucial to its proper treatment and possible survival. Machine learning (ML) models trained on disease datasets are more effective in detection and classification than the conventional methods with hand-crafted features highly dependent on pathologists’ expertise. A publicly available raw osteosarcoma dataset was explored and then preprocessed using different combinations of data denoising techniques (including principal component analysis, mutual information gain, analysis of variance and Kendall’s rank correlation analysis) and data augmentation to <em>derive</em> seven different datasets. Using the seven derived datasets and eight ML algorithms, this study designed and performed an extensive comparative analysis of seven sets of ML models (altogether over 160 models) with their hyperparameters optimized using grid search. The performance differences between the learned ML models were then validated using repeated stratified 10-fold cross-validation and 5x2 cross-validation paired <em>t</em>-tests to select the best model for our task. The empirical model based on the extra trees algorithm, fitted to a dataset that was class-balanced via random oversampling and had multicollinearity removed via principal component analysis, proved to be the best, as it detected and classified osteosarcoma cancer in 10 ms with 97.8% area under the receiver operating characteristic curve and acceptably low false-alarm and misdetection rates.
Thus, the proposed models can be cutting-edge techniques for automated detection and classification of osteosarcoma tumors to aid timely diagnosis, prognosis, and treatment.</div></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"7 ","pages":"Article 100380"},"PeriodicalIF":0.0,"publicationDate":"2025-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143169863","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An efficient blood supply chain network model with multiple echelons for managing outdated products","authors":"Agus Mansur , Ivan Darma Wangsa , Novrianty Rizky , Iwan Vanany","doi":"10.1016/j.health.2024.100377","DOIUrl":"10.1016/j.health.2024.100377","url":null,"abstract":"<div><div>This study examines the lack of coordination between blood production and inventories in the blood supply chain networks. Prior studies neglect to optimize operational costs through blood production, inventory, and waste. We propose a mixed-integer linear programming approach addressing multiple echelons, types of blood, and blood bag shelf lifetime. The model is developed by determining the facility locations, assigning regional blood banks, and allocating the right products. Indonesia's blood supply chain is used as a case study to evaluate the applicability of the proposed model using optimization software. A sensitivity analysis is performed on production rate and patient demand to assess how these factors affect the overall cost of expired products. The results show that the proposed method's total cost and expired products are 4.69%–5.60% and 4.71%–5.75%, respectively.</div></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"7 ","pages":"Article 100377"},"PeriodicalIF":0.0,"publicationDate":"2024-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143169864","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An enhanced machine learning approach with stacking ensemble learner for accurate liver cancer diagnosis using feature selection and gene expression data","authors":"Amena Mahmoud, Eiko Takaoka","doi":"10.1016/j.health.2024.100373","DOIUrl":"10.1016/j.health.2024.100373","url":null,"abstract":"<div><div>Liver cancer is a significant global health concern, necessitating accurate and timely diagnosis for effective treatment. In recent years, machine learning approaches have emerged as promising tools for improving liver cancer classification using gene expression data. This study presents an advanced machine learning approach for liver cancer diagnosis using gene expression data, combining feature selection techniques with a stacking ensemble learning model. Our method addresses the challenges of high dimensionality and complex patterns in genomic data to improve diagnostic accuracy and interpretability. We employed a feature selection process to identify the most relevant gene expressions associated with liver cancer. This approach reduced the dimensionality of the data while preserving crucial biological information. The selected features were then used to train a stacking ensemble model, which combined multiple base learners, including Multi-Layer Perceptron (MLP), Random Forest (RF), K-Nearest Neighbor (KNN), and Support Vector Machine (SVM) models, with an Extreme Gradient Boosting (XGBoost) meta-learner to make final predictions. The stacking ensemble achieved an accuracy of 97%, outperforming individual machine learning algorithms and traditional diagnostic methods.
Furthermore, the model demonstrated high sensitivity (96.8%) and specificity (98.1%), crucial for early detection and minimizing false positives.</div></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"7 ","pages":"Article 100373"},"PeriodicalIF":0.0,"publicationDate":"2024-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143171470","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An integrated stacked convolutional neural network and the levy flight-based grasshopper optimization algorithm for predicting heart disease","authors":"Syed Muhammad Salman Bukhari, Muhammad Hamza Zafar, Syed Kumayl Raza Moosavi, Majad Mansoor, Filippo Sanfilippo","doi":"10.1016/j.health.2024.100374","DOIUrl":"10.1016/j.health.2024.100374","url":null,"abstract":"<div><div>Cardiovascular disease, which includes critical conditions such as blood vessel blockage, heart failure, and stroke, is the leading cause of death worldwide. Accurate and early prediction of heart disease remains a significant challenge due to the complexity of symptoms and the variability of contributing factors. This study proposes a novel hybrid model integrating a Stacked Convolutional Neural Network (SCNN) with the Levy Flight-based Grasshopper Optimization Algorithm (LFGOA) to address this challenge. The SCNN provides robust feature extraction, while LFGOA enhances the model by optimizing hyperparameters, improving classification accuracy, and reducing overfitting. The proposed approach is evaluated using four publicly available heart disease datasets, each representing diverse clinical and demographic features. The SCNN-LFGOA consistently outperforms traditional classifiers, including Regression Trees, Support Vector Machine, Logistic Regression, K-Nearest Neighbors, and standard Neural Networks. The results show that the SCNN-LFGOA achieves an average accuracy of 99%, with significant improvements in specificity, sensitivity, and F1-Score, showcasing its adaptability and robustness across datasets. This study highlights the SCNN-LFGOA's potential as a transformative tool for early and accurate heart disease prediction, contributing to improved patient outcomes and more efficient healthcare resource utilization.
By combining deep learning with an advanced optimization technique, this research introduces a scalable and effective solution to a critical healthcare problem.</div></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"7 ","pages":"Article 100374"},"PeriodicalIF":0.0,"publicationDate":"2024-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143171471","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimized early fusion of handcrafted and deep learning descriptors for voice pathology detection and classification","authors":"Roohum Jegan, R. Jayagowri","doi":"10.1016/j.health.2024.100369","DOIUrl":"10.1016/j.health.2024.100369","url":null,"abstract":"<div><div>This study presents an automated noninvasive voice disorder detection and classification approach using an optimized fusion of modified glottal source estimation and deep transfer learning neural network descriptors. A new set of modified descriptors based on a glottal source estimator and pre-trained Inception-ResNet-v2 convolutional neural network-based features are proposed for the speech disorder detection and classification task. The modified feature set is obtained using mel-cepstral coefficients, harmonic model, phase discrimination means, distortion deviation descriptors, conventional wavelet, and glottal source estimation features. Early descriptor-level fusion is employed in this study for performance enhancement; however, the fusion results in higher feature vector dimensionality. A nature-inspired slime mould algorithm is utilized to remove redundant features and select the most discriminative ones. Finally, the classification is performed using the K-nearest neighbor (KNN) classifier. The proposed algorithm was evaluated using extensive experiments with different feature combinations, with and without feature selection, and with two popular datasets: the Arabic Voice Pathology Database (AVPD) and the Saarbrucken Voice Database (SVD). We show that the proposed optimized fusion method attained an enhanced voice pathology detection accuracy of 98.46%, encompassing a wide spectrum of voice disorders on the SVD database.
Furthermore, compared to traditional handcrafted and deep neural network-based techniques, the proposed method demonstrates competitive performance with fewer features.</div></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"6 ","pages":"Article 100369"},"PeriodicalIF":0.0,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142742998","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"e-Health and artificial intelligence: Emerging trends, models, and applications","authors":"Yu-Chen Hu, Pelin Angin, Haiming Liu, Debnath Bhattacharyya","doi":"10.1016/j.health.2024.100354","DOIUrl":"10.1016/j.health.2024.100354","url":null,"abstract":"","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"6 ","pages":"Article 100354"},"PeriodicalIF":0.0,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143129681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An open-source application for obtaining retrospective and prospective insights into overall hospital quality star ratings","authors":"Kenneth J. Locey, Brian D. Stein, Ryan Schipfer, Brittnie Dotson, Leslie Klemp","doi":"10.1016/j.health.2024.100371","DOIUrl":"10.1016/j.health.2024.100371","url":null,"abstract":"<div><div>Overall Hospital Quality Star Ratings (overall star ratings) are designed to assist healthcare consumers by summarizing dozens of hospital quality measures. These ratings are also used by hospitals to direct quality improvements and are often used in healthcare research. However, no analytical tools have been developed to provide insights into the data, measures, and scores of the overall star rating system. To this end, we developed a novel open-source application to provide retrospective insights, prospective estimates, and research-ready data. Users can 1) examine changes in hospital performance from 2021 onward, 2) recalculate overall star ratings based on hypothetical improvements, 3) download data for all hospitals included in the overall star rating system since 2021, and 4) obtain prospective estimates based on the overall star rating methodology and its data source (Care Compare). We demonstrate 99.6% accuracy when estimating overall star ratings six months prior to public release. Estimates of whether hospitals will retain their star rating are up to 90% accurate a year before public release. 
We discuss the use of our application in healthcare research and the potential for similar tools to be developed for other hospital rating and ranking systems.</div></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"6 ","pages":"Article 100371"},"PeriodicalIF":0.0,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143129683","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A metafrontier and Malmquist productivity index approach for analyzing biased technological and efficiency change in Taiwanese traditional Chinese medicine","authors":"Kuan-Chen Chen, Hsiang-An Yu, Ming-Miin Yu","doi":"10.1016/j.health.2024.100372","DOIUrl":"10.1016/j.health.2024.100372","url":null,"abstract":"<div><div>This study assesses changes in resource productivity in the traditional Chinese medicine (TCM) system across Taiwanese counties and cities from 2016 to 2019, stratifying the analysis by population density. Employing a data envelopment analysis (DEA) metafrontier Malmquist productivity index model, this research relaxes Hicks’ neutrality assumption of technical change, allowing for the measurement of biased technological change and technical gap ratio changes. The empirical findings reveal a decline in TCM system productivity, primarily attributed to reduced technological advancements. Notably, higher productivity changes were observed in counties and cities with lower population densities, whereas productivity changes were limited in those with higher population densities. The results suggest that areas with lower population densities hold significant potential for technological enhancement, as evidenced by intergroup technology updates and technological leadership indices. Furthermore, the estimates of productivity change and technological bias underscore the inadequacy of assuming Hicks’ neutral technological change for analyzing TCM system productivity in Taiwan.
These findings highlight the need for improved TCM system technology and innovation within the healthcare system to address the urban-rural gap effectively.</div></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"6 ","pages":"Article 100372"},"PeriodicalIF":0.0,"publicationDate":"2024-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143129684","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}