Rownak Ara Rasul , Promy Saha , Diponkor Bala , S.M. Rakib Ul Karim , Md. Ibrahim Abdullah , Bishwajit Saha
{"title":"An evaluation of machine learning approaches for early diagnosis of autism spectrum disorder","authors":"Rownak Ara Rasul , Promy Saha , Diponkor Bala , S.M. Rakib Ul Karim , Md. Ibrahim Abdullah , Bishwajit Saha","doi":"10.1016/j.health.2023.100293","DOIUrl":"https://doi.org/10.1016/j.health.2023.100293","url":null,"abstract":"<div><p>Autism Spectrum Disorder (ASD) is a neurological disease characterized by difficulties with social interaction, communication, and repetitive activities. While its primary origin lies in genetics, early detection is crucial, and leveraging machine learning offers a promising avenue for a faster and more cost-effective diagnosis. This study employs diverse machine learning methods to identify crucial ASD traits, aiming to enhance and automate the diagnostic process. We study eight state-of-the-art classification models to determine their effectiveness in ASD detection. We evaluate the models using accuracy, precision, recall, specificity, F1-score, area under the curve (AUC), kappa, and log loss metrics to find the best classifier for these binary datasets. For the children dataset, the SVM and LR models achieve the highest accuracy of 100%, and for the adult dataset, the LR model achieves the highest accuracy of 97.14%. Our proposed ANN model provides the highest accuracy of 94.24% for the new combined dataset when hyperparameters are precisely tuned for each model. Because almost all classification models, which rely on true labels, achieve high accuracy, we also examine five popular clustering algorithms to understand model behavior in scenarios without true labels. We calculate Normalized Mutual Information (NMI), Adjusted Rand Index (ARI), and Silhouette Coefficient (SC) metrics to select the best clustering models. 
Our evaluation finds that spectral clustering outperforms all other benchmark clustering models in terms of the NMI and ARI metrics while achieving an SC comparable to the optimal value obtained by k-means. The implemented code is available at <span>GitHub</span>.</p></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"5 ","pages":"Article 100293"},"PeriodicalIF":0.0,"publicationDate":"2024-01-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772442523001600/pdfft?md5=e0fd6cd67baa47c33181f21a1d4a70e4&pid=1-s2.0-S2772442523001600-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139434016","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mohammad Mihrab Chowdhury , Ragib Shahariar Ayon , Md Sakhawat Hossain
{"title":"An investigation of machine learning algorithms and data augmentation techniques for diabetes diagnosis using class imbalanced BRFSS dataset","authors":"Mohammad Mihrab Chowdhury , Ragib Shahariar Ayon , Md Sakhawat Hossain","doi":"10.1016/j.health.2023.100297","DOIUrl":"https://doi.org/10.1016/j.health.2023.100297","url":null,"abstract":"<div><p>Diabetes is a prevalent chronic condition that poses significant challenges for early diagnosis and the identification of at-risk individuals. Machine learning plays a crucial role in diabetes detection by leveraging its ability to process large volumes of data and identify complex patterns. However, imbalanced data, where the number of diabetic cases is substantially smaller than the number of non-diabetic cases, complicates the identification of individuals with diabetes using machine learning algorithms. This study focuses on predicting whether a person is at risk of diabetes, considering the individual’s health and socio-economic conditions while mitigating the challenges posed by imbalanced data. We employ several data augmentation techniques, such as oversampling (Synthetic Minority Over Sampling for Nominal Data, i.e., SMOTE-N), undersampling (Edited Nearest Neighbor, i.e., ENN), and hybrid sampling techniques (SMOTE-Tomek and SMOTE-ENN) on training data before applying machine learning algorithms to minimize the impact of imbalanced data. Our study sheds light on the significance of carefully utilizing data augmentation techniques without any data leakage to enhance the effectiveness of machine learning algorithms. 
Moreover, it offers a complete machine learning framework for healthcare practitioners, from data acquisition to machine learning prediction, enabling them to make informed decisions.</p></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"5 ","pages":"Article 100297"},"PeriodicalIF":0.0,"publicationDate":"2023-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772442523001648/pdfft?md5=cbb15d1b9b72127ef6f0b213ad40bae0&pid=1-s2.0-S2772442523001648-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139108378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An enhanced multi-scale deep convolutional orchard capsule neural network for multi-modal breast cancer detection","authors":"Sangeeta Parshionikar , Debnath Bhattacharyya","doi":"10.1016/j.health.2023.100298","DOIUrl":"https://doi.org/10.1016/j.health.2023.100298","url":null,"abstract":"<div><p>Breast cancer is the second-leading cause of cancer death in women. The first sign of breast cancer is the development of breast cells into malignant, cancerous lumps. An automated diagnostic system can detect breast cancer while it is still too small to be found by conventional medical methods. Early breast cancers identified with automated screening and diagnosis technologies are generally treatable. This study proposes an enhanced multi-scale deep Convolutional Capsule Neural Network (CapsNet) optimized with the Orchard Optimization Algorithm for breast cancer detection. The proposed system consists of preprocessing, feature extraction, segmentation, and classification stages. Two image datasets are used as input: the Breast Cancer Histopathology Images dataset and the Infrared Thermal Images dataset. The quality of the collected data is improved, and unwanted noise is removed. Features are extracted to derive a Region of Interest that effectively segments the affected region. Finally, the images are classified as benign/malignant for histopathology images and healthy/cancer for thermal images. The proposed CapsNet is implemented in Python, run for 200 epochs, and compared with existing methods in terms of evaluation metrics. 
The results show that the proposed CapsNet attained 99.74 % accuracy and a binary entropy loss of 0.0482 on the Breast Cancer Histopathology Image dataset, and 97 % accuracy and a binary entropy loss of 0.2081 on the Infrared Thermal Images dataset, while incrementing the epochs at each level.</p></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"5 ","pages":"Article 100298"},"PeriodicalIF":0.0,"publicationDate":"2023-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S277244252300165X/pdfft?md5=b1bbe6a96ab03f4797d9cf402b245a2b&pid=1-s2.0-S277244252300165X-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139100914","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Afroza Sultana , Md Tawhid Islam Opu , Farruk Ahmed , Md Shafiul Alam
{"title":"A novel machine learning algorithm for finger movement classification from surface electromyogram signals using welch power estimation","authors":"Afroza Sultana , Md Tawhid Islam Opu , Farruk Ahmed , Md Shafiul Alam","doi":"10.1016/j.health.2023.100296","DOIUrl":"https://doi.org/10.1016/j.health.2023.100296","url":null,"abstract":"<div><p>Electromyogram (EMG) signal monitoring is an effective method for controlling the movements of a prosthetic limb. Classifying the EMG patterns of various finger motions in upper-arm amputees has drawn much attention in recent years, with the aim of developing algorithms that provide adequate accuracy. However, due to the complexity of EMG data, movement detection is a challenging task. Therefore, an effective model is needed that can accurately process, analyze, and classify various hand and finger movements. This paper proposes a novel algorithm for processing and classifying 15 finger movements from surface EMG signals based on Welch power estimation from frequency analysis. Five time-domain features are extracted and used to train a machine learning classifier that classifies 15 single-finger and combined-finger gestures from eight healthy subjects. The experimental results show 92.30 % classification accuracy using data from eight channels, which improved to 94.15 % after selecting two dominant channels. For performance evaluation, 10-fold cross-validation is used during classification. 
We demonstrate an average accuracy of 92.35 % with 25 % test data.</p></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"5 ","pages":"Article 100296"},"PeriodicalIF":0.0,"publicationDate":"2023-12-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772442523001636/pdfft?md5=4ed0e07f8bd5d341ea9781566c335c1d&pid=1-s2.0-S2772442523001636-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139100913","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
J. Tummers , H. Tobi , C. Catal , B. Tekinerdogan , B. Schalk , G. Leusink
{"title":"A health information systems architecture study in intellectual disability care: Commonalities and variabilities","authors":"J. Tummers , H. Tobi , C. Catal , B. Tekinerdogan , B. Schalk , G. Leusink","doi":"10.1016/j.health.2023.100295","DOIUrl":"https://doi.org/10.1016/j.health.2023.100295","url":null,"abstract":"<div><p>Care providers in intellectual disability care use various health information systems (HIS) to document the care they provide. This generates a substantial amount of structured and unstructured data with significant potential for reuse, which is currently underexploited. To enhance data reuse, it is important to understand the architecture of health information systems in intellectual disability care, including their commonalities and variabilities (differences), as well as to identify related privacy and security issues. Our study adopts a multiple-case study approach, examining the architectures of four health information systems in the Netherlands. We conducted interviews with seven stakeholders from four HISs and reviewed multiple documents concerning system infrastructure. We identified commonalities and differences between these systems and outlined the primary challenges regarding privacy and security for data reuse. For each HIS, four architectural views were developed: a context diagram, decomposition view, layered view, and deployment view. The study discusses crucial security and privacy aspects for data reuse in intellectual disability care and highlights several challenges that must be addressed to unlock the full potential of this data. 
This research provides initial guidelines for overcoming these challenges.</p></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"5 ","pages":"Article 100295"},"PeriodicalIF":0.0,"publicationDate":"2023-12-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772442523001624/pdfft?md5=ecf6675c4ee1d78f11193ec9ae651477&pid=1-s2.0-S2772442523001624-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139100882","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ashish Acharya , Sanjoy Mahato , Nikhilesh Sil , Animesh Mahata , Supriya Mukherjee , Sanat Kumar Mahato , Banamali Roy
{"title":"An intuitionistic fuzzy differential equation approach for the lake water and sediment phosphorus model","authors":"Ashish Acharya , Sanjoy Mahato , Nikhilesh Sil , Animesh Mahata , Supriya Mukherjee , Sanat Kumar Mahato , Banamali Roy","doi":"10.1016/j.health.2023.100294","DOIUrl":"https://doi.org/10.1016/j.health.2023.100294","url":null,"abstract":"<div><p>Classical fuzzy sets cannot consider the degree of indeterminacy (i.e., the degree of hesitation), whereas intuitionistic fuzzy sets can. This study presents an intuitionistic fuzzy differential equation approach for the lake water and sediment phosphorus model. We examine the proposed model by assuming generalized trapezoidal intuitionistic fuzzy numbers for the initial condition. Feasible equilibrium points, along with their stability criteria, are evaluated. We describe the characteristics of intuitionistic fuzzy solutions and clarify the difference between strong and weak intuitionistic fuzzy solutions. Numerical simulations are performed in MATLAB to validate the model results.</p></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"5 ","pages":"Article 100294"},"PeriodicalIF":0.0,"publicationDate":"2023-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772442523001612/pdfft?md5=e15c005e52f8ed0bf87df0d41f792549&pid=1-s2.0-S2772442523001612-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138839144","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An integrated data mining algorithms and meta-heuristic technique to predict the readmission risk of diabetic patients","authors":"Masoomeh Zeinalnezhad , Saman Shishehchi","doi":"10.1016/j.health.2023.100292","DOIUrl":"https://doi.org/10.1016/j.health.2023.100292","url":null,"abstract":"<div><p>Reducing the hospital readmission rate is a significant challenge in the healthcare industry for managers and policymakers seeking to improve healthcare and lower costs. This study integrates data mining and meta-heuristic techniques to predict the early readmission probability of diabetic patients within 30 days of discharge. The research dataset was obtained from the UC Irvine Machine Learning Repository and includes 101765 instances with 50 features representing patient and hospital outcomes, collected from 130 US hospitals. After data preprocessing, including cleansing, sampling, and normalization, a Chi-square analysis is performed to confirm and rank the 20 identified factors affecting the readmission risk. As the algorithms’ performance could vary based on the features’ characteristics, several classification algorithms, including a Random Forest (RF), Neural Network (NN), and Support Vector Machine (SVM), are employed. Moreover, the Genetic Algorithm (GA) is integrated into the SVM algorithm, called GA-SVM, for hyper-parameter tuning and increasing the prediction accuracy. The performance of the models was evaluated using accuracy, recall, precision, and f-measure metrics. The results indicate that the accuracies of RF, GA-SVM, SVM, and NN are 74.04 %, 73.52 %, 72.40 %, and 70.44 %, respectively. Using GA to adjust the C and gamma hyper-parameters led to a 1.12 percentage-point increase in SVM prediction accuracy. 
In response to increasing demand and considering poor hospital conditions, particularly during epidemics, these findings point out the potential benefits of a more tailored methodology in managing diabetic patients.</p></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"5 ","pages":"Article 100292"},"PeriodicalIF":0.0,"publicationDate":"2023-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772442523001594/pdfft?md5=6e5d6264cebd9b0add3578ecda515b60&pid=1-s2.0-S2772442523001594-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139100912","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jun Kit Chaw , Sook Hui Chaw , Chai Hoong Quah , Shafrida Sahrani , Mei Choo Ang , Yanfeng Zhao , Tin Tin Ting
{"title":"A predictive analytics model using machine learning algorithms to estimate the risk of shock development among dengue patients","authors":"Jun Kit Chaw , Sook Hui Chaw , Chai Hoong Quah , Shafrida Sahrani , Mei Choo Ang , Yanfeng Zhao , Tin Tin Ting","doi":"10.1016/j.health.2023.100290","DOIUrl":"https://doi.org/10.1016/j.health.2023.100290","url":null,"abstract":"<div><p>Dengue is a common viral disease in tropical and subtropical countries. The clinical manifestation of dengue has a wide spectrum, from asymptomatic seroconversion to severe dengue infection. Severe dengue is defined as dengue with the presence of specific symptoms, including severe plasma leakage leading to shock or the accumulation of fluids with respiratory distress, severe bleeding, and severe organ impairment. Examining the progression of shock with the integration of patients’ physiological information and biochemical parameters would help in understanding the progression of the disease and early detection of shock. In this study, physiological data of patients diagnosed with dengue are collected from the electronic records of the University Malaya Medical Centre. A prediction model learned from the measurement of a patient’s physiological data is the basis for effective treatment and prevention of shock development in critically ill patients. Hence, this study presents the predictive performance of machine learning algorithms to estimate the risk of shock development among dengue patients. Logistic regression, decision trees, support vector machines and neural networks are evaluated. Lastly, the ensemble learning methods of bagging and boosting are also applied to the weak learner to optimize performance. The experimental results show that the bagging algorithm outperforms other competing methods with a 14.5% improvement over the individual decision tree. 
The full blood count (FBC), specifically haemoglobin (Hb) on day 2, is found to be a strong predictor for severe dengue occurrence.</p></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"5 ","pages":"Article 100290"},"PeriodicalIF":0.0,"publicationDate":"2023-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772442523001570/pdfft?md5=ed318907195dbb3ad3fcd7eff55ba46c&pid=1-s2.0-S2772442523001570-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138582390","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A LinkedIn-based analysis of the U.S. dynamic adaptations in healthcare during the COVID-19 pandemic","authors":"Theodoros Daglis, Konstantinos P. Tsagarakis","doi":"10.1016/j.health.2023.100291","DOIUrl":"https://doi.org/10.1016/j.health.2023.100291","url":null,"abstract":"<div><p>Despite its side effects on the global environment, the pandemic has created business opportunities for healthcare. This work utilizes LinkedIn data to examine the features of U.S. healthcare companies that operate within a COVID-19 framework. Data on COVID-19-related companies with a LinkedIn presence in the U.S. were collected and analyzed: 304 companies in May 2022 and 333 companies in June 2023. This study investigates the distinct characteristics of these companies through statistical measures and analysis at the state level. Some of these companies were established long before the pandemic but shifted their orientation toward COVID-19 in response to the crisis, while many others emerged explicitly due to the pandemic. 
These companies are primarily active in “Health, wellness and fitness,” “Hospital and healthcare,” “Nonprofit organization and management,” “Medical practice,” and “Civic and Social organizations.” We show that most companies and employees are located in California, and that the companies with the most followers are in Washington in the first data collection and in California in the second.</p></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"5 ","pages":"Article 100291"},"PeriodicalIF":0.0,"publicationDate":"2023-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772442523001582/pdfft?md5=cca8ef166e2e49d80d43042d34762bfd&pid=1-s2.0-S2772442523001582-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138770078","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel convolutional neural network for identification of retinal layers using sliced optical coherence tomography images","authors":"Akshat Tulsani , Jeh Patel , Preetham Kumar , Veena Mayya , Pavithra K.C. , Geetha M. , Sulatha V. Bhandary , Sameena Pathan","doi":"10.1016/j.health.2023.100289","DOIUrl":"https://doi.org/10.1016/j.health.2023.100289","url":null,"abstract":"<div><p>Retinal imaging is crucial for observing the retina and accurately diagnosing pathological problems. Optical Coherence Tomography (OCT) has been a transformative breakthrough for developing high-resolution cross-sectional images. It is imperative to delineate the multiple layers of the retina for a proper diagnosis. A novel segmentation-based approach is introduced in this study to identify seven distinct layers of the retina using OCT images. The developed approach presents SliceOCTNet, a customized U-shaped Convolutional Neural Network (CNN) that introduces group normalization and intricate skip connections. Paired alongside a hybrid loss function, the SliceOCTNet outperformed most state-of-the-art approaches. The introduction of Group Normalization in SliceOCTNet stabilized the model and improved layer identification even when working with small datasets. The use of skip connections also contributed to an improvement in the spatial outlook of the model. Implementing a hybrid loss function addresses the class imbalance problem in the dataset. Duke University’s spectral-domain optical coherence tomography (SD-OCT) B-scan dataset of healthy and Diabetic Macular Edema (DME) afflicted patients was utilized to train and evaluate the SliceOCTNet. The model accurately recognizes the seven layers of the retina. 
It achieves a high Dice coefficient of 0.941 and refines the segmentation process to a higher level of precision.</p></div>","PeriodicalId":73222,"journal":{"name":"Healthcare analytics (New York, N.Y.)","volume":"5 ","pages":"Article 100289"},"PeriodicalIF":0.0,"publicationDate":"2023-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772442523001569/pdfft?md5=9ad0b8302c3b2e9935316e85786b0565&pid=1-s2.0-S2772442523001569-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138570720","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}