{"title":"Adaptive questionnaires for facilitating patient data entry in clinical decision support systems: methods and application to STOPP/START v2.","authors":"Lamy Jean-Baptiste, Mouazer Abdelmalek, Léguillon Romain, Lelong Romain, Darmoni Stéfan, Sedki Karima, Dubois Sophie, Falcoff Hector","doi":"10.1186/s12911-024-02742-6","DOIUrl":"10.1186/s12911-024-02742-6","url":null,"abstract":"<p>Clinical decision support systems are software tools that help clinicians make medical decisions. However, their acceptance by clinicians is usually rather low. A known problem is that they often require clinicians to manually enter a lot of patient data, which is long and tedious. Existing solutions, such as automatic data extraction from electronic health records, are not fully satisfactory because of low data quality and availability. In practice, many systems still include long questionnaires for data entry. In this paper, we propose an original solution to simplify patient data entry, using an adaptive questionnaire, i.e. a questionnaire that evolves during user interaction, showing or hiding questions dynamically. Considering a rule-based decision support system, we designed methods for determining the relationships between rules and translating the system's clinical rules into display rules that determine the items to show in the questionnaire, and methods for determining the optimal order of priority among the items in the questionnaire. We applied this approach to a decision support system implementing STOPP/START v2, a guideline for managing polypharmacy. We show that it reduces by about two thirds the number of clinical conditions displayed in the questionnaire, both on clinical cases and real patient data. Presented to clinicians during focus group sessions, the adaptive questionnaire was found \"pretty easy to use\". 
In the future, this approach could be applied to other guidelines, and adapted for data entry by patients.</p>","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":"24 1","pages":"326"},"PeriodicalIF":3.3,"publicationDate":"2024-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11539734/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142582329","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Feasibility of automated surveillance of implantable devices in orthopaedics via clinical data warehouse: the Studio study.","authors":"Marie Ansoborlo, Christine Salpétrier, Louis-Romé Le Nail, Julien Herbet, Marc Cuggia, Philippe Rosset, Leslie Grammatico-Guillon","doi":"10.1186/s12911-024-02697-8","DOIUrl":"10.1186/s12911-024-02697-8","url":null,"abstract":"<p><strong>Background: </strong>Total hip, knee and shoulder arthroplasties (THKSA) are increasing due to expanding demand in an ageing population. Material surveillance is important to prevent severe complications involving implantable medical devices (IMD) by taking appropriate preventive measures. Automating the analysis of patient and IMD features could benefit physicians and public health policies, allowing early issue detection and decision support. The study aimed to demonstrate the feasibility of automated cohorting of patients with a first arthroplasty in two hospital data warehouses (HDW) in France.</p><p><strong>Methods: </strong>The study included adult patients with an arthroplasty between 2010 and 2019 identified by 2 data sources: hospital discharge and pharmacy. Selection was based on the health insurance thesaurus of IMDs in the pharmacy database: 1,523 distinct IMD references for primary THKSA. In the hospital discharge database, 22 distinct procedures for native joint replacement were selected, allowing each patient's IMD to be matched with the surgical procedure. A program to automate information extraction was implemented in the 1st hospital data warehouse using natural language processing (NLP) on pharmacy labels, and was then applied to the 2nd hospital.</p><p><strong>Results: </strong>The e-cohort was built with a first arthroplasty for THKSA performed in 7,587 patients with a mean age of 67.4 years, and a sex ratio of 0.75. The cohort involved 4,113 hip, 2,630 knee and 844 shoulder surgical patients. 
Obesity, cardiovascular diseases and hypertension were the most frequent medical conditions.</p><p><strong>Discussion: </strong>The implementation of an e-cohort for material surveillance will be easily workable across HDWs France-wide. Because no international IMD mapping exists, our approach uses NLP to study IMDs and aims to close the gap between conventional epidemiological cohorting tools and big-data approaches.</p><p><strong>Conclusion: </strong>This pilot study demonstrated the feasibility of an e-cohort of orthopaedic devices using clinical data warehouses. The IMD and patient features could be studied with intra-hospital follow-up and will help analyse the infectious and unsealing complications.</p>","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":"24 1","pages":"324"},"PeriodicalIF":3.3,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11533334/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142575013","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Construction of a Wilms tumor risk model based on machine learning and identification of cuproptosis-related clusters.","authors":"Jingru Huang, Yong Li, Xiaotan Pan, Jixiu Wei, Qiongqian Xu, Yin Zheng, Peng Chen, Jiabo Chen","doi":"10.1186/s12911-024-02716-8","DOIUrl":"10.1186/s12911-024-02716-8","url":null,"abstract":"<p><strong>Background: </strong>Cuproptosis, a recently identified type of programmed cell death triggered by copper, has mechanisms in Wilms tumor (WT) that are not yet fully understood. This research focuses on examining the link between WT and Cuproptosis-related genes (CRGs), with the goal of developing a predictive model for WT.</p><p><strong>Methods: </strong>Four gene expression datasets related to WT were sourced from the GEO database. Subsequently, expression profiles of CRGs were extracted for differential analysis and immune infiltration studies. Utilizing 105 WT samples, clusters related to Cuproptosis were identified. This involved analyzing associated immune cell infiltration and conducting functional enrichment analysis. Disease-characteristic genes were pinpointed using weighted gene co-expression network analysis. Finally, the WT risk prediction model was constructed by four machine learning methods: random forest, support vector machine (SVM), generalized linear model and extreme gradient boosting. The best-performing machine learning model was chosen, and a nomogram was created. The effectiveness of this predictive model was validated using methods such as the calibration curve, decision curve analysis, and by applying it to the TARGET-GTEx dataset.</p><p><strong>Results: </strong>Thirteen differentially expressed Cuproptosis-related genes were identified. The infiltration level of CD8+ T cells in WT children was lower than that in normal tissue (NT) children, and the infiltration levels of M0 macrophages and T follicular helper cells were higher than those in NT children. 
In addition, two clusters of cuproptosis-related WT were identified. Enrichment analysis results indicated that genes in cluster 2 were primarily involved in cell division, nuclear division regulation, DNA biosynthesis process, ubiquitin-mediated proteolysis. The SVM model was judged to be the optimal model using 5 genes. Its accuracy was confirmed through a calibration curve and decision curve analysis, demonstrating satisfactory performance on the TARGET-GTEx validation dataset. Additional analysis revealed that these five genes exhibited high expression in both the TARGET-GTEx validation dataset and sequencing data.</p><p><strong>Conclusion: </strong>This research established a link between WT and Cuproptosis. It developed a predictive model for assessing the risk of WT and pinpointed five key genes associated with the disease.</p>","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":"24 1","pages":"325"},"PeriodicalIF":3.3,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11536559/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142575003","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A scoping review, novel taxonomy and catalogue of implementation frameworks for clinical decision support systems.","authors":"Jared M Wohlgemut, Erhan Pisirir, Rebecca S Stoner, Zane B Perkins, William Marsh, Nigel R M Tai, Evangelia Kyrimi","doi":"10.1186/s12911-024-02739-1","DOIUrl":"10.1186/s12911-024-02739-1","url":null,"abstract":"<p><strong>Background: </strong>The primary aim of this scoping review was to synthesise key domains and sub-domains described in existing clinical decision support systems (CDSS) implementation frameworks into a novel taxonomy and demonstrate most-studied and least-studied areas. Secondary objectives were to evaluate the frequency and manner of use of each framework, and catalogue frameworks by implementation stage.</p><p><strong>Methods: </strong>A scoping review of Pubmed, Scopus, Web of Science, PsychInfo and Embase was conducted on 12/01/2022, limited to English language, including 2000-2021. Each framework was categorised as addressing one or multiple stages of implementation: design and development, evaluation, acceptance and integration, and adoption and maintenance. Key parts of each framework were grouped into domains and sub-domains.</p><p><strong>Results: </strong>Of 3550 titles identified, 58 papers were included. The most-studied implementation stage was acceptance and integration, while the least-studied was design and development. The three main framework uses were: for evaluating adoption, for understanding attitudes toward implementation, and for framework validation. The most frequently used framework was the Consolidated Framework for Implementation Research.</p><p><strong>Conclusions: </strong>Many frameworks have been published to overcome barriers to CDSS implementation and offer guidance towards successful adoption. However, for co-developers, choosing relevant frameworks may be a challenge. 
A taxonomy of domains addressed by CDSS implementation frameworks is provided, as well as a description of their use, and a catalogue of frameworks listed by the implementation stages they address. Future work should ensure best practices for CDSS design are adequately described, and existing frameworks are well-validated. An emphasis on collaboration between clinician and non-clinician affected parties may help advance the field.</p>","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":"24 1","pages":"323"},"PeriodicalIF":3.3,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11531160/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142564048","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prediction of femoral head collapse in osteonecrosis using deep learning segmentation and radiomics texture analysis of MRI.","authors":"Shihua Gao, Haoran Zhu, Moshan Wen, Wei He, Yufeng Wu, Ziqi Li, Jiewei Peng","doi":"10.1186/s12911-024-02722-w","DOIUrl":"10.1186/s12911-024-02722-w","url":null,"abstract":"<p><strong>Background: </strong>Femoral head collapse is a critical pathological change and is regarded as a turning point in disease progression in osteonecrosis of the femoral head (ONFH). In this study, we aim to build an automatic femoral head collapse prediction pipeline for ONFH based on magnetic resonance imaging (MRI) radiomics.</p><p><strong>Methods: </strong>In the segmentation model development dataset, T1-weighted MRI of 222 hips from two hospitals were retrospectively collected and randomly split into training (n = 190) and test (n = 32) sets. In the prognosis prediction model development dataset, 206 hips were also retrospectively collected from two hospitals and divided into a training set (n = 155) and an external test set (n = 51) according to data source. A deep learning model for automatic lesion segmentation was trained with nnU-Net, from which three-dimensional regions of interest were segmented and a total of 107 radiomics features were extracted. After intra-class correlation coefficient screening, feature correlation coefficient screening and Least Absolute Shrinkage and Selection Operator regression feature selection, machine learning models for ONFH prognosis prediction were trained with the Logistic Regression (LR) and Light Gradient Boosting Machine (LightGBM) algorithms.</p><p><strong>Results: </strong>The segmentation model achieved an average dice similarity coefficient of 0.848 and an average 95% Hausdorff distance of 3.794 in the test set, compared to the manual segmentation results. After feature selection, nine radiomics features were included in the prognosis prediction model. 
External test showed that the LightGBM model exhibited acceptable predictive performance. The area under the curve (AUC) of the prediction model was 0.851 (95% CI: 0.7268-0.9752), with an accuracy of 0.765, sensitivity of 0.833, and specificity of 0.727. Decision curve analysis showed that the LightGBM model exhibited favorable clinical utility.</p><p><strong>Conclusion: </strong>This study presents an automated pipeline for predicting femoral head collapse in ONFH with acceptable performance. Further research is necessary to determine the clinical applicability of this radiomics-based approach and to assess its potential to assist in treatment decision-making for ONFH.</p>","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":"24 1","pages":"320"},"PeriodicalIF":3.3,"publicationDate":"2024-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11526660/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142557230","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Assessing the reliability of medical resource demand models in the context of COVID-19.","authors":"Kimberly Dautel, Ephraim Agyingi, Pras Pathmanathan","doi":"10.1186/s12911-024-02726-6","DOIUrl":"10.1186/s12911-024-02726-6","url":null,"abstract":"<p><strong>Background: </strong>Numerous medical resource demand models have been created as tools for governments or hospitals, aiming to predict the need for crucial resources like ventilators, hospital beds, personal protective equipment (PPE), and diagnostic kits during crises such as the COVID-19 pandemic. However, the reliability of these demand models remains uncertain.</p><p><strong>Methods: </strong>Demand models typically consist of two main components: hospital use epidemiological models that predict hospitalizations or daily admissions, and a demand calculator that translates the outputs of the epidemiological model into predictions for resource usage. We conducted separate analyses to evaluate each of these components. In the first analysis, we validated various hospital use epidemiological models using a recent validation framework designed for epidemiological models. This allowed us to quantify the accuracy of the models in predicting critical aspects such as the date and magnitude of local COVID-19 peaks, among other factors. In the second analysis, we evaluated a range of demand calculators for ventilators, medical gowns, and COVID-19 test kits. To achieve this, we decoupled these demand calculators from the underlying epidemiological models and provided ground truth data for their inputs. This approach enabled a direct comparison of the demand calculators, comparing them against each other and actual usage data when available. The code is available at https://doi.org/10.5281/zenodo.13712387 .</p><p><strong>Results: </strong>Performance varied greatly across the epidemiological models, with greater variability in COVID-19 hospital use predictions than for COVID-19 deaths as analyzed previously. 
Some models did not have any peaks. Among those that did, the models under-estimated date of peak approximately as often as they over-estimated, but were more likely to under-estimate magnitude of peak, with typical relative errors around 50%. Regarding demand calculator predictions, there was significant variability, including five-fold differences in predictions for gown models. Validation against actual or surrogate usage data illustrated the potential value of demand models while demonstrating their limitations.</p><p><strong>Conclusions: </strong>The emerging field of demand modeling holds promise in averting medical resource shortages during future public health emergencies. However, achieving this potential necessitates focused efforts on standardization, transparency, and rigorous model validation before placing reliance on demand models in critical public health decision-making.</p>","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":"24 1","pages":"322"},"PeriodicalIF":3.3,"publicationDate":"2024-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11529025/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142557228","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Informatics assessment of COVID-19 data collection: an analysis of UK Biobank questionnaire data.","authors":"Craig S Mayer","doi":"10.1186/s12911-024-02743-5","DOIUrl":"10.1186/s12911-024-02743-5","url":null,"abstract":"<p><strong>Background: </strong>There have been many efforts to expand existing data collection initiatives to include COVID-19 related data. One program that expanded is UK Biobank, a large-scale research and biomedical data collection resource that added several COVID-19 related data fields including questionnaires (exposures and symptoms), viral testing, and serological data. This study aimed to analyze this COVID-19 data to understand how COVID-19 data was collected and how it can be used to attribute COVID-19 and analyze differences in cohorts and time periods.</p><p><strong>Methods: </strong>A cohort of COVID-19 infected individuals was defined from the UK Biobank population using viral testing, diagnosis, and self-reported data. Changes over time, from March 2020 to October 2021, in total case counts and changes in case counts by identification source (diagnosis from EHR, measurement from viral testing and self-reported from questionnaire) were also analyzed. For the questionnaires, an analysis of the structure and dynamics of the questionnaires was done, which included the number and type of questions asked, how often and how many individuals answered the questions, and what responses were given. In addition, the number of individuals who provided responses regarding different time segments covered by the questionnaire was calculated along with how often responses changed. The analysis included changes in population level responses over time. 
The analyses were repeated for COVID and non-COVID individuals, and the responses were compared.</p><p><strong>Results: </strong>There were 62 042 distinct participants who had COVID-19, with 49 120 identified through diagnosis, 30 553 identified through viral testing and 934 identified through self-reporting, with many identified by multiple methods. This included vast changes in overall cases and distribution of case data source over time. 6 899 of 9 952 participants completing the exposure questionnaire responded regarding every time period covered by the questionnaire, with large changes in response over time. The most common change was in employment situation, which was changed by 74.78% of individuals from the first to last time of asking. On a population level, there were changes as face mask usage increased each successive time period. There were decreases in nearly every COVID-19 symptom from the first to the second questionnaire. When comparing COVID to non-COVID participants, COVID participants were more commonly keyworkers (COVID: 33.76%, non-COVID: 15.00%) and more often lived with young people attending school (61.70%, 45.32%).</p><p><strong>Conclusion: </strong>To develop a robust cohort of COVID-19 participants from the UK Biobank population, multiple types of data were needed. 
The differences based on time and exposures show the importance of comprehensive data capture and the utility of COVID-19 related questionnaire data.</p>","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":"24 1","pages":"321"},"PeriodicalIF":3.3,"publicationDate":"2024-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11529153/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142557229","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A modified multiple-criteria decision-making approach based on a protein-protein interaction network to diagnose latent tuberculosis.","authors":"Somayeh Ayalvari, Marjan Kaedi, Mohammadreza Sehhati","doi":"10.1186/s12911-024-02668-z","DOIUrl":"10.1186/s12911-024-02668-z","url":null,"abstract":"<p><strong>Background: </strong>DNA microarrays provide informative data for transcriptional profiling and identifying gene expression signatures to help prevent progression of latent tuberculosis infection (LTBI) to active disease. However, constructing a prognostic model for distinguishing LTBI from active tuberculosis (ATB) is very challenging due to the noisy nature of data and lack of a generally stable analysis approach.</p><p><strong>Methods: </strong>In the present study, we proposed an accurate predictive model with the help of data fusion at the decision level. In this regard, results of filter feature selection and wrapper feature selection techniques were combined with multiple-criteria decision-making (MCDM) methods to select 10 genes from six microarray datasets that can be the most discriminative genes for diagnosing tuberculosis cases. As the main contribution of this study, the final ranking function was constructed by combining protein-protein interaction (PPI) network with an MCDM method (called Decision-making Trial and Evaluation Laboratory or DEMATEL) to improve the feature ranking approach.</p><p><strong>Results: </strong>By applying data fusion at the decision level on the 10 introduced genes in terms of fusion of classifiers of random forests (RF) and k-nearest neighbors (KNN) regarding Yager's theory, the proposed algorithm reached a sensitivity of 0.97, specificity of 0.90, and accuracy of 0.95. 
Finally, with the help of cumulative clustering, the genes involved in the diagnosis of latent and active tuberculosis were identified.</p><p><strong>Conclusions: </strong>The combination of MCDM methods and PPI networks can significantly improve the diagnosis of different states of tuberculosis.</p><p><strong>Clinical trial number: </strong>Not applicable.</p>","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":"24 1","pages":"319"},"PeriodicalIF":3.3,"publicationDate":"2024-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11523813/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142543856","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel explainable machine learning-based healthy ageing scale.","authors":"Katarina Gašperlin Stepančič, Ana Ramovš, Jože Ramovš, Andrej Košir","doi":"10.1186/s12911-024-02714-w","DOIUrl":"10.1186/s12911-024-02714-w","url":null,"abstract":"<p><strong>Background: </strong>Ageing is one of the most important challenges in our society. Evaluating how one is ageing is important in many aspects, from giving personalized recommendations to providing insight for long-term care eligibility. Machine learning can be utilized for that purpose; however, user reservations towards \"black-box\" predictions call for increased transparency and explainability of results. This study aimed to explore the potential of developing a machine learning-based healthy ageing scale that provides explainable results that could be trusted and understood by informal carers.</p><p><strong>Methods: </strong>In this study, we used data from 696 older adults collected via personal field interviews as part of independent research. Exploratory factor analysis was used to find candidate healthy ageing aspects. For visualization of key aspects, a web annotation application was developed. Key aspects were selected by gerontologists who later used the web annotation application to evaluate healthy ageing for each older adult on a Likert scale. Logistic Regression, Decision Tree Classifier, Random Forest, KNN, SVM and XGBoost were used for multi-classification machine learning. AUC OvO, AUC OvR, F1, Precision and Recall were used for evaluation. Finally, SHAP was applied to best model predictions to make them explainable.</p><p><strong>Results: </strong>The experimental results show that human annotations of healthy ageing could be modelled using machine learning where among several algorithms XGBoost showed superior performance. The use of XGBoost resulted in 0.92 macro-averaged AUC OvO and 0.76 macro-averaged F1. 
SHAP was applied to generate local explanations for predictions and shows how each feature is influencing the prediction.</p><p><strong>Conclusion: </strong>The resulting explainable predictions make a step toward practical scale implementation into decision support systems. The development of such a decision support system that would incorporate an explainable model could reduce user reluctance towards the utilization of AI in healthcare and provide explainable and trusted insights to informal carers or healthcare providers as a basis to shape tangible actions for improving ageing. Furthermore, the cooperation with gerontology specialists throughout the process also indicates expert knowledge as integrated into the model.</p>","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":"24 1","pages":"317"},"PeriodicalIF":3.3,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11520378/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142543857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting clinical events characterizing the progression of amyotrophic lateral sclerosis via machine learning approaches using routine visits data: a feasibility study.","authors":"Alessandro Guazzo, Michele Atzeni, Elena Idi, Isotta Trescato, Erica Tavazzi, Enrico Longato, Umberto Manera, Adriano Chió, Marta Gromicho, Inês Alves, Mamede de Carvalho, Martina Vettoretti, Barbara Di Camillo","doi":"10.1186/s12911-024-02719-5","DOIUrl":"10.1186/s12911-024-02719-5","url":null,"abstract":"<p><strong>Background: </strong>Amyotrophic lateral sclerosis (ALS) is a progressive neurodegenerative disease that results in death within a short time span (3-5 years). One of the major challenges in treating ALS is its highly heterogeneous disease progression and the lack of effective prognostic tools to forecast it. The main aim of this study was, then, to test the feasibility of predicting relevant clinical outcomes that characterize the progression of ALS with a two-year prediction horizon via artificial intelligence techniques using routine visits data.</p><p><strong>Methods: </strong>Three classification problems were considered: predicting death (binary problem), predicting death or percutaneous endoscopic gastrostomy (PEG) (multiclass problem), and predicting death or non-invasive ventilation (NIV) (multiclass problem). Two supervised learning models, a logistic regression (LR) and a deep learning multilayer perceptron (MLP), were trained ensuring technical robustness and reproducibility. Moreover, to provide insights into model explainability and result interpretability, model coefficients for LR and Shapley values for both LR and MLP were considered to characterize the relationship between each variable and the outcome.</p><p><strong>Results: </strong>On the one hand, predicting death was successful as both models yielded F1 scores and accuracy well above 0.7. 
The model explainability analysis performed for this outcome allowed for the understanding of how different methodological approaches consider the input variables when performing the prediction. On the other hand, predicting death alongside PEG or NIV proved to be much more challenging (F1 scores and accuracy in the 0.4-0.6 interval).</p><p><strong>Conclusions: </strong>In conclusion, predicting death due to ALS proved to be feasible. However, predicting PEG or NIV in a multiclass fashion proved to be unfeasible with these data, regardless of the complexity of the methodological approach. The observed results suggest a potential ceiling on the amount of information extractable from the database, e.g., due to the intrinsic difficulty of the prediction tasks at hand, or to the absence of crucial predictors that are, however, not currently collected during routine practice.</p>","PeriodicalId":9340,"journal":{"name":"BMC Medical Informatics and Decision Making","volume":"24 Suppl 4","pages":"318"},"PeriodicalIF":3.3,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11523576/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142543858","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}