An AI-based microsimulation for predicting health outcomes among people experiencing homelessness
Antonio Blasco-Calafat, Vicent Blanes-Selva, Ascensión Doñate-Martínez, Tobias Fragner, Tamara Alhambra-Borrás, Julia Gawronska, Maria Moudatsou, Ioanna Tabaki, Katerina Belogianni, Pania Karnaki, Miguel Rico Varadé, Rosa Gómez Trenado, Jaime Barrio-Cortes, Lee Smith, Alejandro Gil-Salmeron, Igor Grabovac, Juan M García-Gómez
Computer Methods and Programs in Biomedicine, volume 273, article 109112. Published 2025-10-09. DOI: 10.1016/j.cmpb.2025.109112 (https://doi.org/10.1016/j.cmpb.2025.109112)
Abstract
Background and objective: People experiencing homelessness (PEH) face a higher cancer risk due to social exclusion, housing conditions and limited access to healthcare. This study proposes a microsimulation model that uses machine learning (ML) to predict the effect of the Health Navigator Model intervention on quality of life, healthcare utilisation and empowerment at the end of the intervention, enabling cost-effective resource allocation and the identification of high-risk subgroups.
Materials & methods: We used data from 652 PEH recruited in four European countries (June 2022 to November 2023); 255 completed an 18-month Health Navigator Model programme. Standardised questionnaires were administered at baseline, at four weeks and post-intervention. A modular ML microsimulation was built that (1) creates a constraint-based synthetic cohort, (2) estimates outcome changes by matching each simulated case to real programme completers, and (3) sums those differences to gauge the intervention's impact. Multiple ML techniques were tested to keep the synthetic sample true to the original and to improve effect-size predictions.
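The three steps described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the `COMPLETERS` records, the Gaussian sampling in `synthetic_cohort`, the constraints, and the Euclidean nearest-neighbour matching in `nearest_completer` are all hypothetical stand-ins, since the abstract does not specify how sampling or matching is done.

```python
import math
import random

# Hypothetical real completers (invented for illustration):
# (age, baseline EQ-5D index, post-intervention EQ-5D index).
COMPLETERS = [
    (34, 0.55, 0.70),
    (51, 0.40, 0.45),
    (62, 0.30, 0.50),
    (45, 0.65, 0.60),
]


def synthetic_cohort(n, seed=0):
    """Step 1: draw synthetic baseline cases, rejecting any that break a constraint."""
    rng = random.Random(seed)
    cohort = []
    while len(cohort) < n:
        age = rng.gauss(48, 12)
        eq5d = rng.gauss(0.5, 0.15)
        # Assumed constraints: adult participants, index within the 0-1 scale.
        if 18 <= age <= 90 and 0.0 <= eq5d <= 1.0:
            cohort.append((age, eq5d))
    return cohort


def nearest_completer(case):
    """Step 2: match a synthetic case to the closest real completer at baseline."""
    age, eq5d = case
    # Age differences are divided by 100 so both features are on a comparable scale.
    return min(
        COMPLETERS,
        key=lambda c: math.hypot((c[0] - age) / 100, c[1] - eq5d),
    )


def total_effect(cohort):
    """Step 3: sum the matched completers' outcome changes over the cohort."""
    return sum(post - base for _, base, post in map(nearest_completer, cohort))


if __name__ == "__main__":
    cohort = synthetic_cohort(100)
    print(f"Projected total EQ-5D change: {total_effect(cohort):.2f}")
```

Keeping each step in its own function mirrors the modular design the abstract describes: the cohort generator could be swapped for a generative model such as CTGAN without touching the matching or aggregation steps.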
Results: CTGAN generated the most realistic synthetic baseline (propensity score = 0.152; 95% CI 0.148-0.162), markedly outperforming the univariate, multivariate and SMOTE approaches (all > 0.21). Regression models reproduced most numerical outcomes with good fidelity (e.g., EQ-5D-5L MAE = 0.10 on a 0-1 scale; Health-Rating MAE = 10 on a 0-100 scale), while categorical outcomes were predicted within roughly one category. Binary classifiers yielded F1-scores of 0.58 for smoking status and 0.64 for programme adherence. An online demonstrator (https://epione.upv.es) visualises the process.
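For readers unfamiliar with the two error metrics reported above, the following minimal sketch shows how a mean absolute error (MAE) and an F1-score are computed from paired true and predicted values. The function names and the toy inputs are assumptions made for illustration; they are not the study's data or code.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute gap between true and predicted numerical outcomes."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # No true positives: precision or recall is zero or undefined.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy values on a 0-1 scale, invented for illustration.
    print(mean_absolute_error([0.6, 0.4, 0.8], [0.5, 0.5, 0.9]))
    # Toy binary labels, e.g. 1 = adhered to the programme.
    print(f1_score([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]))
```

An MAE of 0.10 on a 0-1 scale therefore means predictions miss the observed value by a tenth of the scale on average, while an F1-score of 0.64 reflects a moderate balance between missed positives and false alarms.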
Conclusion: The proposed ML-based microsimulation generates realistic PEH profiles and projects intervention outcomes, providing a flexible, evidence-driven tool to optimise cancer-prevention strategies for PEH. By predicting the effect of an intervention before implementation, it supports evidence-based decision-making and resource allocation and can thereby enhance intervention outcomes.