{"title":"Leveraging large language models for patient-ventilator asynchrony detection.","authors":"Francesc Suñol, Candelaria de Haro, Verónica Santos-Pulpón, Sol Fernández-Gonzalo, Lluís Blanch, Josefina López-Aguilar, Leonardo Sarlabous","doi":"10.1136/bmjhci-2024-101426","DOIUrl":"https://doi.org/10.1136/bmjhci-2024-101426","url":null,"abstract":"<p><strong>Objectives: </strong>The objective of this study is to evaluate whether large language models (LLMs) can achieve performance comparable to expert-developed deep neural networks in detecting flow starvation (FS) asynchronies during mechanical ventilation.</p><p><strong>Methods: </strong>Popular LLMs (GPT-4, Claude-3.5, Gemini-1.5, DeepSeek-R1) were tested on a dataset of 6500 airway pressure cycles from 28 patients, classifying breaths into three FS categories. They were also tasked with generating executable code for one-dimensional convolutional neural network (CNN-1D) and Long Short-Term Memory networks. Model performances were assessed using repeated holdout validation and compared with expert-developed models.</p><p><strong>Results: </strong>LLMs performed poorly in direct FS classification (accuracy: GPT-4: 0.497; Claude-3.5: 0.627; Gemini-1.5: 0.544, DeepSeek-R1: 0.520). However, Claude-3.5-generated CNN-1D code achieved the highest accuracy (0.902 (0.899-0.906)), outperforming expert-developed models.</p><p><strong>Discussion: </strong>LLMs demonstrated limited capability in direct classification but excelled in generating effective neural network models with minimal human intervention. 
This suggests LLMs' potential in accelerating model development for clinical applications, particularly for detecting patient-ventilator asynchronies, though their clinical implementation requires further validation and consideration of ethical factors.</p>","PeriodicalId":9050,"journal":{"name":"BMJ Health & Care Informatics","volume":"32 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2025-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144511513","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Developing interpretable machine learning models to predict length of stay and disposition decision for adult patients in emergency departments.","authors":"Long Song, Uwe Aickelin, Timothy N Fazio, Abhishek Sharma, Mojgan Kouhounestani, Samantha Plumb, Mark John Putland","doi":"10.1136/bmjhci-2024-101152","DOIUrl":"https://doi.org/10.1136/bmjhci-2024-101152","url":null,"abstract":"<p><strong>Objective: </strong>Machine learning (ML) models have emerged as tools to predict length of stay (LOS) and disposition decision (DD) in emergency departments (EDs) to combat overcrowding. However, site-specific ML models are not transferable to different sites. Our objective was to develop interpretable ML models to predict LOS and DD at specific time points, all while establishing a transparent data analysis framework. This framework was designed to be easily adapted by other institutions for the development of their own ML models.</p><p><strong>Methods: </strong>We analysed data from 297 392 ED visits of patients aged 18 and above at a quaternary hospital between 30 June 2019 and 31 December 2022. Eight ML algorithms were evaluated, and ultimately, twelve lasso models built from 21 features were trained to predict four outcomes of LOS and DD at three time points post-triage. Hold-out testing and cross-validation were conducted for these models.</p><p><strong>Results: </strong>The area under the curve values were 0.862/0.868/0.878 for binary LOS predictions at 10, 60 and 120-minute time points and 0.839/0.851/0.863 for binary DD predictions. The accuracies were 60.2%/60.7%/61.9% for ternary LOS predictions and 61.5%/62.3%/63.4% for ternary DD predictions.</p><p><strong>Conclusions: </strong>Interpretable ML models demonstrated outstanding performances in predicting both LOS and DD. 
The transparent data analysis framework can be easily adapted by other institutions.</p>","PeriodicalId":9050,"journal":{"name":"BMJ Health & Care Informatics","volume":"32 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2025-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144511512","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using technology acceptance model to explore physicians' perspectives of clinical decision support system alerts.","authors":"Shuo-Chen Chien, Chia-Hui Chien, Chun-You Chen, Po-Han Chien, Chun-Kung Hsu, Hsuan-Chia Yang, Yu-Chuan Li","doi":"10.1136/bmjhci-2024-101128","DOIUrl":"https://doi.org/10.1136/bmjhci-2024-101128","url":null,"abstract":"<p><strong>Objective: </strong>To examine factors influencing physicians' perspectives of clinical decision support system (CDSS) alerts using the Technology Acceptance Model (TAM), focusing on perceived ease of use (PEOU), perceived usefulness (PU), attitude towards usage (AT), user satisfaction (US) and behavioural intention to use (BI).</p><p><strong>Methods: </strong>This study was conducted in the outpatient departments of a single academic medical centre in northern Taiwan, involving 72 physicians who completed a structured TAM-based questionnaire. Seven physician characteristics (age, clinical experience, CDSS operating status, patient volume, consultation frequency, gender and specialty) were analysed for their influence on PEOU and PU. Multiple regression analysis assessed relationships among TAM constructs and external factors.</p><p><strong>Results: </strong>Patient volume and age negatively affected PU and PEOU (eg, age vs PU: β=-2.38, p<0.05; patient volume vs PEOU: β=-2.64, p<0.01), while clinical experience positively influenced them (PEOU: β=2.11, p<0.05). TAM construct analysis revealed that PEOU positively influenced PU (β=0.67, p<0.001), AT (β=0.31, p<0.01), and US (β=0.35, p<0.001). No significant correlation was found between US and BI (p=0.96).</p><p><strong>Discussion: </strong>Findings suggest that PEOU significantly affects physicians' behavioural intention to use alerts, with high patient volume and older age lowering acceptance due to alert fatigue. 
Adaptive, context-aware CDSS alerts can improve usability and align better with clinical workflows, enhancing efficiency in high-demand environments.</p><p><strong>Conclusion: </strong>This study highlights the need for context-aware, frequency-optimised alert designs to enhance CDSS acceptance, improve user experience and streamline clinical workflows.</p>","PeriodicalId":9050,"journal":{"name":"BMJ Health & Care Informatics","volume":"32 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2025-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144511515","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Preliminary study on objective evaluation algorithm of human infrared thermogram seriality and its clinical application in population with metabolic syndrome.","authors":"Jia-Yang Guo, Yan-Hong An, Yu Chen, Xian-Hui Zhang, Jia-Min Niu, Xiao-Ran Li, Hui-Zhong Xue, Yi-Meng Yang, Lu-Qi Cai, Yu-Chen Xia, Quan-Yi Chen, Bing-Yang Cai, Wen-Zheng Zhang, Yong-Hua Xiao","doi":"10.1136/bmjhci-2024-101252","DOIUrl":"https://doi.org/10.1136/bmjhci-2024-101252","url":null,"abstract":"<p><strong>Objectives: </strong>To develop an objective evaluation algorithm for assessing the seriality of infrared thermograms for the auxiliary diagnosis of diseases, and to internally validate the algorithm using metabolic syndrome (MS) as a case example.</p><p><strong>Methods: </strong>A total of 266 healthy participants (133 of each sex) and 180 patients with MS (133 males and 47 females) were retrospectively enrolled. Infrared thermograms were randomly divided into a training set and a validation set at a ratio of 3:1. According to the algorithm proposed in this article, the thermal sequence values of patients with MS were calculated and compared between the two groups. The area under the curve (AUC) was computed to evaluate the diagnostic performance of thermogram seriality in MS detection.</p><p><strong>Results: </strong>The established thermal sequence of healthy participants was as follows: T<sub>palm</sub> &lt; T<sub>lower leg</sub> &lt; T<sub>lower abdomen</sub>. In the training set, the AUCs for male and female patients with MS were 0.77 and 0.72, respectively, while in the validation set, they were 0.76 and 0.69, respectively. The results also indicated that thermogram seriality demonstrated better diagnostic stability in males and in younger and middle-aged individuals. 
Additionally, higher body mass index values showed a positive correlation with increased thermal sequence values.</p><p><strong>Discussion: </strong>The study proposed a novel, objective algorithm for quantitatively evaluating thermogram seriality. By focusing on temperature sequences rather than absolute temperature values, the algorithm is expected to facilitate a more quantitative evaluation of thermogram features. It could improve the stability and reproducibility of MS diagnosis.</p><p><strong>Conclusion: </strong>The algorithm can quantitatively characterise thermogram seriality and can be used for the auxiliary screening of MS.</p>","PeriodicalId":9050,"journal":{"name":"BMJ Health & Care Informatics","volume":"32 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2025-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144511514","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prediction of biological evolution following blood product transfusion during liver transplantation: the contribution of machine learning to decision-making.","authors":"Olivier Duranteau, Benjamin Popoff, Axel Abels, Valerio Lucidi, Eric Savier, Florian Blanchard, Thibault Martinez, Patrizia Loi, Desislava Germanova, Anne Demulder, Jacques Creteur, Turgay Tuna","doi":"10.1136/bmjhci-2025-101466","DOIUrl":"10.1136/bmjhci-2025-101466","url":null,"abstract":"<p><strong>Objectives: </strong>Liver transplantation is a complex procedure frequently requiring transfusion of blood products to manage coagulopathy and haemorrhage. This study aimed to develop machine learning models to predict the biological effects of blood product transfusions, assisting clinicians in selecting optimal therapeutic combinations.</p><p><strong>Methods: </strong>Using data from two cohorts over 20 years from two academic hospitals, 10 supervised machine learning models were trained and validated on four biomarkers: fibrinogen, haemoglobin, prothrombin time and activated partial thromboplastin time ratio. Models were evaluated using R², root mean squared error and SD metrics, with external validation performed on the second cohort.</p><p><strong>Results: </strong>The results indicated that while certain models, such as the stack model for late fibrinogen (R²=0.63) or the extra trees model for late prothrombin time (R²=0.66), demonstrated promising predictive capacity, the overall external validation performance was suboptimal. 
Despite the use of a large healthcare database, a rigorous statistical methodology and an academic machine learning methodology, most models showed limited generalisability (R² < 0.5).</p><p><strong>Discussion: </strong>Key limitations included the small dataset size relative to machine learning requirements, lack of advanced haemostatic parameters (eg, ROtational ThromboElastoMetry (ROTEM) or Thromboelastography (TEG)) and the variability introduced by evolving surgical practices over the 20-year study period. Despite these limitations, this study provides a reproducible framework for evaluating transfusion efficacy, supported by openly shared Python code and the application of Taylor diagrams for model evaluation.</p><p><strong>Conclusion: </strong>While our models are unsuitable for routine clinical use, they highlight the potential of machine learning in transfusion medicine. Future work should focus on integrating larger datasets, advanced biomarkers and real-time data.</p>","PeriodicalId":9050,"journal":{"name":"BMJ Health & Care Informatics","volume":"32 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2025-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12184408/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144367868","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Validation of an artificial intelligence-based algorithm for predictive performance and risk stratification of sepsis using real-world data from hospitalised patients: a prospective observational study.","authors":"Ji-Hyun Kim, KyungHyun Lee, Kwang Joon Kim, Eun Yeong Ha, In-Cheol Kim, Sun Hyo Park, Chi-Heum Cho, Gyeong Im Yu, Byung Eun Ahn, Yeeun Jeong, Joo-Yun Won, Taeyong Sim, Hochan Cho, Ki-Byung Lee","doi":"10.1136/bmjhci-2024-101353","DOIUrl":"10.1136/bmjhci-2024-101353","url":null,"abstract":"<p><strong>Objective: </strong>The heterogeneous nature of sepsis renders determining its underlying causes difficult, which may delay diagnosis and intervention. VitalCare-SEPsis Score (VC-SEPS) is a deep learning-based algorithm that predicts sepsis and monitors patient conditions based on electronic medical record data. However, few studies have prospectively compared medical artificial intelligence software algorithms and traditional scoring systems to predict sepsis. This prospective observational study attempted to validate the predictive performance and risk stratification of VC-SEPS for early prediction of sepsis.</p><p><strong>Methods: </strong>In this prospective observational study, we collected electronic medical record data from 6,797 patients hospitalised at Keimyung University Dongsan Hospital, Daegu, South Korea. The final version of the analysed set included 6,455 patients, 325 of whom were diagnosed with sepsis.</p><p><strong>Results: </strong>The area under the receiver operating characteristic curve of VC-SEPS was 0.880, indicating its superiority over traditional scoring systems. The algorithm performance showed a consistent trend within 24 hours. 
On patients' initial admission, the VC-SEPS was associated with the risk of developing sepsis, and the score predicted sepsis an average of 68.05 min ahead of the diagnosis time given by an operational definition of sepsis.</p><p><strong>Discussion: </strong>VC-SEPS could assist medical staff with early diagnosis and intervention in clinical practice by providing a sepsis risk score. Assisting prompt recognition can significantly shorten the time between recognition and intervention in clinical decision-making processes.</p><p><strong>Conclusion: </strong>This study suggests that using a clinical decision support system can help improve hospital workflows as well as the quality of medical care.</p>","PeriodicalId":9050,"journal":{"name":"BMJ Health & Care Informatics","volume":"32 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2025-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12182025/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144336308","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Usability evaluation of a DHIS2-based electronic information management system for environmental, occupational health and food safety in Sri Lanka.","authors":"Prabhadini Godage, Sapumal Dhanapala, Achala Jayatilleke","doi":"10.1136/bmjhci-2024-101357","DOIUrl":"https://doi.org/10.1136/bmjhci-2024-101357","url":null,"abstract":"<p><strong>Objectives: </strong>The Public Health Inspector (PHI) Monthly Report is a critical document that provides insights into environmental, occupational health and food safety aspects within each Medical Officer of Health area in Sri Lanka. Currently, PHIs use a paper format to track these key health indicators, resulting in incomplete and inaccurate national data. This study evaluates the usability of a DHIS2 (District Health Information Software 2) based digital solution to improve PHI reporting.</p><p><strong>Methods: </strong>The DHIS2 system was customised to address the gaps in the current reporting process, and its usability was evaluated using the System Usability Scale (SUS) with 50 stakeholders who tested the system.</p><p><strong>Results: </strong>The DHIS2 platform was flexible enough to be customised to meet the requirements of the new electronic Environmental, Occupational Health and Food Safety Information Management System (eEOHFSIMS). The system achieved an average SUS score of 72.25, exceeding the accepted benchmark of 68, with a high SD of 13.37. However, a 92% knowledge gap remained.</p><p><strong>Discussion: </strong>Digitising the PHI monthly report using DHIS2 addresses the challenges of traditional paper-based reporting, enabling timely monitoring of public health indicators. 
The favourable SUS score confirms the system's high usability, yet the knowledge gap underscores the need for ongoing user training to ensure data quality.</p><p><strong>Conclusions: </strong>The eEOHFSIMS demonstrated its capacity to deliver accurate, complete and timely data, greatly benefiting Sri Lanka's primary healthcare services. This system enhancement supports better-informed decision-making, aligns with national health policies and enables continuous monitoring and evaluation of public health services.</p>","PeriodicalId":9050,"journal":{"name":"BMJ Health & Care Informatics","volume":"32 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2025-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12164650/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144293280","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards prehospital risk stratification using deep learning for ECG interpretation in suspected acute coronary syndrome.","authors":"Jesse P A Demandt, Thomas P Mast, Konrad A J van Beek, Arjan Koks, Marieke C V Bastiaansen, Pim A L Tonino, Marcel van 't Veer, Frederik M Zimmermann, Pieter-Jan Vlaar","doi":"10.1136/bmjhci-2024-101292","DOIUrl":"10.1136/bmjhci-2024-101292","url":null,"abstract":"<p><strong>Objectives: </strong>Most patients presenting with chest pain in the emergency medical services (EMS) setting are suspected of non-ST-elevation acute coronary syndrome (NSTE-ACS). Distinguishing true NSTE-ACS from non-cardiac chest pain based solely on the ECG is challenging. The aim of this study is to develop and validate a convolutional neural network (CNN)-based model for risk stratification of suspected NSTE-ACS patients and to compare its performance with currently available prehospital diagnostic tools.</p><p><strong>Methods: </strong>For this study, an internal training cohort and an external validation cohort were used, both consisting of suspected NSTE-ACS patients. A CNN (ECG interpretation by CNN (ECG-AI)) was trained and validated to detect NSTE-ACS. The diagnostic value of ECG-AI in detecting NSTE-ACS was compared with on-site ECG analyses by an EMS paramedic (ECG-EMS), point-of-care troponin assessment and a validated prehospital clinical risk score (prehospital History, ECG, Age, Risk factors and POC-troponin (preHEART)).</p><p><strong>Results: </strong>A total of 5645 patients suspected of NSTE-ACS were included. In the external validation cohort (n=754), 27% were diagnosed with NSTE-ACS. ECG-AI had a better diagnostic performance than ECG-EMS (area under the curve (AUROC) 0.70 (0.66 to 0.74) vs AUROC 0.65 (0.61 to 0.70), p=0.045) for diagnosing NSTE-ACS. The overall diagnostic accuracy of preHEART was AUROC 0.78 (0.74 to 0.82) and superior compared with ECG-AI (p=0.001). 
Incorporating ECG-AI into preHEART led to a significant improvement in diagnostic performance (AUROC 0.83 (0.79 to 0.86), p<0.001).</p><p><strong>Discussion: </strong>Correctly identifying patients who are at low risk for having NSTE-ACS is crucial for optimal triage in the prehospital setting. Recent studies have shown that these low-risk patients could potentially be left at home or transferred to a general practitioner, leading to less emergency department overcrowding and lower healthcare costs. Other studies demonstrated better overall diagnostic performance compared with our artificial intelligence (AI) model. However, these studies were aimed at a study population with a high prevalence of occlusive myocardial infarction, which could explain the differing levels of diagnostic performance.</p><p><strong>Conclusion: </strong>Integrating AI in prehospital ECG interpretation improves the identification of patients at low risk for having NSTE-ACS. Nonetheless, clinical risk scores currently yield the best diagnostic performance and their accuracy could be further enhanced through AI. Our results pave the way for new studies focused on exploring the role of AI in prehospital risk-stratification efforts.</p>","PeriodicalId":9050,"journal":{"name":"BMJ Health & Care Informatics","volume":"32 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2025-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12161418/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144246356","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GenECG: a synthetic image-based ECG dataset to augment artificial intelligence-enhanced algorithm development.","authors":"Neil Bodagh, Kyaw Soe Tun, Adam Barton, Malihe Javidi, Darwon Rashid, Rachel Burns, Irum Kotadia, Magda Klis, Ali Gharaviri, Vinush Vigneswaran, Steven Niederer, Mark O'Neill, Miguel O Bernabeu, Steven E Williams","doi":"10.1136/bmjhci-2024-101335","DOIUrl":"10.1136/bmjhci-2024-101335","url":null,"abstract":"<p><strong>Objectives: </strong>An image-based ECG dataset incorporating visual imperfections common to paper-based ECGs, which are typically scanned or photographed into electronic health records, could facilitate clinically useful artificial intelligence (AI)-ECG algorithm development. This study aimed to create a high-fidelity, synthetic image-based ECG dataset.</p><p><strong>Methods: </strong>ECG images were recreated from the PTB-XL database, a signal-based dataset, and image manipulation techniques were applied to mimic imperfections associated with ECGs in real-world settings. Clinical Turing tests were conducted to evaluate the fidelity of the synthetic images, and the performance of current AI-ECG algorithms was assessed using synthetic images containing visual imperfections.</p><p><strong>Results: </strong>GenECG, an image-based dataset containing 21 799 ECGs with visual imperfections encountered in routine clinical care paired with imperfection-free images, was created. Turing tests confirmed the realism of the images: expert observer accuracy of discrimination between real-world and synthetic ECGs fell from 63.9% (95% CI 58.0% to 69.8%) to 53.3% (95% CI 48.6% to 58.1%) over three rounds of testing, indicating that observers could not distinguish between synthetic and real ECGs. The performance of pre-existing algorithms on synthetic (area under the curve (AUC) 0.592, 95% CI 0.421 to 0.763) and real-world (AUC 0.647, 95% CI 0.520 to 0.774) ECG images containing imperfections was limited. 
Algorithm fine-tuning with GenECG data improved real-world ECG classification accuracy (AUC 0.821, 95% CI 0.730 to 0.913) demonstrating its potential to augment image-based algorithm development.</p><p><strong>Discussion/conclusion: </strong>GenECG is the first synthetic image-based ECG dataset to pass a clinical Turing test. The dataset will enable image-based AI-ECG algorithm development, ensuring utility in low resource areas, prehospital settings and hospital environments where signal data are unavailable.</p>","PeriodicalId":9050,"journal":{"name":"BMJ Health & Care Informatics","volume":"32 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2025-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12142132/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144198160","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Is it possible to encourage TB testing and detect missing TB cases via community-level promotion of a self-screening mobile application? Quasi-experimental evidence from South Africa.","authors":"Kate Rich, Ronelle Burger, Deanne Goldberg, Harry Moultrie, Matthias Rieger","doi":"10.1136/bmjhci-2024-101179","DOIUrl":"10.1136/bmjhci-2024-101179","url":null,"abstract":"<p><strong>Objectives: </strong>While mobile health (mHealth) interventions are widespread, few studies assess impacts at the population level in low-income and middle-income countries. South Africa's tuberculosis (TB) burden is high, and a substantial share of cases remain undiagnosed. We evaluate the impacts of community activations of TBCheck, a WhatsApp/USSD-based chatbot that allows individuals to evaluate themselves for TB risk.</p><p><strong>Methods: </strong>We use a quasi-experimental approach comparing treated and control subdistricts nationally before and after community activations using dashboard data from the TBCheck platform and weekly or quarterly subdistrict TB test data from the National Health Laboratory Service. Dependent variables are the number of self-screening tests on the platform, total tests and number of positive tests per subdistrict. We employ dynamic difference-in-differences models accounting for subdistrict unobservables and time trends using weekly data, and synthetic control methods matching on preintervention trends in outcomes using quarterly data.</p><p><strong>Results: </strong>Impact estimates suggest an increase in the number of self-screening tests on the platform (487.53, p-value<0.01) as well as TB tests (107.90, p-value=0.05) in treated relative to control subdistricts due to intervention activities in the week of the intervention. 
After 2 weeks, impacts on the number of self-screening tests are insignificant (-6.18, p=0.23), and after 1 week, impacts on TB tests are insignificant (36.44, p-value=0.32).</p><p><strong>Discussion and conclusion: </strong>Activation activities associated with TBCheck led to short-lived and variable impacts on uptake and tests in target subdistricts. Alternative strategies are required for sustained uptake of such mHealth tools.</p>","PeriodicalId":9050,"journal":{"name":"BMJ Health & Care Informatics","volume":"32 1","pages":""},"PeriodicalIF":4.1,"publicationDate":"2025-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12128445/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144198161","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}