{"title":"Barriers and Enablers to Using a Mobile App-Based Clinical Decision Support System in Managing Perioperative Adverse Events Among Anesthesia Providers: Cross-Sectional Survey in China.","authors":"Xixia Feng, Peiyi Li, Renjie Zhao, Weimin Li, Tao Zhu, Xuechao Hao, Guo Chen","doi":"10.2196/60304","DOIUrl":"10.2196/60304","url":null,"abstract":"<p><strong>Background: </strong>Perioperative adverse events (PAEs) pose a substantial global health burden, contributing to elevated morbidity, mortality, and health care expenditures. The adoption of clinical decision support systems (CDSS), particularly mobile-based solutions, offers a promising avenue to address these challenges. However, successful implementation hinges on understanding anesthesia providers' knowledge, attitudes, and willingness to embrace such technologies.</p><p><strong>Objective: </strong>This study aimed to evaluate the knowledge, attitudes, and willingness of Chinese anesthesia professionals to adopt a mobile CDSS for PAE management, and to identify key factors influencing its implementation.</p><p><strong>Methods: </strong>A nationwide cross-sectional survey was conducted among anesthesia providers in China from September 5 to December 31, 2023. Participants included anesthesiologists and nurse anesthetists, who play pivotal roles in perioperative care. A 51-item questionnaire, structured around the Knowledge-Attitude-Practice (KAP) framework, was distributed via WeChat through professional anesthesia associations. The questionnaire covered four domains: (1) demographic characteristics, (2) knowledge assessment, (3) attitude evaluation, and (4) practice willingness. Multivariable regression analyses identified predictors of KAP outcomes, with sensitivity analyses focusing on nurse anesthetists.</p><p><strong>Results: </strong>The study included 2440 anesthesia professionals (2226 anesthesiologists and 214 nurse anesthetists). 
Overall, 87.3% (2130/2440) expressed willingness to adopt the CDSS, with 87.5% (1947/2226) of anesthesiologists and 85.5% (183/214) of nurse anesthetists showing readiness. However, only 39.2% (956/2440) were satisfied with existing incident management systems. Key findings indicated that higher knowledge scores were associated with female gender (coefficient=0.19, P=.003), advanced education, and lack of previous informatics experience (coefficient=0.29, P<.001). Nurse anesthetists scored lower than anesthesiologists (coefficient=-0.76, P<.001). Negative attitudes were more prevalent among older practitioners (coefficient=-0.13, P<.001), females (coefficient=-0.66, P<.001), nurse anesthetists (coefficient=-1.12, P=.003), and those without prior PAE exposure (coefficient=-0.97, P<.001). Higher willingness was observed among practitioners in Southwest China (coefficient=0.10, P=.048), those with positive attitudes (coefficient=0.06, P<.001), and those dissatisfied (coefficient=0.32, P<.001) or neutral (coefficient=0.11, P=.02) toward existing systems. Infrequent departmental incident discussions were associated with lower practice willingness (coefficient=-0.08, P=.01).</p><p><strong>Conclusions: </strong>This national study highlights a strong readiness among Chinese anesthesia professionals to adopt mobile CDSS for PAE management. 
However, critical barriers, including role-specific knowledge disparities and i","PeriodicalId":16337,"journal":{"name":"Journal of Medical Internet Research","volume":"27 ","pages":"e60304"},"PeriodicalIF":5.8,"publicationDate":"2025-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12117274/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144022404","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
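The multivariable regression analysis this abstract describes can be sketched in a few lines. Everything below (predictor coding, sample size, coefficients, noise) is invented for illustration and is not the study's data:

```python
import numpy as np

# Hedged sketch: multivariable linear regression of a knowledge score on
# hypothetical predictors (sex, education level, informatics experience),
# mirroring the kind of KAP-predictor analysis the abstract reports.
rng = np.random.default_rng(0)
n = 200
female = rng.integers(0, 2, n)       # 1 = female (hypothetical coding)
education = rng.integers(1, 4, n)    # 1-3 ordinal education level
informatics = rng.integers(0, 2, n)  # 1 = prior informatics experience
knowledge = 5 + 0.2 * female + 0.5 * education - 0.3 * informatics + rng.normal(0, 1, n)

# Design matrix with intercept; ordinary least squares fit
X = np.column_stack([np.ones(n), female, education, informatics])
coef, *_ = np.linalg.lstsq(X, knowledge, rcond=None)
print(dict(zip(["intercept", "female", "education", "informatics"], coef.round(2))))
```

With real survey data one would also report standard errors and P values (eg, via `statsmodels`), which this minimal least-squares sketch omits.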
{"title":"Combining Artificial Intelligence and Human Support in Mental Health: Digital Intervention With Comparable Effectiveness to Human-Delivered Care.","authors":"Clare E Palmer, Emily Marshall, Edward Millgate, Graham Warren, Michael Ewbank, Elisa Cooper, Samantha Lawes, Alastair Smith, Chris Hutchins-Joss, Jessica Young, Malika Bouazzaoui, Morad Margoum, Sandra Healey, Louise Marshall, Shaun Mehew, Ronan Cummins, Valentin Tablan, Ana Catarino, Andrew E Welchman, Andrew D Blackwell","doi":"10.2196/69351","DOIUrl":"10.2196/69351","url":null,"abstract":"<p><strong>Background: </strong>Escalating mental health demand exceeds existing clinical capacity, necessitating scalable digital solutions. However, engagement remains challenging. Conversational agents can enhance engagement by making digital programs more interactive and personalized, but they have not been widely adopted. This study evaluated a digital program for anxiety in comparison to external comparators. The program used an artificial intelligence (AI)-driven conversational agent to deliver clinician-written content via machine learning, with clinician oversight and user support.</p><p><strong>Objective: </strong>This study aims to evaluate the engagement, effectiveness, and safety of this structured, evidence-based digital program with human support for mild, moderate, and severe generalized anxiety. Statistical analyses sought to determine whether the program reduced anxiety more than a propensity-matched waiting control and was statistically noninferior to real-world, propensity-matched face-to-face and typed cognitive behavioral therapy (CBT).</p><p><strong>Methods: </strong>Prospective participants (N=299) were recruited from the National Health Service (NHS) or social media in the United Kingdom and given access to the digital program for up to 9 weeks (study conducted from October 2023 to May 2024). 
End points were collected before, during, and after the digital program, as well as at a 1-month follow-up. External comparator groups were created through propensity matching of the digital program sample with NHS Talking Therapies (NHS TT) data from ieso Digital Health (typed CBT) and Dorset HealthCare (DHC) University NHS Foundation Trust (face-to-face CBT). Superiority and noninferiority analyses were conducted to compare anxiety symptom reduction (change on the 7-item Generalized Anxiety Disorder Scale [GAD-7]) between the digital program group and the external comparator groups. The program included human support, and clinician time spent per participant was calculated.</p><p><strong>Results: </strong>Participants used the program for a median of 6 hours over 53 days, with 232 of the 299 (77.6%) engaged (ie, completing a median of 2 hours over 14 days). There was a large, clinically meaningful reduction in anxiety symptoms for the digital program group (per-protocol [PP; n=169]: mean GAD-7 change -7.4, d=1.6; intention-to-treat [ITT; n=99]: mean GAD-7 change -5.4, d=1.1). The PP effect was statistically superior to the waiting control (d=1.3) and noninferior to the face-to-face CBT group (P<.001) and the typed CBT group (P<.001). Similarly, for the ITT sample, the digital program showed superiority to waiting control (d=0.8) and noninferiority to face-to-face CBT (P=.002), with noninferiority to typed CBT approaching significance (P=.06). Effects were sustained at the 1-month follow-up. 
Clinicians overseeing the digital program spent a mean of 1.6 hours (range 31-200 minutes) of clinician time in sessions per participant.</p><","PeriodicalId":16337,"journal":{"name":"Journal of Medical Internet Research","volume":" ","pages":"e69351"},"PeriodicalIF":5.8,"publicationDate":"2025-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12117275/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143730436","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
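A minimal sketch of the noninferiority comparison on GAD-7 change scores, using the usual normal-approximation approach with a one-sided 95% bound. The margin, SDs, and reference-group figures below are assumptions for illustration, not values from the study:

```python
import math

def noninferiority_test(mean_new, sd_new, n_new, mean_ref, sd_ref, n_ref, margin):
    """One-sided noninferiority check on a mean change score (lower = better).

    Noninferiority holds if the upper bound of the one-sided 95% CI for
    (new - reference) lies below the margin. Normal approximation;
    illustrative only -- the study's exact procedure is not given here.
    """
    diff = mean_new - mean_ref
    se = math.sqrt(sd_new**2 / n_new + sd_ref**2 / n_ref)
    upper = diff + 1.645 * se  # one-sided 95% upper bound
    return upper < margin, round(upper, 2)

# Hypothetical GAD-7 change scores (negative = improvement), 2-point margin
ok, upper = noninferiority_test(-7.4, 5.0, 169, -7.0, 5.5, 300, margin=2.0)
print(ok, upper)
```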
{"title":"Analyzing Usage of the Metaverse by Associations of Patients With Prostate Cancer During the 2023 Blue Ribbon Campaign: Cross-Sectional Survey Study.","authors":"Jung Ki Jo, Yeeun Kim, Yun-Sok Ha, Kwang Taek Kim, Sangjun Yoo, Woo Suk Choi, Jihye Yang, Jaeeun Shin, Sun Il Kim, Jeong Hyun Kim","doi":"10.2196/63030","DOIUrl":"10.2196/63030","url":null,"abstract":"<p><strong>Background: </strong>It is important to explain early diagnosis and treatment plans to patients with prostate cancer because diagnoses are made at different stages, each with stage-specific treatment options and prognoses that vary depending on the choices made. Although various studies have implemented metaverse-based interventions across diverse clinical settings for medical education, there is a lack of publications addressing the implementation and validation of patient education using this technology.</p><p><strong>Objective: </strong>This study explored the potential of the metaverse as an educational and informational tool for prostate cancer. We measured and analyzed participants' satisfaction and perceptions following a metaverse-based prostate cancer awareness campaign. We also evaluated the feasibility and potential effectiveness of the metaverse as a platform for hosting a virtual patient association and delivering health education.</p><p><strong>Methods: </strong>The study was conducted via a questionnaire administered from September 15 to October 20, 2023, during the Blue Ribbon Campaign organized by the Korean Urological Association and the Korean Society of Urological Oncology. The postevent questionnaire was designed to assess the effectiveness of using the metaverse to increase awareness of prostate cancer. 
A total of 119 participants, including patients, caregivers, and members of the general population, completed the survey within the metaverse space and assessed their satisfaction and perceived awareness using a 5-point Likert scale.</p><p><strong>Results: </strong>The mean educational satisfaction score was 4.17 (SD 0.65), the mean psychological satisfaction score was 4.06 (SD 0.70), the mean overall satisfaction score was 4.12 (SD 0.72), and the mean awareness score was 4.09 (SD 0.72) out of a possible 5 points. Among responses rated 4 or higher (\"agree\" or \"strongly agree\"), 82.8% (394/476) were in the educational aspect, 76.6% (365/476) in psychological satisfaction, 81% (289/357) in overall satisfaction, and 80.4% (287/357) in awareness. Statistical analysis revealed significant differences in psychological (median 4.0, IQR 3.50-4.63, vs median 4.50, IQR 4.0-4.56) and overall (median 4.0, IQR 3.67-4.83, vs median 4.33, IQR 4.0-4.67) aspects between the general population group and patients and caregivers (median 4.0, IQR 3.33-4.33, vs median 4.67, IQR 4.0-4.67).</p><p><strong>Conclusions: </strong>The findings suggest that the metaverse holds promise as a platform for health care education and patient support, offering accessible and engaging experiences for patients, caregivers, and members of the general population. Our approach demonstrated a positive influence on participants' satisfaction and perceived awareness, highlighting its potential to enhance health communication and patient engagement. 
Despite these encouraging results, limitations, such as the sample being skewed toward","PeriodicalId":16337,"journal":{"name":"Journal of Medical Internet Research","volume":"27 ","pages":"e63030"},"PeriodicalIF":5.8,"publicationDate":"2025-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12117273/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144022626","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
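Group comparisons of Likert medians and IQRs like those reported above are typically done with a nonparametric rank test. A from-scratch sketch follows; the study's exact test is not stated in this excerpt, and the scores below are hypothetical:

```python
import numpy as np

def median_iqr(x):
    """Median and IQR (25th-75th percentile), as reported for Likert scores."""
    return np.median(x), (np.percentile(x, 25), np.percentile(x, 75))

def mann_whitney_u(a, b):
    """Plain Mann-Whitney U statistic with midranks for ties (no P value).
    Illustrates the kind of nonparametric group comparison behind the
    reported median/IQR differences; assumed, not the study's stated method."""
    combined = np.concatenate([a, b])
    order = combined.argsort(kind="mergesort")
    ranks = np.empty(len(combined))
    sorted_vals = combined[order]
    i = 0
    while i < len(sorted_vals):  # assign midranks to tied runs
        j = i
        while j + 1 < len(sorted_vals) and sorted_vals[j + 1] == sorted_vals[i]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2 + 1
        i = j + 1
    r_a = ranks[: len(a)].sum()
    return r_a - len(a) * (len(a) + 1) / 2

# Hypothetical 5-point Likert satisfaction scores for two respondent groups
general = np.array([3, 4, 4, 3, 5, 4, 3, 4])
patients = np.array([4, 5, 4, 5, 4, 5, 5, 4])
print(median_iqr(general), mann_whitney_u(general, patients))
```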
{"title":"Predictive Modeling of Acute Respiratory Distress Syndrome Using Machine Learning: Systematic Review and Meta-Analysis.","authors":"Jinxi Yang, Siyao Zeng, Shanpeng Cui, Junbo Zheng, Hongliang Wang","doi":"10.2196/66615","DOIUrl":"10.2196/66615","url":null,"abstract":"<p><strong>Background: </strong>Acute respiratory distress syndrome (ARDS) is a critical condition commonly encountered in the intensive care unit (ICU), characterized by a high incidence and substantial mortality rate. Early detection and accurate prediction of ARDS can significantly improve patient outcomes. While machine learning (ML) models are increasingly being used for ARDS prediction, there is a lack of consensus on the most effective model or methodology. This study is the first to systematically evaluate the performance of ARDS prediction models based on multiple quantitative data sources. We compare the effectiveness of ML models via a meta-analysis, revealing factors affecting performance and suggesting strategies to enhance generalization and prediction accuracy.</p><p><strong>Objective: </strong>This study aims to evaluate the performance of existing ARDS prediction models through a systematic review and meta-analysis, using metrics such as area under the receiver operating characteristic curve, sensitivity, specificity, and other relevant indicators. The findings will provide evidence-based insights to support the development of more accurate and effective ARDS prediction tools.</p><p><strong>Methods: </strong>We performed a search across 6 electronic databases for studies developing ML predictive models for ARDS, with a cutoff date of December 29, 2024. The risk of bias in these models was evaluated using the Prediction model Risk of Bias Assessment Tool. Meta-analyses and investigations into heterogeneity were carried out using Meta-DiSc software (version 1.4), developed by the Ramón y Cajal Hospital's Clinical Biostatistics team in Madrid, Spain. 
Furthermore, sensitivity, subgroup, and meta-regression analyses were used to explore the sources of heterogeneity more comprehensively.</p><p><strong>Results: </strong>ML models achieved a pooled area under the receiver operating characteristic curve of 0.7407 for ARDS. The additional metrics were as follows: sensitivity was 0.67 (95% CI 0.66-0.67; P<.001; I²=97.1%), specificity was 0.68 (95% CI 0.67-0.68; P<.001; I²=98.5%), the diagnostic odds ratio was 6.26 (95% CI 4.93-7.94; P<.001; I²=95.3%), the positive likelihood ratio was 2.80 (95% CI 2.46-3.19; P<.001; I²=97.3%), and the negative likelihood ratio was 0.51 (95% CI 0.46-0.57; P<.001; I²=93.6%).</p><p><strong>Conclusions: </strong>This study evaluates prediction models constructed using various ML algorithms, with results showing that ML demonstrates high performance in ARDS prediction. However, many of the existing models still have limitations. During model development, it is essential to focus on model quality, including reducing bias risk, designing appropriate sample sizes, conducting external validation, and ensuring model interpretability. Additionally, challenges such as physician trust and the need for prospective validation must also be addressed. Future research should standardize model development, optimize model perf","PeriodicalId":16337,"journal":{"name":"Journal of Medical Internet Research","volume":"27 ","pages":"e66615"},"PeriodicalIF":5.8,"publicationDate":"2025-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12117268/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143975822","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
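The pooled diagnostic metrics above are related by textbook identities. A small sketch derives the likelihood ratios and diagnostic odds ratio from sensitivity and specificity; note that Meta-DiSc pools each metric across studies separately, so the reported pooled values need not satisfy these identities exactly:

```python
def diagnostic_summary(sens, spec):
    """Textbook relationships between diagnostic accuracy metrics:
    LR+ = sens / (1 - spec), LR- = (1 - sens) / spec, DOR = LR+ / LR-.
    Illustration only; pooled meta-analytic values are estimated per metric."""
    lr_pos = sens / (1 - spec)
    lr_neg = (1 - sens) / spec
    dor = lr_pos / lr_neg
    return round(lr_pos, 2), round(lr_neg, 2), round(dor, 2)

# Pooled sensitivity and specificity from the abstract (0.67, 0.68)
print(diagnostic_summary(0.67, 0.68))
```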
{"title":"Evaluation and Bias Analysis of Large Language Models in Generating Synthetic Electronic Health Records: Comparative Study.","authors":"Ruochen Huang, Honghan Wu, Yuhan Yuan, Yifan Xu, Hao Qian, Changwei Zhang, Xin Wei, Shan Lu, Xin Zhang, Jingbao Kan, Cheng Wan, Yun Liu","doi":"10.2196/65317","DOIUrl":"10.2196/65317","url":null,"abstract":"<p><strong>Background: </strong>Synthetic electronic health records (EHRs) generated by large language models (LLMs) offer potential for clinical education and model training while addressing privacy concerns. However, performance variations and demographic biases in these models remain underexplored, posing risks to equitable health care.</p><p><strong>Objective: </strong>This study aimed to systematically assess the performance of various LLMs in generating synthetic EHRs and to critically evaluate the presence of gender and racial biases in the generated outputs. We focused on assessing the completeness and representativeness of these EHRs across 20 diseases with varying demographic prevalence.</p><p><strong>Methods: </strong>A framework was developed to generate 140,000 synthetic EHRs using 10 standardized prompts across 7 LLMs. The electronic health record performance score (EPS) was introduced to quantify completeness, while the statistical parity difference (SPD) was proposed to assess the degree and direction of demographic bias. Chi-square tests were used to evaluate the presence of bias across demographic groups.</p><p><strong>Results: </strong>Larger models exhibited superior performance but heightened biases. The Yi-34B achieved the highest EPS (96.8), while smaller models (Qwen-1.8B: EPS=63.35) underperformed. 
Sex polarization emerged: female-dominated diseases (eg, multiple sclerosis) saw amplified female representation in outputs (Qwen-14B: 973/1000, 97.3% female vs 564,424/744,778, 75.78% real; SPD=+21.50%), while balanced diseases and male-dominated diseases skewed toward the male group (eg, hypertension, Llama 2-13B: 957/1000, 95.7% male vs 79,540,040/152,466,669, 52.17% real; SPD=+43.50%). Racial bias patterns revealed that some models overestimated the representation of White (eg, Yi-6B: mean SPD +14.40%, SD 16.22%) or Black groups (eg, Yi-34B: mean SPD +14.90%, SD 27.16%), while most models systematically underestimated the representation of Hispanic (average SPD across 7 models is -11.93%, SD 8.36%) and Asian groups (average SPD across 7 models is -0.77%, SD 11.99%).</p><p><strong>Conclusions: </strong>Larger models, such as Yi-34B, Qwen-14B, and Llama 2-13B, showed improved performance in generating more comprehensive EHRs, as reflected in higher EPS values. However, this increased performance was accompanied by a notable escalation in both gender and racial biases, highlighting a performance-bias trade-off. The study identified 4 key findings as follows: (1) as model size increased, EHR generation improved, but demographic biases also became more pronounced; (2) biases were observed across all models, not just the larger ones; (3) gender bias closely aligned with real-world disease prevalence, while racial bias was evident in only a subset of diseases; and (4) racial biases varied, with some diseases showing overrepresentation of White or Black populations and underrepresentation of Hispanic and Asian groups. 
These findings u","PeriodicalId":16337,"journal":{"name":"Journal of Medical Internet Research","volume":"27 ","pages":"e65317"},"PeriodicalIF":5.8,"publicationDate":"2025-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12107208/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144014615","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
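The statistical parity difference (SPD) as described reduces to the gap between a group's share in generated EHRs and its real-world share. The formula below is inferred from the abstract's worked examples (97.3% generated vs 75.78% real giving an SPD of roughly +21.5 percentage points):

```python
def statistical_parity_difference(gen_count, gen_total, real_count, real_total):
    """SPD in percentage points: generated-cohort group share minus
    real-world group share. Positive = overrepresented in generated EHRs.
    Formula assumed from the abstract's reported examples."""
    gen_share = gen_count / gen_total
    real_share = real_count / real_total
    return round((gen_share - real_share) * 100, 2)

# Figures from the abstract: 973/1000 generated female vs 564,424/744,778 real
print(statistical_parity_difference(973, 1000, 564424, 744778))
```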
{"title":"Classifying the Information Needs of Survivors of Domestic Violence in Online Health Communities Using Large Language Models: Prediction Model Development and Evaluation Study.","authors":"Shaowei Guan, Vivian Hui, Gregor Stiglic, Rose Eva Constantino, Young Ji Lee, Arkers Kwan Ching Wong","doi":"10.2196/65397","DOIUrl":"10.2196/65397","url":null,"abstract":"<p><strong>Background: </strong>Domestic violence (DV) is a significant public health concern affecting the physical and mental well-being of numerous women, imposing a substantial health care burden. However, women facing DV often encounter barriers to seeking in-person help due to stigma, shame, and embarrassment. As a result, many survivors of DV turn to online health communities as a safe and anonymous space to share their experiences and seek support. Understanding the information needs of survivors of DV in online health communities through multiclass classification is crucial for providing timely and appropriate support.</p><p><strong>Objective: </strong>The objective was to develop a fine-tuned large language model (LLM) that can provide fast and accurate predictions of the information needs of survivors of DV from their online posts, enabling health care professionals to offer timely and personalized assistance.</p><p><strong>Methods: </strong>We collected 294 posts from Reddit subcommunities focused on DV shared by women aged ≥18 years who self-identified as experiencing intimate partner violence. We identified 8 types of information needs: shelters/DV centers/agencies; legal; childbearing; police; DV report procedure/documentation; safety planning; DV knowledge; and communication. Data augmentation was applied using GPT-3.5 to expand our dataset to 2216 samples by generating 1922 additional posts that imitated the existing data. We adopted a progressive training strategy to fine-tune GPT-3.5 for multiclass text classification using 2032 posts. 
We trained the model on 1 class at a time, monitoring performance closely. When suboptimal results were observed, we generated additional samples of the misclassified ones to give them more attention. We reserved 184 posts for internal testing and 74 for external validation. Model performance was evaluated using accuracy, recall, precision, and F<sub>1</sub>-score, along with CIs for each metric.</p><p><strong>Results: </strong>Using 40 real posts and 144 artificial intelligence-generated posts as the test dataset, our model achieved an F<sub>1</sub>-score of 70.49% (95% CI 60.63%-80.35%) for real posts, outperforming the original GPT-3.5 and GPT-4, fine-tuned Llama 2-7B and Llama 3-8B, and long short-term memory. On artificial intelligence-generated posts, our model attained an F<sub>1</sub>-score of 84.58% (95% CI 80.38%-88.78%), surpassing all baselines. When tested on an external validation dataset (n=74), the model achieved an F<sub>1</sub>-score of 59.67% (95% CI 51.86%-67.49%), outperforming other models. Statistical analysis revealed that our model significantly outperformed the others in F<sub>1</sub>-score (P=.047 for real posts; P<.001 for external validation posts). 
Furthermore, our model was faster, taking 19.108 seconds for predictions versus 1150 seconds for manual assessment.</p><p><strong>Conclusions: </strong>Our fine-tuned LLM can accurately and efficiently extract and identify","PeriodicalId":16337,"journal":{"name":"Journal of Medical Internet Research","volume":"27 ","pages":"e65397"},"PeriodicalIF":5.8,"publicationDate":"2025-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12107195/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144017321","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
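Per-class F1-scores with CIs, as reported above, can be sketched from scratch. The labels below are invented, and the percentile bootstrap is one common way to obtain such CIs, not necessarily the study's exact evaluation pipeline:

```python
import random

def f1_score(y_true, y_pred, positive):
    """Binary F1 for one class (one-vs-rest), as used per information-need
    category; a from-scratch sketch."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def bootstrap_ci(y_true, y_pred, positive, n_boot=1000, seed=0):
    """Percentile bootstrap 95% CI for F1 (one plausible CI method)."""
    rng = random.Random(seed)
    n = len(y_true)
    scores = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(f1_score([y_true[i] for i in idx],
                               [y_pred[i] for i in idx], positive))
    scores.sort()
    return scores[int(0.025 * n_boot)], scores[int(0.975 * n_boot)]

# Hypothetical information-need labels for a handful of posts
y_true = ["legal", "police", "legal", "shelter", "legal", "police"]
y_pred = ["legal", "legal", "legal", "shelter", "police", "police"]
lo, hi = bootstrap_ci(y_true, y_pred, "legal")
print(round(f1_score(y_true, y_pred, "legal"), 2), (lo, hi))
```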
{"title":"Large Language Models and Artificial Neural Networks for Assessing 1-Year Mortality in Patients With Myocardial Infarction: Analysis From the Medical Information Mart for Intensive Care IV (MIMIC-IV) Database.","authors":"Boqun Shi, Liangguo Chen, Shuo Pang, Yue Wang, Shen Wang, Fadong Li, Wenxin Zhao, Pengrong Guo, Leli Zhang, Chu Fan, Yi Zou, Xiaofan Wu","doi":"10.2196/67253","DOIUrl":"10.2196/67253","url":null,"abstract":"<p><strong>Background: </strong>Accurate mortality risk prediction is crucial for effective cardiovascular risk management. Recent advancements in artificial intelligence (AI) have demonstrated potential in this specific medical field. Qwen-2 and Llama-3 are high-performance, open-source large language models (LLMs) available online. An artificial neural network (ANN) algorithm derived from the SWEDEHEART (Swedish Web System for Enhancement and Development of Evidence-Based Care in Heart Disease Evaluated According to Recommended Therapies) registry, termed SWEDEHEART-AI, can predict patient prognosis following acute myocardial infarction (AMI).</p><p><strong>Objective: </strong>This study aims to evaluate the 3 models mentioned above in predicting 1-year all-cause mortality in critically ill patients with AMI.</p><p><strong>Methods: </strong>The Medical Information Mart for Intensive Care IV (MIMIC-IV) database is a publicly available data set in critical care medicine. We included 2758 patients who were first admitted for AMI and discharged alive. SWEDEHEART-AI calculated the mortality rate based on each patient's 21 clinical variables. Qwen-2 and Llama-3 analyzed the content of patients' discharge records and directly provided a 1-decimal value between 0 and 1 to represent 1-year death risk probabilities. The patients' actual mortality was verified using follow-up data. 
The predictive performance of the 3 models was assessed and compared using the Harrell C-statistic (C-index), the area under the receiver operating characteristic curve (AUROC), calibration plots, Kaplan-Meier curves, and decision curve analysis.</p><p><strong>Results: </strong>SWEDEHEART-AI demonstrated strong discrimination in predicting 1-year all-cause mortality in patients with AMI, with a higher C-index than Qwen-2 and Llama-3 (C-index 0.72, 95% CI 0.69-0.74 vs C-index 0.65, 95% CI 0.62-0.67 vs C-index 0.56, 95% CI 0.53-0.58, respectively; P<.001 for both comparisons). SWEDEHEART-AI also showed high and consistent AUROC in the time-dependent ROC curve. The death rates calculated by SWEDEHEART-AI were positively correlated with actual mortality, and the 3 risk classes derived from this model showed clear differentiation in the Kaplan-Meier curve (P<.001). Calibration plots indicated that SWEDEHEART-AI tended to overestimate mortality risk, with an observed-to-expected ratio of 0.478. Compared with the LLMs, SWEDEHEART-AI demonstrated positive and greater net benefits at risk thresholds below 19%.</p><p><strong>Conclusions: </strong>SWEDEHEART-AI, a trained ANN model, demonstrated the best performance, with strong discrimination and clinical utility in predicting 1-year all-cause mortality in patients with AMI from an intensive care cohort. Among the LLMs, Qwen-2 outperformed Llama-3 and showed moderate predictive value. Qwen-2 and SWEDEHEART-AI exhibited comparable classification effectiveness. 
The future integration of LLMs into clinical decision support systems holds promis","PeriodicalId":16337,"journal":{"name":"Journal of Medical Internet Research","volume":"27 ","pages":"e67253"},"PeriodicalIF":5.8,"publicationDate":"2025-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12107198/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144022549","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
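The Harrell C-statistic used to compare the 3 models can be computed with a simple pairwise scan. A minimal O(n²) sketch on invented follow-up data (risk ties count 0.5):

```python
def harrell_c_index(times, events, risks):
    """Harrell C-statistic: among comparable pairs, the fraction where the
    higher predicted risk belongs to the subject with the earlier event.
    A pair (i, j) is comparable when i had an event before j's observed time.
    Minimal sketch for illustration; real implementations handle censoring
    structures more carefully."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable

# Hypothetical 1-year follow-up: time (days), event (1 = death), model risk
times = [100, 365, 200, 365, 50]
events = [1, 0, 1, 0, 1]
risks = [0.9, 0.1, 0.7, 0.3, 0.8]
print(harrell_c_index(times, events, risks))
```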
{"title":"Types of HPV Vaccine Misinformation Circulating on Twitter (X) That Parents Find Most Concerning: Insights From a Cross-Sectional Survey and Content Analysis.","authors":"Jennifer C Morgan, Sarah Badlis, Katharine J Head, Gregory Zimet, Joseph N Cappella, Melanie L Kornides","doi":"10.2196/54657","DOIUrl":"10.2196/54657","url":null,"abstract":"<p><strong>Background: </strong>Parents frequently use social media as a source of information about the human papillomavirus (HPV) vaccine. Our previous work identified that, on Twitter (now X), almost 25% of tweets about the HPV vaccine contain misinformation, and these tweets receive higher audience engagement than accurate tweets. Exposure to misinformation can increase vaccine hesitancy, but the types of misinformation found on social media vary widely, and not all misinformation exposure influences vaccine attitudes and vaccine uptake. Despite the prevalence of misinformation and antivaccine information on social media, little work has assessed parents' assessments of these posts.</p><p><strong>Objective: </strong>This study examines which types of misinformation on Twitter parents find the most concerning.</p><p><strong>Methods: </strong>In April 2022, we surveyed 263 US parents of children ages 7-10 years using a Qualtrics survey panel. They viewed a first round of 9 randomly selected tweets from a pool of 126 tweets circulating on Twitter that contained misinformation about the HPV vaccine. Then parents selected up to 3 that they found most concerning. The process was repeated once more with 9 selected from the pool of 117 messages not shown in the first round. Using this information, a concern score for each tweet was calculated based on the number of parents who viewed the tweet and selected it as concerning. In total, 2 researchers independently coded the misinformation tweets to identify rhetorical strategies used and health concerns mentioned. 
Multiple linear regression tested whether tweet content significantly predicted the concern score of the tweet.</p><p><strong>Results: </strong>Parental concern about the different misinformation tweets varied widely, with some misinformation being selected as most concerning just 2.8% of the time it was viewed and other misinformation being selected 79.5% of the time it was viewed. Multiple beta regression analyses found that misinformation tweets using negative emotional appeals (b=.79, P<.001), expressing pharmaceutical company skepticism (b=.36, P=.036), invoking governmental authority (b=.44, P=.02), and mentioning hospitalization (b=1.00, P=.003), paralysis (b=.54, P=.02), and infertility (b=.52, P=.04) significantly increased the percent of parents rating the misinformation tweets as most concerning.</p><p><strong>Conclusions: </strong>Misinformation about HPV vaccination is ubiquitous on social media, and it would be impossible to target and correct all of it. Counter-messaging campaigns and interventions to combat misinformation need to focus on the types of misinformation that concern parents and ultimately may impact vaccine uptake. 
Results from this study identify the misinformation content that parents find most concerning and provide a useful list of targets for researchers developing interventions to combat misinformation with the goal of increasing HPV vaccine uptake.</p>","PeriodicalId":16337,"journal":{"name":"Journal of Medical Internet Research","volume":"27 ","pages":"e54657"},"PeriodicalIF":5.8,"publicationDate":"2025-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12107203/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143988779","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
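The per-tweet concern score described in the abstract is the share of parents who, having viewed a tweet, selected it as one of their "most concerning" picks. A minimal sketch of that computation, with illustrative view/selection counts (the function name and the example numbers are assumptions, not the authors' code):

```python
# Sketch of the concern score: selections divided by views for each tweet.
# The abstract reports scores ranging from about 2.8% to 79.5%.

def concern_score(views: int, selections: int) -> float:
    """Fraction of viewers who flagged the tweet as most concerning."""
    if views == 0:
        raise ValueError("tweet was never shown")
    return selections / views

# Hypothetical counts chosen to reproduce the reported extremes.
low = concern_score(views=1000, selections=28)
high = concern_score(views=200, selections=159)
print(f"{low:.1%}, {high:.1%}")  # 2.8%, 79.5%
```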
{"title":"Effects of Digital Sleep Interventions on Sleep Among College Students and Young Adults: Systematic Review and Meta-Analysis.","authors":"Yi-An Lu, Hui-Chen Lin, Pei-Shan Tsai","doi":"10.2196/69657","DOIUrl":"10.2196/69657","url":null,"abstract":"<p><strong>Background: </strong>College students and young adults (18-25 years) frequently experience poor sleep quality, with insomnia being particularly prevalent among this population. Given the widespread use of digital devices in the modern world, electronic device-based sleep interventions present a promising solution for improving sleep outcomes. However, their effects in this population remain underexplored.</p><p><strong>Objective: </strong>We aimed to synthesize current evidence on the effectiveness of electronic device-based sleep interventions in enhancing sleep outcomes among college students and young adults.</p><p><strong>Methods: </strong>In total, 5 electronic databases (PubMed, CINAHL, Cochrane Library, Embase, and Web of Science) were searched to identify randomized controlled trials on digital sleep interventions. Sleep interventions, including cognitive behavioral therapy for insomnia, mindfulness, and sleep education programs delivered via web-based platforms or mobile apps, were evaluated for their effects on sleep quality, sleep parameters, and insomnia severity. Pooled estimates of postintervention and follow-up effects were calculated using Hedges g and 95% CIs under a random-effects model. Heterogeneity was assessed with I<sup>2</sup> statistics, and moderator and meta-regression analyses were performed to explore sources of heterogeneity. Evidence quality was evaluated using the Grading of Recommendations Assessment, Development, and Evaluations framework.</p><p><strong>Results: </strong>This study included 13 studies involving 5251 participants. 
Digital sleep interventions significantly improved sleep quality (Hedges g=-1.25, 95% CI -1.83 to -0.66; I<sup>2</sup>=97%), sleep efficiency (Hedges g=0.62, 95% CI 0.18-1.05; I<sup>2</sup>=60%), insomnia severity (Hedges g=-4.08, 95% CI -5.14 to -3.02; I<sup>2</sup>=99%), dysfunctional beliefs and attitudes about sleep (Hedges g=-1.54, 95% CI -3.33 to -0.99; I<sup>2</sup>=85%), sleep hygiene (Hedges g=-0.19, 95% CI -0.34 to -0.03; I<sup>2</sup>=0%), and sleep knowledge (Hedges g=0.27, 95% CI 0.09-0.45; I<sup>2</sup>=0%). The follow-up effects were significant for sleep quality (Hedges g=-0.53, 95% CI -0.96 to -0.11; I<sup>2</sup>=78%) and insomnia severity (Hedges g=-2.65, 95% CI -3.89 to -1.41; I<sup>2</sup>=99%). Moderator analyses revealed several significant sources of heterogeneity in the meta-analysis examining the effects of digital sleep interventions on sleep outcomes. Variability in sleep quality was influenced by the sleep assessment tool (P<.001), intervention type and duration (P=.001), therapist guidance (P<.001), delivery mode (P=.002), history of insomnia (P<.001), and the use of intention-to-treat analysis (P=.001). Heterogeneity in insomnia severity was primarily attributed to differences in the sleep assessment tool (P<.001), while the effect size on sleep efficiency varied based on intervention duration (P=.02).
The evidence quality ranged from moderate t","PeriodicalId":16337,"journal":{"name":"Journal of Medical Internet Research","volume":"27 ","pages":"e69657"},"PeriodicalIF":5.8,"publicationDate":"2025-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12107209/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144011183","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
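The abstract above pools Hedges g estimates under a random-effects model and reports I<sup>2</sup> heterogeneity. A minimal sketch of the standard calculations (Hedges' small-sample-corrected standardized mean difference and DerSimonian-Laird pooling); this is the conventional method, not code from the review itself:

```python
import math

def hedges_g(m1, m2, sd1, sd2, n1, n2):
    """Standardized mean difference with Hedges' small-sample correction."""
    sp = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / sp
    j = 1 - 3 / (4 * (n1 + n2) - 9)  # correction factor J
    return j * d

def pool_random_effects(effects, variances):
    """DerSimonian-Laird pooled estimate, 95% CI, and I^2 (percent)."""
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)          # between-study variance
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), i2
```

Identical study effects pool to that same value with I<sup>2</sup>=0%, while widely dispersed effects drive I<sup>2</sup> toward the 97%-99% range reported above.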
{"title":"Clinical Value of ChatGPT for Epilepsy Presurgical Decision-Making: Systematic Evaluation of Seizure Semiology Interpretation.","authors":"Yaxi Luo, Meng Jiao, Neel Fotedar, Jun-En Ding, Ioannis Karakis, Vikram R Rao, Melissa Asmar, Xiaochen Xian, Orwa Aboud, Yuxin Wen, Jack J Lin, Fang-Ming Hung, Hai Sun, Felix Rosenow, Feng Liu","doi":"10.2196/69173","DOIUrl":"10.2196/69173","url":null,"abstract":"<p><strong>Background: </strong>For patients with drug-resistant focal epilepsy, surgical resection of the epileptogenic zone (EZ) is an effective treatment to control seizures. Accurate localization of the EZ is crucial and is typically achieved through comprehensive presurgical approaches such as seizure semiology interpretation, electroencephalography (EEG), magnetic resonance imaging (MRI), and intracranial EEG (iEEG). However, interpreting seizure semiology is challenging because it heavily relies on expert knowledge. The semiologies are often inconsistent and incoherent, leading to variability and potential limitations in presurgical evaluation. To overcome these challenges, advanced technologies like large language models (LLMs)-with ChatGPT being a notable example-offer valuable tools for analyzing complex textual information, making them well-suited to interpret detailed seizure semiology descriptions and accurately localize the EZ.</p><p><strong>Objective: </strong>This study evaluates the clinical value of ChatGPT for interpreting seizure semiology to localize EZs in presurgical assessments for patients with focal epilepsy and compares its performance with that of epileptologists.</p><p><strong>Methods: </strong>We compiled 2 data cohorts: a publicly sourced cohort of 852 semiology-EZ pairs from 193 peer-reviewed journal publications and a private cohort of 184 semiology-EZ pairs collected from Far Eastern Memorial Hospital (FEMH) in Taiwan. 
ChatGPT was evaluated to predict the most likely EZ locations using 2 prompt methods: zero-shot prompting (ZSP) and few-shot prompting (FSP). To compare the performance of ChatGPT, 8 epileptologists were recruited to participate in an online survey to interpret 100 randomly selected semiology records. The responses from ChatGPT and epileptologists were compared using 3 metrics: regional sensitivity (RSens), weighted sensitivity (WSens), and net positive inference rate (NPIR).</p><p><strong>Results: </strong>In the publicly sourced cohort, ChatGPT demonstrated high RSens reliability, achieving 80% to 90% for the frontal and temporal lobes; 20% to 40% for the parietal lobe, occipital lobe, and insular cortex; and only 3% for the cingulate cortex. The WSens, which accounts for biased data distribution, consistently exceeded 67%, while the mean NPIR remained around 0. These evaluation results based on the private FEMH cohort are consistent with those from the publicly sourced cohort. A group t test with 1000 bootstrap samples revealed that ChatGPT-4 significantly outperformed epileptologists in RSens for the most frequently implicated EZs, such as the frontal and temporal lobes (P<.001). Additionally, ChatGPT-4 demonstrated superior overall performance in WSens (P<.001). 
However, no significant differences were observed between ChatGPT and the epileptologists in NPIR, highlighting comparable performance in this metric.</p><p><strong>Conclusions: </strong>ChatGPT demonstrated clinical value as a tool to","PeriodicalId":16337,"journal":{"name":"Journal of Medical Internet Research","volume":"27 ","pages":"e69173"},"PeriodicalIF":5.8,"publicationDate":"2025-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12107199/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144014613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
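The abstract above scores ChatGPT against ground-truth epileptogenic zones with regional sensitivity (RSens) and weighted sensitivity (WSens). The exact formulas are not given in the abstract, so this sketch assumes the common definitions: RSens is per-region recall over cases, and WSens weights each region's recall by how often that region appears in the ground truth:

```python
# Assumed definitions of RSens and WSens for semiology-EZ evaluation;
# truth/predicted are parallel lists of region-label sets, one per case.
from collections import defaultdict

def regional_sensitivity(truth, predicted):
    """Per-region recall: fraction of cases with that true region
    in which the prediction also included it."""
    hits, totals = defaultdict(int), defaultdict(int)
    for t, p in zip(truth, predicted):
        for region in t:
            totals[region] += 1
            if region in p:
                hits[region] += 1
    return {r: hits[r] / totals[r] for r in totals}

def weighted_sensitivity(truth, predicted):
    """RSens averaged with weights proportional to region frequency,
    compensating for the skew toward frontal/temporal EZs."""
    rsens = regional_sensitivity(truth, predicted)
    totals = defaultdict(int)
    for t in truth:
        for region in t:
            totals[region] += 1
    n = sum(totals.values())
    return sum(rsens[r] * totals[r] / n for r in totals)
```

Weighting by region frequency explains why WSens can stay above 67% even when rarely implicated regions such as the cingulate cortex score only 3% RSens.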