{"title":"Correction: Relationship Between Stroke Knowledge, Health Information Literacy, and Health Self- Management Among Patients With Stroke: Multicenter Cross-Sectional Study.","authors":"Mengxue Zeng, Yanhua Liu, Ying He, Wenxia Huang","doi":"10.2196/80547","DOIUrl":"10.2196/80547","url":null,"abstract":"","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e80547"},"PeriodicalIF":3.8,"publicationDate":"2025-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12306841/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144746287","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pharmacoepidemiologic Research Based on Common Data Models: Systematic Review and Bibliometric Analysis.","authors":"Yongqi Zheng, Meng Zhang, Conghui Wang, Ling Gao, Junqing Xie, Peng Shen, Yexiang Sun, Mengling Feng, Seng Chan You, Feng Sun","doi":"10.2196/72225","DOIUrl":"10.2196/72225","url":null,"abstract":"<p><strong>Background: </strong>The adoption of common data models (CDMs) has transformed pharmacoepidemiologic research by enabling standardized data formatting and shared analytical tools across institutions. These models facilitate large-scale, multicenter studies and support timely real-world evidence generation. However, no comprehensive global evaluation of CDM applications in pharmacoepidemiology has been conducted.</p><p><strong>Objective: </strong>This study aimed to conduct a systematic review and bibliometric analysis to map the landscape of CDM usage in pharmacoepidemiology, including publication trends, institutional authors and collaborations, and citation impacts.</p><p><strong>Methods: </strong>In total, 5 English databases (PubMed, Web of Science, Embase, Scopus, and Virtual Health Library) and 4 Chinese databases (CNKI, Wan-Fang Data, VIP, and SinoMed) were searched for studies applying CDMs in pharmacoepidemiology from database inception to January 2024. Two reviewers independently screened studies and extracted information about basic publication details, methodological details, and exposure and outcome information. The studies were categorized into 2 groups according to their Total Citations per Year (TCpY), and a comparative analysis was conducted to examine the differences in characteristics between the 2 groups.</p><p><strong>Results: </strong>A total of 308 studies published between 1997 and 2024 were included, involving 1580 authors across 32 countries and 140 journals. The United States led in both publication volume and citation counts, followed by South Korea. Among the 10 most cited studies, 7 used the Vaccine Safety Datalink, 2 used Sentinel, and one used Observational Medical Outcomes Partnership. Studies were stratified by TCpY to reduce citation bias from publication timing. Comparative analysis showed that high-TCpY studies were significantly more associated with multicenter collaboration (P=.008), United States-based institutions (P=.04), and vaccine-related research (P=.009). These studies commonly featured larger sample sizes, cross-regional data, and enhanced generalizability. International collaborations primarily occurred among North America, Europe, and East Asia, with limited involvement from limited-income countries.</p><p><strong>Conclusions: </strong>This study presents the first bibliometric overview of CDM-based pharmacoepidemiologic research. The consistent output from United States institutions and increasing engagement from South Korea underscore their central roles in this field. High-TCpY studies tend to be multicenter, collaborative, and vaccine-focused, reflecting structural factors linked to research visibility and influence. Stratified citation analysis supports the value of real-world data integration and international cooperation in producing impactful studies. 
The limited involvement of limited-income countries in collaboration networks highlights a need for broader inclusion of underrepresented regions.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e72225"},"PeriodicalIF":3.8,"publicationDate":"2025-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12303556/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144735760","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
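For readers unfamiliar with the stratification described above, the following is a minimal Python sketch of how Total Citations per Year (TCpY) could be computed and used to split studies into high- and low-TCpY groups. The field names, cutoff year, and median split are illustrative assumptions, not the authors' exact procedure.

```python
# Illustrative sketch (not the authors' code): compute Total Citations per Year
# (TCpY) and split studies into high/low groups at the median.
from statistics import median

# Hypothetical records; "citations" and "year" are assumed field names.
studies = [
    {"title": "Study A", "year": 2015, "citations": 120},
    {"title": "Study B", "year": 2021, "citations": 30},
    {"title": "Study C", "year": 2018, "citations": 10},
]

CURRENT_YEAR = 2024  # assumed reference year (search cutoff)

for s in studies:
    years_since_pub = max(CURRENT_YEAR - s["year"], 1)  # avoid division by zero
    s["tcpy"] = s["citations"] / years_since_pub

cutoff = median(s["tcpy"] for s in studies)
for s in studies:
    s["group"] = "high-TCpY" if s["tcpy"] >= cutoff else "low-TCpY"
    print(s["title"], round(s["tcpy"], 2), s["group"])
```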
{"title":"Evaluation of ChatGPT-4 as an Online Outpatient Assistant in Puerperal Mastitis Management: Content Analysis of an Observational Study.","authors":"Fatih Dolu, Oğuzhan Fatih Ay, Aydın Hakan Kupeli, Enes Karademir, Muhammed Huseyin Büyükavcı","doi":"10.2196/68980","DOIUrl":"10.2196/68980","url":null,"abstract":"<p><strong>Background: </strong>The integration of artificial intelligence (AI) into clinical workflows holds promise for enhancing outpatient decision-making and patient education. ChatGPT, a large language model developed by OpenAI, has gained attention for its potential to support both clinicians and patients. However, its performance in the outpatient setting of general surgery remains underexplored.</p><p><strong>Objective: </strong>This study aimed to evaluate whether ChatGPT-4 can function as a virtual outpatient assistant in the management of puerperal mastitis by assessing the accuracy, clarity, and clinical safety of its responses to frequently asked patient questions in Turkish.</p><p><strong>Methods: </strong>Fifteen questions about puerperal mastitis were sourced from public health care websites and online forums. These questions were categorized into general information (n=2), symptoms and diagnosis (n=6), treatment (n=2), and prognosis (n=5). Each question was entered into ChatGPT-4 (September 3, 2024), and a single Turkish-language response was obtained. The responses were evaluated by a panel consisting of 3 board-certified general surgeons and 2 general surgery residents, using five criteria: sufficient length, patient-understandable language, accuracy, adherence to current guidelines, and patient safety. Quantitative metrics included the DISCERN score, Flesch-Kincaid readability score, and inter-rater reliability assessed using the intraclass correlation coefficient (ICC).</p><p><strong>Results: </strong>A total of 15 questions were evaluated. ChatGPT's responses were rated as \"excellent\" overall by the evaluators, with higher scores observed for treatment- and prognosis-related questions. A statistically significant difference was found in DISCERN scores across question types (P=.01), with treatment and prognosis questions receiving higher ratings. In contrast, no significant differences were detected in evaluator-based ratings (sufficient length, understandability, accuracy, guideline compliance, and patient safety), JAMA benchmark scores, or Flesch-Kincaid readability levels (P>.05 for all). Interrater agreement was good across all evaluation parameters (ICC=0.772); however, agreement varied when assessed by individual criteria. Correlation analyses revealed no significant overall associations between subjective ratings and objective quality measures, although a strong positive correlation between literature compliance and patient safety was identified for one question (r=0.968, P<.001).</p><p><strong>Conclusions: </strong>ChatGPT demonstrated adequate capability in providing information on puerperal mastitis, particularly for treatment and prognosis. However, evaluator variability and the subjective nature of assessments highlight the need for further optimization of AI tools. 
Future research should emphasize iterative questioning and dynamic updates to AI knowledge bases to enhance reliability and accessibility.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e68980"},"PeriodicalIF":3.8,"publicationDate":"2025-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12288767/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144710013","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
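As a point of reference for the readability metric named in the Methods above, here is a small sketch of the standard Flesch-Kincaid grade-level formula; the word, sentence, and syllable counts are invented example values, and this does not reproduce the study's Turkish-language scoring.

```python
# Standard Flesch-Kincaid grade-level formula (illustrative values only).
def flesch_kincaid_grade(total_words: int, total_sentences: int, total_syllables: int) -> float:
    """0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59"""
    return (0.39 * (total_words / total_sentences)
            + 11.8 * (total_syllables / total_words)
            - 15.59)

# Example: a 120-word response spread over 8 sentences containing 180 syllables.
print(round(flesch_kincaid_grade(120, 8, 180), 1))
```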
{"title":"Autoencoder-Based Representation Learning for Similar Patients Retrieval From Electronic Health Records: Comparative Study.","authors":"Deyi Li, Aditi Shukla, Sravani Chandaka, Bradley Taylor, Jie Xu, Mei Liu","doi":"10.2196/68830","DOIUrl":"10.2196/68830","url":null,"abstract":"<p><strong>Background: </strong>By analyzing electronic health record snapshots of similar patients, physicians can proactively predict disease onsets, customize treatment plans, and anticipate patient-specific trajectories. However, the modeling of electronic health record data is inherently challenging due to its high dimensionality, mixed feature types, noise, bias, and sparsity. Patient representation learning using autoencoders (AEs) presents promising opportunities to address these challenges. A critical question remains: how do different AE designs and distance measures impact the quality of retrieved similar patient cohorts?</p><p><strong>Objective: </strong>This study aims to evaluate the performance of 5 common AE variants-vanilla autoencoder, denoising autoencoder, contractive autoencoder, sparse autoencoder, and robust autoencoder-in retrieving similar patients. Additionally, it investigates the impact of different distance measures and hyperparameter configurations on model performance.</p><p><strong>Methods: </strong>We tested the 5 AE variants on 2 real-world datasets-the University of Kansas Medical Center (n=13,752) and the Medical College of Wisconsin (n=9568)-across 168 different hyperparameter configurations. To retrieve similar patients based on the AE-produced latent representations, we applied k-nearest neighbors (k-NN) using Euclidean and Mahalanobis distances. Two prediction targets were evaluated: acute kidney injury onset and postdischarge 1-year mortality.</p><p><strong>Results: </strong>Our findings demonstrate that (1) denoising autoencoders outperformed other AE variants when paired with Euclidean distance (P<.001), followed by vanilla autoencoders and contractive autoencoders; (2) learning rates significantly influenced the performance of AE variants; and (3) Mahalanobis distance-based k-NN frequently outperformed Euclidean distance-based k-NN when applied to latent representations. However, whether AE models are superior in transforming raw data into latent representations, compared with applying Mahalanobis distance-based k-NN directly to raw data, appears to be data-dependent.</p><p><strong>Conclusions: </strong>This study provides a comprehensive analysis of the performance of different AE variants in retrieving similar patients and evaluates the impact of various hyperparameter configurations on model performance. The findings lay the groundwork for future development of AE-based patient similarity estimation and personalized medicine.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e68830"},"PeriodicalIF":3.8,"publicationDate":"2025-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12289314/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144710011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving Large Language Models' Summarization Accuracy by Adding Highlights to Discharge Notes: Comparative Evaluation.","authors":"Mahshad Koohi Habibi Dehkordi, Yehoshua Perl, Fadi P Deek, Zhe He, Vipina K Keloth, Hao Liu, Gai Elhanan, Andrew J Einstein","doi":"10.2196/66476","DOIUrl":"10.2196/66476","url":null,"abstract":"<p><strong>Background: </strong>The American Medical Association recommends that electronic health record (EHR) notes, often dense and written in nuanced language, be made readable for patients and laypeople, a practice we refer to as the simplification of discharge notes. Our approach to achieving the simplification of discharge notes involves a process of incremental simplification steps to achieve the ideal note. In this paper, we present the first step of this process. Large language models (LLMs) have demonstrated considerable success in text summarization. Such LLM summaries represent the content of EHR notes in an easier-to-read language. However, LLM summaries can also introduce inaccuracies.</p><p><strong>Objective: </strong>This study aims to test the hypothesis that summaries generated by LLMs from highlighted discharge notes will achieve increased accuracy compared to those generated from the original notes. For this purpose, we aim to prove a hypothesis that summaries generated by LLMs of discharge notes in which detailed information is highlighted are likely to be more accurate than summaries of the original notes.</p><p><strong>Methods: </strong>To test our hypothesis, we randomly sampled 15 discharge notes from the MIMIC III database and highlighted their detailed information using an interface terminology we previously developed with machine learning. This interface terminology was curated to encompass detailed information from the discharge notes. The highlighted discharge notes distinguished detailed information, specifically the concepts present in the aforementioned interface terminology, by applying a blue background. To calibrate the LLMs' summaries for our simplification goal, we chose GPT-4o and used prompt engineering to ensure high-quality prompts and address issues of output inconsistency and prompt sensitivity. We provided both highlighted and unhighlighted versions of each EHR note along with their corresponding prompts to GPT-4o. Each generated summary was manually evaluated to assess its quality using the following evaluation metrics: completeness, correctness, and structural integrity.</p><p><strong>Results: </strong>We used the study sample of 15 discharge notes. On average, summaries from highlighted notes (H-summaries) achieved 96% completeness, 8% higher than the summaries from unhighlighted notes (U-summaries). H-summaries had higher completeness in 13 notes, and U-summaries had higher or equal completeness in 2 notes, resulting in P=.01, which implied statistical significance. Moreover, H-summaries demonstrated better correctness than U-summaries, with fewer instances of erroneous information (2 vs 3 errors, respectively). The number of improper headers was smaller for H-summaries for 11 notes and U-summaries for 4 notes (P=.03; implying statistical significance). Moreover, we identified 8 instances of misplaced information in the U-summaries and only 2 in the H-summaries. 
We showed that our findings support the hypothesis that summaries generated from highlighted discharge notes are more accurate than summaries generated from the original, unhighlighted notes.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e66476"},"PeriodicalIF":3.8,"publicationDate":"2025-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12332456/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144700460","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
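A rough sketch of the highlight-then-summarize idea described in this abstract: concepts from a terminology are marked in the note text before it is sent to the LLM. The concept list, note text, tag-based markup, and prompt wording below are invented placeholders; the study used its own machine learning-derived interface terminology and a blue-background rendering rather than tags.

```python
# Illustrative sketch only: mark terminology concepts in a discharge note and
# build a summarization prompt that tells the model to preserve them.
import re

terminology_concepts = ["atrial fibrillation", "metoprolol", "echocardiogram"]  # hypothetical
note = "Patient admitted with atrial fibrillation. Started metoprolol. Echocardiogram showed normal EF."

def highlight(text: str, concepts: list[str]) -> str:
    for concept in sorted(concepts, key=len, reverse=True):  # longest match first
        text = re.sub(re.escape(concept), lambda m: f"<mark>{m.group(0)}</mark>", text, flags=re.IGNORECASE)
    return text

highlighted_note = highlight(note, terminology_concepts)
prompt = (
    "Summarize the following discharge note for a layperson. "
    "Text inside <mark> tags is detailed clinical information that must be preserved:\n\n"
    + highlighted_note
)
print(prompt)
```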
{"title":"A Weighted Voting Approach for Traditional Chinese Medicine Formula Classification Using Large Language Models: Algorithm Development and Validation Study.","authors":"Zhe Wang, Keqian Li, Suyuan Peng, Lihong Liu, Xiaolin Yang, Keyu Yao, Heinrich Herre, Yan Zhu","doi":"10.2196/69286","DOIUrl":"10.2196/69286","url":null,"abstract":"<p><strong>Background: </strong>Several clinical cases and experiments have demonstrated the effectiveness of traditional Chinese medicine (TCM) formulas in treating and preventing diseases. These formulas contain critical information about their ingredients, efficacy, and indications. Classifying TCM formulas based on this information can effectively standardize TCM formulas management, support clinical and research applications, and promote the modernization and scientific use of TCM. To further advance this task, TCM formulas can be classified using various approaches, including manual classification, machine learning, and deep learning. Additionally, large language models (LLMs) are gaining prominence in the biomedical field. Integrating LLMs into TCM research could significantly enhance and accelerate the discovery of TCM knowledge by leveraging their advanced linguistic understanding and contextual reasoning capabilities.</p><p><strong>Objective: </strong>The objective of this study is to evaluate the performance of different LLMs in the TCM formula classification task. Additionally, by employing ensemble learning with multiple fine-tuned LLMs, this study aims to enhance classification accuracy.</p><p><strong>Methods: </strong>The data for the TCM formula were manually refined and cleaned. We selected 10 LLMs that support Chinese for fine-tuning. We then employed an ensemble learning approach that combined the predictions of multiple models using both hard and weighted voting, with weights determined by the average accuracy of each model. Finally, we selected the top 5 most effective models from each series of LLMs for weighted voting (top 5) and the top 3 most accurate models of 10 for weighted voting (top 3).</p><p><strong>Results: </strong>A total of 2441 TCM formulas were curated manually from multiple sources, including the Coding Rules for Chinese Medicinal Formulas and Their Codes, the Chinese National Medical Insurance Catalog for proprietary Chinese medicines, textbooks of TCM formulas, and TCM literature. The dataset was divided into a training set of 1999 TCM formulas and test set of 442 TCM formulas. The testing results showed that Qwen-14B achieved the highest accuracy of 75.32% among the single models. The accuracy rates for hard voting, weighted voting, weighted voting (top 5), and weighted voting (top 3) were 75.79%, 76.47%, 75.57%, and 77.15%, respectively.</p><p><strong>Conclusions: </strong>This study aims to explore the effectiveness of LLMs in the TCM formula classification task. To this end, we propose an ensemble learning method that integrates multiple fine-tuned LLMs through a voting mechanism. 
This method not only improves classification accuracy but also enhances the existing system for classifying the efficacy of TCM formulas.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e69286"},"PeriodicalIF":3.8,"publicationDate":"2025-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12292024/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144710010","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
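The hard- and weighted-voting schemes described in this abstract can be sketched as follows. Model names, accuracies, and predicted classes are placeholders; the weights simply equal each model's average accuracy, as stated in the Methods.

```python
# Illustrative sketch of hard voting vs accuracy-weighted voting across models.
from collections import defaultdict

# Hypothetical validation accuracies of fine-tuned LLMs
model_accuracy = {"llm_a": 0.75, "llm_b": 0.73, "llm_c": 0.70}

# Hypothetical class predictions for one TCM formula
predictions = {"llm_a": "tonifying", "llm_b": "heat-clearing", "llm_c": "tonifying"}

def hard_vote(preds: dict) -> str:
    counts = defaultdict(int)
    for label in preds.values():
        counts[label] += 1
    return max(counts, key=counts.get)

def weighted_vote(preds: dict, weights: dict) -> str:
    scores = defaultdict(float)
    for model, label in preds.items():
        scores[label] += weights[model]  # weight = that model's average accuracy
    return max(scores, key=scores.get)

print(hard_vote(predictions))                      # majority class
print(weighted_vote(predictions, model_accuracy))  # accuracy-weighted class
```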
{"title":"Comparative Analysis of Generative Artificial Intelligence Systems in Solving Clinical Pharmacy Problems: Mixed Methods Study.","authors":"Lulu Li, Pengqiang Du, Xiaojing Huang, Hongwei Zhao, Ming Ni, Meng Yan, Aifeng Wang","doi":"10.2196/76128","DOIUrl":"10.2196/76128","url":null,"abstract":"<p><strong>Background: </strong>Generative artificial intelligence (AI) systems are increasingly deployed in clinical pharmacy; yet, systematic evaluation of their efficacy, limitations, and risks across diverse practice scenarios remains limited.</p><p><strong>Objective: </strong>This study aims to quantitatively evaluate and compare the performance of 8 mainstream generative AI systems across 4 core clinical pharmacy scenarios-medication consultation, medication education, prescription review, and case analysis with pharmaceutical care-using a multidimensional framework.</p><p><strong>Methods: </strong>Forty-eight clinically validated questions were selected via stratified sampling from real-world sources (eg, hospital consultations, clinical case banks, and national pharmacist training databases). Three researchers simultaneously tested 8 different generative AI systems (ERNIE Bot, Doubao, Kimi, Qwen, GPT-4o, Gemini-1.5-Pro, Claude-3.5-Sonnet, and DeepSeek-R1) using standardized prompts within a single day (February 20, 2025). A double-blind scoring design was used, with 6 experienced clinical pharmacists (≥5 years experience) evaluating the AI responses across 6 dimensions: accuracy, rigor, applicability, logical coherence, conciseness, and universality, scored 0-10 per predefined criteria (eg, -3 for inaccuracy and -2 for incomplete rigor). Statistical analysis used one-way ANOVA with Tukey Honestly Significant Difference (HSD) post hoc testing and intraclass correlation coefficients (ICC) for interrater reliability (2-way random model). Qualitative thematic analysis identified recurrent errors and limitations.</p><p><strong>Results: </strong>DeepSeek-R1 (DeepSeek) achieved the highest overall performance (mean composite score: medication consultation 9.4, SD 1.0; case analysis 9.3, SD 1.0), significantly outperforming others in complex tasks (P<.05). Critical limitations were observed across models, including high-risk decision errors-75% omitted critical contraindications (eg, ethambutol in optic neuritis) and a lack of localization-90% erroneously recommended macrolides for drug-resistant Mycoplasma pneumoniae (China's high-resistance setting), while only DeepSeek-R1 aligned with updated American Academy of Pediatrics (AAP) guidelines for pediatric doxycycline. Complex reasoning deficits: only Claude-3.5-Sonnet detected a gender-diagnosis contradiction (prostatic hyperplasia in female); no model identified diazepam's 7-day prescription limit. Interrater consistency was lowest for conciseness in case analysis (ICC=0.70), reflecting evaluator disagreement on complex outputs. 
ERNIE Bot (Baidu) consistently underperformed (case analysis: 6.8, SD 1.5; P<.001 vs DeepSeek-R1).</p><p><strong>Conclusions: </strong>While generative AI shows promise as a pharmacist assistance tool, significant limitations, including high-risk errors (eg, contraindication omissions), inadequate localization, and complex reasoning gaps, preclude autonomous clinical decision-making.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e76128"},"PeriodicalIF":3.8,"publicationDate":"2025-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12288765/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144710012","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
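For the statistical comparison named in this study's Methods (one-way ANOVA followed by Tukey HSD post hoc testing), a minimal sketch on synthetic scores might look like the following; the data are random placeholders, not the study's ratings, and the intraclass correlation step is omitted.

```python
# Illustrative sketch: one-way ANOVA across AI systems, then Tukey HSD post hoc.
import numpy as np
from scipy.stats import f_oneway
from statsmodels.stats.multicomp import pairwise_tukeyhsd

rng = np.random.default_rng(42)
scores = {
    "deepseek_r1": rng.normal(9.3, 0.8, 48),  # hypothetical composite scores per question
    "gpt_4o": rng.normal(8.7, 1.0, 48),
    "ernie_bot": rng.normal(6.8, 1.5, 48),
}

f_stat, p_value = f_oneway(*scores.values())
print(f"one-way ANOVA: F={f_stat:.2f}, P={p_value:.3g}")

values = np.concatenate(list(scores.values()))
groups = np.repeat(list(scores.keys()), [len(v) for v in scores.values()])
print(pairwise_tukeyhsd(values, groups))  # pairwise post hoc comparisons
```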
{"title":"Impact of Diagnostic Stewardship on Urine Culture Ordering in Saudi Arabia: Prospective Pre- and Postintervention Study.","authors":"Ahlam Alghamdi, Afrah Alkazemi, Alnada Ibrahim, Mohammed Alraey, Mohammed Alaboud, Isra Farooqi, Mohammad Aatif Khan, Asem Allam, Mohammed Alwadai, Renad Alyahya, Ohoud Alzahrani, Hajar Y AlQahtani, Amir Mohareb, Muneerah Aleissa","doi":"10.2196/68044","DOIUrl":"10.2196/68044","url":null,"abstract":"<p><strong>Background: </strong>Inappropriate testing of urine cultures can lead to overuse of antibiotics, antimicrobial resistance, Clostridioides difficile infections, and increased cost. In Saudi Arabia, antimicrobial stewardship programs have improved antibiotic use but lack focus on asymptomatic bacteriuria. Targeted interventions are needed to address this gap.</p><p><strong>Objective: </strong>We assessed the implementation of a clinical decision support (CDS) tool in diagnostic stewardship, focusing on the appropriateness of urine culture orders and antibiotic use.</p><p><strong>Methods: </strong>We examined differences in urine culture testing and antibiotic use before and after implementation of a CDS tool in a 400-bed hospital in Riyadh, Saudi Arabia, from August 2021 to July 2022. We included adult patients with urine culture orders. Our outcomes were the percentage of urine cultures ordered that were inappropriate and antibiotic use after the implementation of the CDS intervention. We used a multivariable logistic regression model to determine factors associated with inappropriate urine culture testing and antibiotic use.</p><p><strong>Results: </strong>The percentage of inappropriate urine culture orders were significantly lower in the postintervention period compared to the preintervention period (821/2254, 36.4% vs 754/1814, 41.6%; P=.001). The CDS intervention was associated with 16.7% lower odds of inappropriate urine culture ordering (adjusted odds ratio [aOR] 0.83, 95% CI 0.73-0.95; P=.008). Unnecessary antibiotics were significantly lower in the postintervention period (310/2254, 72.9% vs 288/1814, 85.7%; P<.001). The CDS intervention was associated with a 52% reduction in unnecessary antibiotic use (aOR 0.487, 95% CL 0.332-0.713; P<.001).</p><p><strong>Conclusions: </strong>A CDS initiative can reduce unnecessary urine culture testing and antibiotic overuse.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e68044"},"PeriodicalIF":3.8,"publicationDate":"2025-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12309619/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144700459","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Clinical Performance and Communication Skills of ChatGPT Versus Physicians in Emergency Medicine: Simulated Patient Study.","authors":"ChulHyoung Park, Min Ho An, Gyubeom Hwang, Rae Woong Park, Juho An","doi":"10.2196/68409","DOIUrl":"10.2196/68409","url":null,"abstract":"<p><strong>Background: </strong>Emergency medicine can benefit from artificial intelligence (AI) due to its unique challenges, such as high patient volume and the need for urgent interventions. However, it remains difficult to assess the applicability of AI systems to real-world emergency medicine practice, which requires not only medical knowledge but also adaptable problem-solving and effective communication skills.</p><p><strong>Objective: </strong>We aimed to evaluate ChatGPT's (OpenAI) performance in comparison to human doctors in simulated emergency medicine settings, using the framework of clinical performance examination and written examinations.</p><p><strong>Methods: </strong>In total, 12 human doctors were recruited to represent the medical professionals. Both ChatGPT and the human doctors were instructed to manage each case like real clinical settings with 12 simulated patients. After the clinical performance examination sessions, the conversation records were evaluated by an emergency medicine professor on history taking, clinical accuracy, and empathy on a 5-point Likert scale. Simulated patients completed a 5-point scale survey including overall comprehensibility, credibility, and concern reduction for each case. In addition, they evaluated whether the doctor they interacted with was similar to a human doctor. An additional evaluation was performed using vignette-based written examinations to assess diagnosis, investigation, and treatment planning. The mean scores from ChatGPT were then compared with those of the human doctors.</p><p><strong>Results: </strong>ChatGPT scored significantly higher than the physicians in both history-taking (mean score 3.91, SD 0.67 vs mean score 2.67, SD 0.78, P<.001) and empathy (mean score 4.50, SD 0.67 vs mean score 1.75, SD 0.62, P<.001). However, there was no significant difference in clinical accuracy. In the survey conducted with simulated patients, ChatGPT scored higher for concern reduction (mean score 4.33, SD 0.78 vs mean score 3.58, SD 0.90, P=.04). For comprehensibility and credibility, ChatGPT showed better performance, but the difference was not significant. In the similarity assessment score, no significant difference was observed (mean score 3.50, SD 1.78 vs mean score 3.25, SD 1.86, P=.71).</p><p><strong>Conclusions: </strong>ChatGPT's performance highlights its potential as a valuable adjunct in emergency medicine, demonstrating comparable proficiency in knowledge application, efficiency, and empathetic patient interaction. 
These results suggest that a collaborative health care model, integrating AI with human expertise, could enhance patient care and outcomes.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e68409"},"PeriodicalIF":3.1,"publicationDate":"2025-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12289221/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144661131","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detection and Analysis of Circadian Biomarkers for Metabolic Syndrome Using Wearable Data: Cross-Sectional Study.","authors":"Jeong-Kyun Kim, Sujeong Mun, Siwoo Lee","doi":"10.2196/69328","DOIUrl":"10.2196/69328","url":null,"abstract":"<p><strong>Background: </strong>Wearable devices are increasingly used for monitoring health and detecting digital biomarkers related to chronic diseases such as metabolic syndrome (MetS). Although circadian rhythm disturbances are known to contribute to MetS, few studies have explored wearable-derived circadian biomarkers for MetS identification.</p><p><strong>Objective: </strong>This study aimed to detect and analyze sleep and circadian rhythm biomarkers associated with MetS using step count and heart rate data from wearable devices and to identify the key biomarkers using explainable artificial intelligence (XAI).</p><p><strong>Methods: </strong>Data were analyzed from 272 participants in the Korean Medicine Daejeon Citizen Cohort, collected between 2020 and 2023, including 88 participants with MetS and 184 without any MetS diagnostic criteria. Participants wore Fitbit Versa or Inspire 2 devices for at least 5 weekdays, providing minute-level heart rate, step count, and sleep data. A total of 26 indicators were derived, including sleep markers (midsleep time and total sleep time) and circadian rhythm markers (midline estimating statistic of rhythm, amplitude, interdaily stability, and relative amplitude). In addition, a novel circadian rhythm marker, continuous wavelet circadian rhythm energy (CCE), was proposed using continuous wavelet transform of heart rate signals. Statistical tests (t test and the Wilcoxon rank sum test) and machine learning models-Shapley Additive Explanations, explainable boosting machine, and tabular neural network-were applied to evaluate marker significance and importance.</p><p><strong>Results: </strong>Circadian rhythm markers, especially heart rate-based indicators, showed stronger associations with MetS than sleep markers. The newly proposed CCE demonstrated the highest importance for MetS identification across all XAI models, with significantly lower values observed in the MetS group (P<.001). Other heart rate-based markers, including relative amplitude and low activity period, were also identified as important contributors. Although sleep markers did not reach statistical significance, some were recognized as secondary predictors in XAI-based analyses. The CCE marker maintained a high predictive value even when adjusting for age, sex, and BMI.</p><p><strong>Conclusions: </strong>This study identified CCE and relative amplitude of heart rate as key circadian rhythm biomarkers for MetS monitoring, demonstrating their high importance across multiple XAI models. In contrast, traditional sleep markers showed limited significance, suggesting that circadian rhythm analysis may offer additional insights into MetS beyond sleep-related indicators. 
These findings highlight the potential of wearable-based circadian biomarkers for improving MetS assessment and management.</p>","PeriodicalId":56334,"journal":{"name":"JMIR Medical Informatics","volume":"13 ","pages":"e69328"},"PeriodicalIF":3.8,"publicationDate":"2025-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12311872/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144651296","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
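The abstract does not give the exact formula for the proposed continuous wavelet circadian rhythm energy (CCE) marker. The sketch below shows one plausible, assumption-laden way to quantify circadian-band energy from minute-level heart rate with a continuous wavelet transform, purely for illustration; the band limits, wavelet choice, and energy ratio are not the authors' definition.

```python
# Illustrative sketch: circadian-band energy from minute-level heart rate via CWT.
import numpy as np
import pywt

MINUTES_PER_DAY = 1440
days = 5
t = np.arange(days * MINUTES_PER_DAY)

# Synthetic heart rate: 24-hour rhythm plus noise (placeholder data, not cohort data)
rng = np.random.default_rng(7)
heart_rate = 70 + 8 * np.sin(2 * np.pi * t / MINUTES_PER_DAY) + rng.normal(0, 3, t.size)

# CWT with a Morlet wavelet; scales chosen so periods span roughly 2 h to 48 h
sampling_period_min = 1.0
scales = np.arange(100, 3000, 50)
coefs, freqs = pywt.cwt(heart_rate, scales, "morl", sampling_period=sampling_period_min)
periods_h = (1.0 / freqs) / 60.0  # cycles/min -> period in hours

power = np.abs(coefs) ** 2
circadian_band = (periods_h >= 20) & (periods_h <= 28)   # around the 24-h period
cce = power[circadian_band].sum() / power.sum()          # circadian energy fraction

print(f"circadian-band energy fraction: {cce:.3f}")
```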