DIALOGUE: A Generative AI-Based Pre-Post Simulation Study to Enhance Diagnostic Communication in Medical Students Through Virtual Type 2 Diabetes Scenarios.
Ricardo Xopan Suárez-García, Quetzal Chavez-Castañeda, Rodrigo Orrico-Pérez, Sebastián Valencia-Marin, Ari Evelyn Castañeda-Ramírez, Efrén Quiñones-Lara, Claudio Adrián Ramos-Cortés, Areli Marlene Gaytán-Gómez, Jonathan Cortés-Rodríguez, Jazel Jarquín-Ramírez, Nallely Guadalupe Aguilar-Marchand, Graciela Valdés-Hernández, Tomás Eduardo Campos-Martínez, Alonso Vilches-Flores, Sonia Leon-Cabrera, Adolfo René Méndez-Cruz, Brenda Ofelia Jay-Jímenez, Héctor Iván Saldívar-Cerón
European Journal of Investigation in Health Psychology and Education, vol. 15, no. 8. Published 2025-08-07. DOI: 10.3390/ejihpe15080152 (https://doi.org/10.3390/ejihpe15080152). Impact factor: 2.6; JCR: Q1 (Psychology, Clinical). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12385505/pdf/
Abstract
DIALOGUE (DIagnostic AI Learning through Objective Guided User Experience) is a generative artificial intelligence (GenAI)-based training program designed to enhance diagnostic communication skills in medical students. In this single-arm pre-post study, we evaluated whether DIALOGUE could improve students' ability to disclose a type 2 diabetes mellitus (T2DM) diagnosis with clarity, structure, and empathy. Thirty clinical-phase students completed two pre-test virtual encounters with an AI-simulated patient (ChatGPT, GPT-4o), scored by blinded raters using an eight-domain rubric. Participants then engaged in ten asynchronous GenAI scenarios with automated natural-language feedback. Seven days later, they completed two post-test consultations with human standardized patients, again evaluated with the same rubric. Mean total performance increased by 36.7 points (95% CI: 31.4–42.1; p < 0.001), and the proportion of high-performing students rose from 0% to 70%. Gains were significant across all domains, most notably in opening the encounter, closure, and diabetes-specific explanation. Multiple regression showed that lower baseline empathy (β = −0.41, p = 0.005) and higher digital self-efficacy (β = 0.35, p = 0.016) independently predicted greater improvement; gender had only a marginal effect. Cluster analysis revealed three learner profiles, with the highest-gain group characterized by low empathy and high digital self-efficacy. Inter-rater reliability was excellent (ICC ≈ 0.90). These findings provide empirical evidence that GenAI-mediated training can meaningfully enhance diagnostic communication and may serve as a scalable, individualized adjunct to conventional medical education.
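The reported pre-post gain and its confidence interval can be read with a short worked sketch. The abstract does not say how the interval was computed, so the code below assumes a standard t-based interval on paired differences with n = 30; the function names and the hard-coded critical value (the 97.5th percentile of Student's t with 29 degrees of freedom, about 2.045) are illustrative choices, not the authors' analysis code. Under that assumption, the reported interval implies a standard deviation of individual score changes of roughly 14.3 points.

```python
import math
import statistics

# Assumption: two-sided 95% interval, Student's t with df = n - 1 = 29.
T_CRIT_DF29 = 2.045


def paired_mean_ci(pre, post, t_crit):
    """Return (mean change, lower bound, upper bound) for paired scores."""
    diffs = [after - before for before, after in zip(pre, post)]
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean of the paired differences.
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff, mean_diff - t_crit * se, mean_diff + t_crit * se


def implied_sd_of_change(ci_low, ci_high, n, t_crit):
    """Back out the SD of paired differences implied by a t-based CI."""
    half_width = (ci_high - ci_low) / 2
    return half_width / t_crit * math.sqrt(n)


# Values taken from the abstract: 95% CI 31.4 to 42.1 with 30 students.
sd_change = implied_sd_of_change(31.4, 42.1, 30, T_CRIT_DF29)
print(round(sd_change, 1))  # about 14.3 under the stated assumption
```

Given per-student totals, `paired_mean_ci(pre_scores, post_scores, T_CRIT_DF29)` would return the mean change with its bounds. The individual scores are not given in the abstract, so no such data are reproduced here.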