Raphaël Bentegeac, Nans Florens, Mehdi Maanaoui, Valentin Maisons, Antoine Lanot, Mickaël Bobot, Benoît Brilland, François Glowacki, Erwin Gérard, Marc Hazzan, Philippe Amouyel, Bastien Le Guellec, Aghiles Hamroun
Clinical Kidney Journal, volume 18, issue 10, article sfaf308. Published 2025-10-06 (eCollection 2025-10-01). Publication type: Journal Article. DOI: https://doi.org/10.1093/ckj/sfaf308. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12541372/pdf/
Citations: 0
Abstract
ECOSBot: a multicenter validation pilot study of a generative AI tool for OSCE-based nephrology training.
Background: Developing diagnostic reasoning in nephrology is particularly challenging due to its pathophysiological complexity and reliance on abstract clinical data. Objective Structured Clinical Examinations (OSCEs) are pivotal for nephrology training but remain resource-intensive and difficult to scale. Generative artificial intelligence (AI) offers a promising alternative, yet its capacity to emulate nephrology-specific OSCEs has not been formally assessed.
Methods: We developed ECOSBot, a web-based tool powered by GPT-4o, to simulate both standardized patients and examiners for nephrology-focused OSCEs. In this multicenter prospective study, undergraduate medical students from five French medical schools interacted with ECOSBot across four clinical stations. All interactions were double-rated by nephrology faculty members to establish a gold standard. ECOSBot's performance was evaluated against this standard using four criteria (script coverage, authenticity, correctness and relevance) for patient simulation, and via checklists and competency-based ratings for examiner scoring. Usability was assessed using the Chatbot Usability Questionnaire (CUQ), adapted to include six items on feedback quality.
Results: Ninety-one students generated 2939 prompts across 184 OSCE sessions. ECOSBot demonstrated high fidelity in patient simulation: authenticity 98.6% [95% confidence interval (CI) 98.2-99.0], correctness 98.3% (95% CI 97.9-98.7) and relevance 99.2% (95% CI 98.9-99.5), including during exchanges not explicitly covered by the pre-specified scenario. As an examiner, ECOSBot showed strong agreement with human raters on global scores [intraclass correlation coefficient (ICC) = 0.94, 95% CI 0.91-0.96], consistent across case formats, training levels and institutions. However, scoring of attitude and communication skills was less reliable (ICC = 0.44, 95% CI 0.28-0.58). Median CUQ score was 69.7/100, with 91.7% of students finding the tool highly useful for OSCE preparation in nephrology.
Conclusions: ECOSBot reliably simulated both roles in nephrology OSCEs with high fidelity and strong alignment with expert rating. While challenges remain for subjective skill assessment, this tool offers a scalable and autonomous solution to enhance nephrology education.
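The examiner results above are reported as intraclass correlation coefficients between ECOSBot and human raters. The abstract does not say which ICC form was used, so the following is only a minimal sketch, assuming a two-way random-effects, absolute-agreement, single-rater ICC, often written ICC(2,1). The function name `icc_2_1` and the example scores are hypothetical and are not taken from the study.

```python
import numpy as np


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `scores` is an (n_subjects, k_raters) table, e.g. one row per OSCE
    session and one column per rater (human, AI). This ICC form is an
    assumption; the abstract does not specify the one actually used.
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()

    # Sums of squares for subjects (rows), raters (columns) and residual.
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()
    ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols

    msr = ss_rows / (n - 1)             # between-subjects mean square
    msc = ss_cols / (k - 1)             # between-raters mean square
    mse = ss_err / ((n - 1) * (k - 1))  # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical global scores for five sessions: column 0 = human, column 1 = AI.
example = [[12, 13], [15, 15], [9, 10], [18, 17], [14, 14]]
print(round(icc_2_1(example), 3))
```

Because this form measures absolute agreement, a rater who is systematically one point more generous lowers the coefficient even when the ranking of students is identical, which is one reason it suits a comparison between an automated examiner and a human gold standard.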
About the Journal
Clinical Kidney Journal: Clinical and Translational Nephrology (ckj), an official journal of the ERA-EDTA (European Renal Association-European Dialysis and Transplant Association), is a fully open access, online-only journal published bimonthly. The journal is an essential educational and training resource integrating clinical, translational and educational research into clinical practice. ckj aims to contribute to a translational research culture among nephrologists and kidney pathologists that helps close the gap between basic researchers and practicing clinicians and promotes sorely needed innovation in the field of nephrology. All research articles in this journal have undergone peer review.