Echoing Intelligence: Assessing the Capabilities of a Novel Generative AI With a Large Language Model in Cardiology Education

Impact factor: 2.5 | JCR quartile: Q2 | Category: Cardiac & Cardiovascular Systems

CJC Open | Pub Date: 2025-06-01 | DOI: 10.1016/j.cjco.2025.03.013

Muneeb Ahmed, MD; Flora Huang, MD, FRCPC; Chi-Ming Chow, MD, MSc, FRCPC, FCCS, FASE

Source: CJC Open, volume 7, issue 6, pages 795-798. Full text: https://www.sciencedirect.com/science/article/pii/S2589790X25001295

Citations: 0

Abstract



Background

Artificial intelligence (AI) holds promise for enhancing medical education, particularly in complex fields like cardiology. We assessed the ability of a large language model (LLM) to generate and evaluate educational material comparable to that created by human experts.

Methods

We trained an AI model on cardiology-specific content using 80 lectures from the St. Michael’s Hospital Virtual Echo Rounds. The AI generated 10 multiple-choice questions (MCQs), and experienced cardiologists crafted an additional 10 MCQs. Eleven postgraduate year 4-6 cardiology trainees answered all 20 questions and attempted to identify the source (AI or human) of each question. The AI also answered the same set of questions. We compared trainees' scores on the 2 question sets using the Wilcoxon signed-rank test and assessed their ability to recognize each question's source.
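For readers unfamiliar with the Wilcoxon signed-rank test named in the Methods, the following is a minimal sketch of how a paired comparison of per-trainee scores could be computed. It is not the authors' analysis code: the function name `signed_rank_exact` and the two score lists are hypothetical and invented purely for illustration (the abstract reports only medians, not individual scores), and the exact two-sided p-value by sign enumeration is one common variant of the test, assumed here because the sample is small.

```python
from itertools import product


def signed_rank_exact(x, y):
    """Two-sided exact Wilcoxon signed-rank test for paired samples.

    Returns (w_plus, p_value). Zero differences are dropped and tied
    absolute differences share their average rank.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    if not diffs:
        return 0.0, 1.0  # no non-zero differences: nothing to test

    # Rank the absolute differences, giving ties their average rank.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1

    # W+ is the sum of ranks attached to positive differences.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    expected = sum(ranks) / 2
    observed_dev = abs(w_plus - expected)

    # Under the null hypothesis every sign pattern is equally likely,
    # so enumerate all 2^n patterns (cheap for n around 11).
    extreme = 0
    total = 0
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        total += 1
        if abs(w - expected) >= observed_dev - 1e-12:
            extreme += 1
    return w_plus, extreme / total


# HYPOTHETICAL scores out of 10 for 11 trainees (not data from the study).
ai_scores = [8, 7, 9, 8, 6, 8, 9, 7, 8, 10, 8]
human_scores = [8, 8, 8, 9, 7, 8, 8, 7, 9, 9, 8]

w_plus, p = signed_rank_exact(ai_scores, human_scores)
print(f"W+ = {w_plus}, two-sided exact p = {p:.3f}")
```

Enumerating sign patterns avoids the normal approximation, which can be unreliable with 11 pairs and many tied or zero differences.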

Results

Trainees scored similarly on AI-generated and human-generated questions (median 8/10 vs 8/10; P > 0.05). Their ability to identify the source of questions did not exceed chance levels (median correct identifications: 10/20; P > 0.05). The AI achieved 95% accuracy on AI-generated questions and 100% on human-generated questions.
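The chance-level result can be made concrete with a short calculation. The abstract does not state which test was applied to the source-identification counts, so the sketch below assumes a simple model that is not taken from the paper: each of the 20 guesses is an independent coin flip with probability 0.5, and the one-sided exact binomial tail gives the probability of getting at least a given number correct by guessing alone. The function name `binomial_upper_tail` is hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p=0.5):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_questions = 20  # 10 AI-generated + 10 human-generated MCQs

# 10 is the reported median; 14 and 15 are shown only for comparison.
for correct in (10, 14, 15):
    tail = binomial_upper_tail(correct, n_questions)
    print(f"{correct}/{n_questions} correct: P(X >= {correct}) = {tail:.3f}")
```

Under this assumed model, a single trainee would need at least 15 of 20 correct identifications to beat guessing at a one-sided 0.05 level, which puts the reported median of 10/20 squarely at chance.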

Conclusions

The AI-generated educational content was of comparable quality to that produced by human experts, and trainees could not reliably distinguish between the 2 sources. Our findings suggest that AI could significantly augment cardiology education by providing high-quality, scalable learning resources.
