Evan A. Patel, Pierce T. Herrmann, Lindsay Fleischer, Peter Filip, Stephanie Joe, Rijul S. Kshirsagar, Edward C. Kuan, Peter Papagiannopoulos, Zara M. Patel, Sanjeet Rangarajan, Pete S. Batra, Bobby A. Tajudeen
American Journal of Otolaryngology, Volume 46, Issue 5, Article 104693. DOI: 10.1016/j.amjoto.2025.104693. Published 2025-06-20.
Comparative analysis of AI-generated study guides in otolaryngology education
Introduction
Resident physicians training in otolaryngology frequently rely on dense traditional textbooks such as “Cummings Otolaryngology: Head and Neck Surgery,” widely regarded as the gold standard for educational content in the field. However, integrating artificial intelligence (AI) into teaching offers potential enhancements to traditional trainee learning. This study evaluates the accuracy, relevancy, and clarity of AI-generated study guides for otolaryngology residents and their efficacy in graduate-level education.
Methods
Study guides for four rhinology chapters of “Cummings Otolaryngology: Head and Neck Surgery” were generated using ChatGPT-4 by a non-expert in otolaryngology to ensure replicability. Multiple board-certified rhinologists evaluated the study guides with a structured assessment form, rating each guide on accuracy, relevancy, and clarity using a 4-point scale. The item-level content validity index (I-CVI) was calculated for each parameter.
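The I-CVI for a given parameter is conventionally the proportion of expert raters who score an item 3 or 4 on a 4-point scale. A minimal sketch of that standard calculation follows; the rater scores in the example are illustrative, not the study's data:

```python
def i_cvi(ratings, relevant_threshold=3):
    """Item-level content validity index: the fraction of raters who
    scored the item at or above the threshold (3 or 4 on a 4-point scale)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    relevant = sum(1 for r in ratings if r >= relevant_threshold)
    return relevant / len(ratings)

# Hypothetical example: five raters score one study-guide item for accuracy.
# Four of the five raters scored >= 3, so the I-CVI is 4/5 = 0.8.
print(i_cvi([4, 3, 4, 2, 4]))
```

With this definition, an I-CVI of 1.0 means every rater judged the item accurate (or relevant, or clear), matching the 0.8–1.0 range reported in the Results.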
Results
The mean scores for accuracy, relevancy, and clarity across all chapters were 3.45 ± 0.19, 3.64 ± 0.17, and 3.36 ± 0.08, respectively. I-CVI scores for accuracy, relevancy, and clarity ranged from 0.8 to 1.0, indicating acceptable content validity. Reviewers praised the guides' comprehensiveness and clear formatting, although they suggested incorporating more detailed explanations and visual aids.
Discussion
The findings demonstrate the potential of large language models (LLMs) to generate high-quality educational content. AI-generated resources can reduce the burden on educators and provide tailored materials for residents. LLMs such as OpenAI's GPT-4 create new opportunities for personalized learning experiences for graduate-level trainees, and future research should explore refined AI models and multimodal inputs to enhance educational outcomes.
Level of evidence
Level 5
Journal overview
Be fully informed about developments in otology, neurotology, audiology, rhinology, allergy, laryngology, speech science, bronchoesophagology, facial plastic surgery, and head and neck surgery. Featured sections include original contributions, grand rounds, current reviews, case reports and socioeconomics.