Comparative Accuracy of Generative Artificial Intelligence Platforms on Predoctoral Pediatric Dentistry Examination.

Pediatric dentistry, 47(2): 79-84. Pub Date: 2025-03-15
Shahbaz Katebzadeh, Kaci Pickett-Nairne, Paloma Reyes Nguyen, Chaitanya Prakash Puranik
{"title":"生成式人工智能平台在儿科牙科博士前检查中的比较准确性。","authors":"Shahbaz Katebzadeh, Kaci Pickett-Nairne, Paloma Reyes Nguyen, Chaitanya Prakash Puranik","doi":"","DOIUrl":null,"url":null,"abstract":"<p><p><b>Purpose:</b> To determine the comparative accuracy of seven generative artificial intelligence (GenAI) platforms in answering multiple-choice questions on a predoctoral pediatric dentistry examination. This study evaluated the impact of question type and GenAI training on accuracy. <b>Methods:</b> In this study, 100 multiple-choice questions were answered by seven GenAIs using a standard prompt. The study included five untrained GenAIs (Llama, Gemini, Copilot, ChatGPT3.5, and ChatGPT4) and two trained GenAIs (ChatGPT3.5 and ChatGPT4). The training of GenAIs was performed using evidence-based data. The questions were categorized as knowledge-based versus critical thinking on 10 subspecialty domains. The GenAIs were asked to select one correct answer from four choices, and only the first generated response was recorded. Data were subjected to statistical analysis (alpha equals 0.05), with a passing score of 75 percent. <b>Results:</b> Trained ChatGPT4 had the highest accuracy score (90 percent), while untrained Copilot had the lowest accuracy score (57 percent). Only three GenAIs received a passing score (trained ChatGPT3.5, untrained and trained ChatGPT4). The average performance of these three GenAIs (87 percent) was comparable to that of dental students (89 percent). There was no difference in the accuracy of GenAI in answering knowledge-based or critical-thinking questions. Similarly, sub-specialty domain types did not impact the accuracy of GenAI. <b>Conclusions:</b> Newer or trained models of generative artificial intelligence have higher accuracy compared to older or untrained models of GenAI. In the future, due to high accuracy, newer or trained models of GenAI can be used as adjuncts in predoctoral pediatric dental education.</p>","PeriodicalId":101357,"journal":{"name":"Pediatric dentistry","volume":"47 2","pages":"79-84"},"PeriodicalIF":0.0000,"publicationDate":"2025-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Comparative Accuracy of Generative Artificial Intelligence Platforms on Predoctoral Pediatric Dentistry Examination.\",\"authors\":\"Shahbaz Katebzadeh, Kaci Pickett-Nairne, Paloma Reyes Nguyen, Chaitanya Prakash Puranik\",\"doi\":\"\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p><b>Purpose:</b> To determine the comparative accuracy of seven generative artificial intelligence (GenAI) platforms in answering multiple-choice questions on a predoctoral pediatric dentistry examination. This study evaluated the impact of question type and GenAI training on accuracy. <b>Methods:</b> In this study, 100 multiple-choice questions were answered by seven GenAIs using a standard prompt. The study included five untrained GenAIs (Llama, Gemini, Copilot, ChatGPT3.5, and ChatGPT4) and two trained GenAIs (ChatGPT3.5 and ChatGPT4). The training of GenAIs was performed using evidence-based data. The questions were categorized as knowledge-based versus critical thinking on 10 subspecialty domains. The GenAIs were asked to select one correct answer from four choices, and only the first generated response was recorded. Data were subjected to statistical analysis (alpha equals 0.05), with a passing score of 75 percent. 
<b>Results:</b> Trained ChatGPT4 had the highest accuracy score (90 percent), while untrained Copilot had the lowest accuracy score (57 percent). Only three GenAIs received a passing score (trained ChatGPT3.5, untrained and trained ChatGPT4). The average performance of these three GenAIs (87 percent) was comparable to that of dental students (89 percent). There was no difference in the accuracy of GenAI in answering knowledge-based or critical-thinking questions. Similarly, sub-specialty domain types did not impact the accuracy of GenAI. <b>Conclusions:</b> Newer or trained models of generative artificial intelligence have higher accuracy compared to older or untrained models of GenAI. In the future, due to high accuracy, newer or trained models of GenAI can be used as adjuncts in predoctoral pediatric dental education.</p>\",\"PeriodicalId\":101357,\"journal\":{\"name\":\"Pediatric dentistry\",\"volume\":\"47 2\",\"pages\":\"79-84\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2025-03-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Pediatric dentistry\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Pediatric dentistry","FirstCategoryId":"1085","ListUrlMain":"","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0

Abstract

Purpose: To determine the comparative accuracy of seven generative artificial intelligence (GenAI) platforms in answering multiple-choice questions on a predoctoral pediatric dentistry examination, and to evaluate the impact of question type and GenAI training on accuracy.

Methods: One hundred multiple-choice questions were answered by seven GenAIs using a standard prompt. The study included five untrained GenAIs (Llama, Gemini, Copilot, ChatGPT3.5, and ChatGPT4) and two trained GenAIs (ChatGPT3.5 and ChatGPT4); training was performed using evidence-based data. The questions, spanning 10 subspecialty domains, were categorized as knowledge-based or critical-thinking. Each GenAI was asked to select one correct answer from four choices, and only the first generated response was recorded. Data were subjected to statistical analysis (alpha = 0.05), with a passing score of 75 percent.

Results: Trained ChatGPT4 had the highest accuracy (90 percent), while untrained Copilot had the lowest (57 percent). Only three GenAIs received a passing score: trained ChatGPT3.5 and both untrained and trained ChatGPT4. The average performance of these three GenAIs (87 percent) was comparable to that of dental students (89 percent). There was no difference in GenAI accuracy between knowledge-based and critical-thinking questions, and subspecialty domain did not affect accuracy.

Conclusions: Newer or trained GenAI models achieve higher accuracy than older or untrained GenAI models. Given this high accuracy, newer or trained GenAI models could be used as adjuncts in predoctoral pediatric dental education.
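To make the scoring protocol in Methods concrete, the sketch below shows one way a first-response-only, single-best-answer evaluation could be implemented. The abstract does not publish the prompt wording, the question bank, or any code, so the Question fields and the ask_model helper here are illustrative assumptions, not the authors' method.

```python
# Minimal sketch of the Methods protocol: 100 four-option MCQs, one
# recorded first response per question, accuracy vs. a 75 percent pass mark.
from dataclasses import dataclass

PASSING_SCORE = 0.75  # 75 percent, per the abstract


@dataclass
class Question:
    stem: str
    choices: dict[str, str]  # keys "A".."D" mapped to option text
    answer: str              # correct key, e.g. "C"
    domain: str              # one of the 10 subspecialty domains
    kind: str                # "knowledge" or "critical"


def ask_model(platform: str, prompt: str) -> str:
    """Hypothetical stand-in for a per-platform API call (not specified in
    the abstract). Replace with a real client; this stub always answers A."""
    return "A"


def score_platform(platform: str, questions: list[Question]) -> float:
    """Return a platform's accuracy, recording only the first response."""
    correct = 0
    for q in questions:
        prompt = (
            f"{q.stem}\n"
            + "\n".join(f"{k}. {v}" for k, v in sorted(q.choices.items()))
            + "\nSelect the single best answer (A-D)."
        )
        first_response = ask_model(platform, prompt)  # no retries, no reprompting
        if first_response.strip().upper() == q.answer:
            correct += 1
    return correct / len(questions)


# Usage, assuming a hypothetical bank of 100 questions:
# accuracy = score_platform("trained-ChatGPT4", question_bank)
# passed = accuracy >= PASSING_SCORE
```

Recording only the first generated response, as the study does, avoids inflating scores through retries, so a harness like this should deliberately omit any reprompting loop.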
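The abstract states only that data were analyzed at alpha = 0.05 and does not name the test used. One common choice for the reported comparison (knowledge-based versus critical-thinking accuracy) is a chi-square test of independence on correct/incorrect counts; the sketch below illustrates that choice with invented counts, not the study's data.

```python
# Illustrative comparison of accuracy by question type at alpha = 0.05.
# The specific test and all counts here are assumptions for demonstration.
from scipy.stats import chi2_contingency

ALPHA = 0.05

# Rows: knowledge-based, critical-thinking; columns: correct, incorrect.
# Hypothetical counts for one platform on a 100-question bank.
table = [
    [52, 8],  # knowledge-based: 52 of 60 correct
    [34, 6],  # critical-thinking: 34 of 40 correct
]

chi2, p_value, dof, expected = chi2_contingency(table)
if p_value >= ALPHA:
    print(f"p = {p_value:.3f}: no evidence accuracy differs by question type")
else:
    print(f"p = {p_value:.3f}: accuracy differs by question type")
```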
