2023年12月用英语和韩语查询评估生成式人工智能回答医学节肢动物学学习目标的信息量、准确性和相关性：一项描述性研究。

IF 9.3 Q1 EDUCATION, SCIENTIFIC DISCIPLINES

Journal of Educational Evaluation for Health Professions Pub Date : 2023-01-01 Epub Date: 2023-12-28 DOI:10.3352/jeehp.2023.20.39

Hyunju Lee, Soobin Park

{"title":"2023年12月用英语和韩语查询评估生成式人工智能回答医学节肢动物学学习目标的信息量、准确性和相关性：一项描述性研究。","authors":"Hyunju Lee, Soobin Park","doi":"10.3352/jeehp.2023.20.39","DOIUrl":null,"url":null,"abstract":"Purpose: This study assessed the performance of 6 generative artificial intelligence (AI) platforms on the learning objectives of medical arthropodology in a parasitology class in Korea. We examined the AI platforms’ performance by querying in Korean and English to determine their information amount, accuracy, and relevance in prompts in both languages.Methods: From December 15 to 17, 2023, 6 generative AI platforms—Bard, Bing, Claude, Clova X, GPT-4, and Wrtn—were tested on 7 medical arthropodology learning objectives in English and Korean. Clova X and Wrtn are platforms from Korean companies. Responses were evaluated using specific criteria for the English and Korean queries.Results: Bard had abundant information but was fourth in accuracy and relevance. GPT-4, with high information content, ranked first in accuracy and relevance. Clova X was 4th in amount but 2nd in accuracy and relevance. Bing provided less information, with moderate accuracy and relevance. Wrtn’s answers were short, with average accuracy and relevance. Claude AI had reasonable information, but lower accuracy and relevance. The responses in English were superior in all aspects. Clova X was notably optimized for Korean, leading in relevance.Conclusion: In a study of 6 generative AI platforms applied to medical arthropodology, GPT-4 excelled overall, while Clova X, a Korea-based AI product, achieved 100% relevance in Korean queries, the highest among its peers. Utilizing these AI platforms in classrooms improved the authors’ self-efficacy and interest in the subject, offering a positive experience of interacting with generative AI platforms to question and receive information.","PeriodicalId":46098,"journal":{"name":"Journal of Educational Evaluation for Health Professions","volume":"20 ","pages":"39"},"PeriodicalIF":9.3000,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11893187/pdf/","citationCount":"0","resultStr":"{\"title\":\"Information amount, accuracy, and relevance of generative artificial intelligence platforms’ answers regarding learning objectives of medical arthropodology evaluated in English and Korean queries in December 2023: a descriptive study\",\"authors\":\"Hyunju Lee, Soobin Park\",\"doi\":\"10.3352/jeehp.2023.20.39\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Purpose: This study assessed the performance of 6 generative artificial intelligence (AI) platforms on the learning objectives of medical arthropodology in a parasitology class in Korea. We examined the AI platforms’ performance by querying in Korean and English to determine their information amount, accuracy, and relevance in prompts in both languages.Methods: From December 15 to 17, 2023, 6 generative AI platforms—Bard, Bing, Claude, Clova X, GPT-4, and Wrtn—were tested on 7 medical arthropodology learning objectives in English and Korean. Clova X and Wrtn are platforms from Korean companies. Responses were evaluated using specific criteria for the English and Korean queries.Results: Bard had abundant information but was fourth in accuracy and relevance. GPT-4, with high information content, ranked first in accuracy and relevance. Clova X was 4th in amount but 2nd in accuracy and relevance. Bing provided less information, with moderate accuracy and relevance. Wrtn’s answers were short, with average accuracy and relevance. Claude AI had reasonable information, but lower accuracy and relevance. The responses in English were superior in all aspects. Clova X was notably optimized for Korean, leading in relevance.Conclusion: In a study of 6 generative AI platforms applied to medical arthropodology, GPT-4 excelled overall, while Clova X, a Korea-based AI product, achieved 100% relevance in Korean queries, the highest among its peers. Utilizing these AI platforms in classrooms improved the authors’ self-efficacy and interest in the subject, offering a positive experience of interacting with generative AI platforms to question and receive information.\",\"PeriodicalId\":46098,\"journal\":{\"name\":\"Journal of Educational Evaluation for Health Professions\",\"volume\":\"20 \",\"pages\":\"39\"},\"PeriodicalIF\":9.3000,\"publicationDate\":\"2023-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11893187/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Educational Evaluation for Health Professions\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3352/jeehp.2023.20.39\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2023/12/28 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"Q1\",\"JCRName\":\"EDUCATION, SCIENTIFIC DISCIPLINES\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Educational Evaluation for Health Professions","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3352/jeehp.2023.20.39","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2023/12/28 0:00:00","PubModel":"Epub","JCR":"Q1","JCRName":"EDUCATION, SCIENTIFIC DISCIPLINES","Score":null,"Total":0}

引用次数: 0

摘要

目的：本研究旨在评估6种生成式人工智能（AIs）在韩国寄生虫学课堂上对医学节肢动物学学习目标的表现。我们通过韩语和英语的查询来考察人工智能的表现，以确定其在两种语言提示下的信息量、准确性和相关性：方法：2023 年 12 月 15 日至 17 日，6 个生成式人工智能（包括 Bard、Bing、Claude、Clova X、GPT-4 和 Wrtn）针对 7 个医学节肢动物学学习目标用英语和韩语进行了测试。Clova X 和 Wrtn 是韩国公司的平台。结果：结果：Bard 信息丰富，但在准确性和相关性方面排名第四。GPT-4 信息量大，在准确性和相关性方面排名第一。Clova X 信息量排名第四，但准确性和相关性排名第二。Bing 提供的信息量较少，准确性和相关性适中。Wrtn 的答案数据不足，准确性和相关性一般。Claude AI 提供了合理的信息，但准确性和相关性较低。英文版的回答在各方面都更胜一筹。Clova X 针对韩语进行了显著优化，在相关性方面遥遥领先：结论：在一项针对 6 个应用于医学节肢动物学的生成式人工智能的研究中，GPT-4 总体表现优异，而 Clova X（一个基于韩国的人工智能）在韩语查询中的相关性达到了 100%，是同类产品中最高的。在课堂上使用这些人工智能提高了作者的自我效能感和对该学科的兴趣，提供了与生成式人工智能互动提问和接收信息的积极体验。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Information amount, accuracy, and relevance of generative artificial intelligence platforms’ answers regarding learning objectives of medical arthropodology evaluated in English and Korean queries in December 2023: a descriptive study

Purpose: This study assessed the performance of 6 generative artificial intelligence (AI) platforms on the learning objectives of medical arthropodology in a parasitology class in Korea. We examined the AI platforms’ performance by querying in Korean and English to determine their information amount, accuracy, and relevance in prompts in both languages.

Methods: From December 15 to 17, 2023, 6 generative AI platforms—Bard, Bing, Claude, Clova X, GPT-4, and Wrtn—were tested on 7 medical arthropodology learning objectives in English and Korean. Clova X and Wrtn are platforms from Korean companies. Responses were evaluated using specific criteria for the English and Korean queries.

Results: Bard had abundant information but was fourth in accuracy and relevance. GPT-4, with high information content, ranked first in accuracy and relevance. Clova X was 4th in amount but 2nd in accuracy and relevance. Bing provided less information, with moderate accuracy and relevance. Wrtn’s answers were short, with average accuracy and relevance. Claude AI had reasonable information, but lower accuracy and relevance. The responses in English were superior in all aspects. Clova X was notably optimized for Korean, leading in relevance.

Conclusion: In a study of 6 generative AI platforms applied to medical arthropodology, GPT-4 excelled overall, while Clova X, a Korea-based AI product, achieved 100% relevance in Korean queries, the highest among its peers. Utilizing these AI platforms in classrooms improved the authors’ self-efficacy and interest in the subject, offering a positive experience of interacting with generative AI platforms to question and receive information.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Journal of Educational Evaluation for Health Professions EDUCATION, SCIENTIFIC DISCIPLINES-

CiteScore

9.60

自引率

9.10%

发文量

审稿时长

5 weeks

期刊介绍： Journal of Educational Evaluation for Health Professions aims to provide readers the state-of-the art practical information on the educational evaluation for health professions so that to increase the quality of undergraduate, graduate, and continuing education. It is specialized in educational evaluation including adoption of measurement theory to medical health education, promotion of high stakes examination such as national licensing examinations, improvement of nationwide or international programs of education, computer-based testing, computerized adaptive testing, and medical health regulatory bodies. Its field comprises a variety of professions that address public medical health as following but not limited to: Care workers Dental hygienists Dental technicians Dentists Dietitians Emergency medical technicians Health educators Medical record technicians Medical technologists Midwives Nurses Nursing aides Occupational therapists Opticians Oriental medical doctors Oriental medicine dispensers Oriental pharmacists Pharmacists Physical therapists Physicians Prosthetists and Orthotists Radiological technologists Rehabilitation counselor Sanitary technicians Speech-language therapists.