Comparative Analysis of Large Language Models for Answering Cancer-Related Questions in Korean.

IF 2.6 | CAS Tier 4 (Medicine) | JCR Q1, MEDICINE, GENERAL & INTERNAL
Hyun Chang, Jin-Woo Jung, Yongho Kim
Yonsei Medical Journal, Vol. 66, No. 7, pp. 405-411. Published 2025-07-01 (Journal Article). DOI: 10.3349/ymj.2024.0200. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12206594/pdf/
Citations: 0

Abstract

Comparative Analysis of Large Language Models for Answering Cancer-Related Questions in Korean.

Purpose: Large language models (LLMs) have shown potential in medicine, transforming patient education, clinical decision support, and medical research. However, the effectiveness of LLMs in providing accurate medical information, particularly in non-English languages, remains underexplored. This study aimed to compare the quality of responses generated by ChatGPT and Naver's CLOVA X to cancer-related questions posed in Korean.

Materials and methods: The study involved selecting cancer-related questions from the National Cancer Institute and Korean National Cancer Information Center websites. Responses were generated using ChatGPT and CLOVA X, and three oncologists assessed their quality using the Global Quality Score (GQS). The readability of the responses generated by ChatGPT and CLOVA X was calculated using KReaD, an artificial intelligence-based tool designed to objectively assess the complexity of Korean texts and reader comprehension.

Results: The Wilcoxon test on the GQS scores of answers from ChatGPT and CLOVA X showed no statistically significant difference in quality between the two LLMs (p>0.05). The chi-square test on the "Good rating" and "Poor rating" categories likewise showed no significant difference in response quality between the two LLMs (p>0.05). KReaD scores were higher for CLOVA X than for ChatGPT (p=0.036). The categorical analysis of the "Easy to read" and "Hard to read" variables revealed no significant difference (p>0.05).
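The two comparisons reported above — a paired Wilcoxon signed-rank test on ordinal GQS scores and a chi-square test on good/poor rating counts — can be sketched with SciPy. The scores and counts below are made-up placeholders for illustration only; they are not the study's data.

```python
# Illustrative sketch of the abstract's statistical comparisons using SciPy.
# All GQS ratings and counts here are hypothetical placeholder values.
from scipy.stats import wilcoxon, chi2_contingency

# Hypothetical paired GQS ratings (1-5) for the same ten questions
chatgpt_gqs = [4, 5, 4, 3, 5, 4, 4, 5, 3, 4]
clova_x_gqs = [4, 4, 5, 3, 4, 4, 5, 5, 4, 4]

# Paired, non-parametric comparison of ordinal quality scores
stat, p_wilcoxon = wilcoxon(chatgpt_gqs, clova_x_gqs)

# Hypothetical 2x2 contingency table: "Good" vs "Poor" ratings per model
table = [[8, 2],   # ChatGPT: good, poor
         [9, 1]]   # CLOVA X: good, poor
chi2, p_chi2, dof, expected = chi2_contingency(table)

print(f"Wilcoxon p = {p_wilcoxon:.3f}, chi-square p = {p_chi2:.3f}")
```

The Wilcoxon test is appropriate here because GQS is an ordinal scale with paired observations (the same question answered by both models), so a parametric paired t-test's normality assumption need not hold.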

Conclusion: Both ChatGPT and CLOVA X answered Korean-language cancer-related questions with no significant difference in overall quality.

Source journal: Yonsei Medical Journal (Medicine — Internal Medicine)
CiteScore: 4.50
Self-citation rate: 0.00%
Annual article count: 167
Review turnaround: 3 months
About the journal: The goal of the Yonsei Medical Journal (YMJ) is to publish high-quality manuscripts dedicated to clinical or basic research. Any author affiliated with an accredited biomedical institution may submit original articles, review articles, case reports, brief communications, and letters to the Editor.