A comparative study of English and Japanese ChatGPT responses to anaesthesia-related medical questions

Kazuo Ando , Masaki Sato , Shin Wakatsuki , Ryotaro Nagai , Kumiko Chino , Hinata Kai , Tomomi Sasaki , Rie Kato , Teresa Phuongtram Nguyen , Nan Guo , Pervez Sultan
{"title":"英语和日语 ChatGPT 对麻醉相关医疗问题回答的比较研究","authors":"Kazuo Ando ,&nbsp;Masaki Sato ,&nbsp;Shin Wakatsuki ,&nbsp;Ryotaro Nagai ,&nbsp;Kumiko Chino ,&nbsp;Hinata Kai ,&nbsp;Tomomi Sasaki ,&nbsp;Rie Kato ,&nbsp;Teresa Phuongtram Nguyen ,&nbsp;Nan Guo ,&nbsp;Pervez Sultan","doi":"10.1016/j.bjao.2024.100296","DOIUrl":null,"url":null,"abstract":"<div><h3>Background</h3><p>The expansion of artificial intelligence (AI) within large language models (LLMs) has the potential to streamline healthcare delivery. Despite the increased use of LLMs, disparities in their performance particularly in different languages, remain underexplored. This study examines the quality of ChatGPT responses in English and Japanese, specifically to questions related to anaesthesiology.</p></div><div><h3>Methods</h3><p>Anaesthesiologists proficient in both languages were recruited as experts in this study. Ten frequently asked questions in anaesthesia were selected and translated for evaluation. Three non-sequential responses from ChatGPT were assessed for content quality (accuracy, comprehensiveness, and safety) and communication quality (understanding, empathy/tone, and ethics) by expert evaluators.</p></div><div><h3>Results</h3><p>Eight anaesthesiologists evaluated English and Japanese LLM responses. The overall quality for all questions combined was higher in English compared with Japanese responses. Content and communication quality were significantly higher in English compared with Japanese LLMs responses (both <em>P</em>&lt;0.001) in all three responses. Comprehensiveness, safety, and understanding were higher scores in English LLM responses. In all three responses, more than half of the evaluators marked overall English responses as better than Japanese responses.</p></div><div><h3>Conclusions</h3><p>English LLM responses to anaesthesia-related frequently asked questions were superior in quality to Japanese responses when assessed by bilingual anaesthesia experts in this report. This study highlights the potential for language-related disparities in healthcare information and the need to improve the quality of AI responses in underrepresented languages. Future studies are needed to explore these disparities in other commonly spoken languages and to compare the performance of different LLMs.</p></div>","PeriodicalId":72418,"journal":{"name":"BJA open","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772609624000406/pdfft?md5=17018b4a959c51babd6313efa948146d&pid=1-s2.0-S2772609624000406-main.pdf","citationCount":"0","resultStr":"{\"title\":\"A comparative study of English and Japanese ChatGPT responses to anaesthesia-related medical questions\",\"authors\":\"Kazuo Ando ,&nbsp;Masaki Sato ,&nbsp;Shin Wakatsuki ,&nbsp;Ryotaro Nagai ,&nbsp;Kumiko Chino ,&nbsp;Hinata Kai ,&nbsp;Tomomi Sasaki ,&nbsp;Rie Kato ,&nbsp;Teresa Phuongtram Nguyen ,&nbsp;Nan Guo ,&nbsp;Pervez Sultan\",\"doi\":\"10.1016/j.bjao.2024.100296\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><h3>Background</h3><p>The expansion of artificial intelligence (AI) within large language models (LLMs) has the potential to streamline healthcare delivery. Despite the increased use of LLMs, disparities in their performance particularly in different languages, remain underexplored. 
This study examines the quality of ChatGPT responses in English and Japanese, specifically to questions related to anaesthesiology.</p></div><div><h3>Methods</h3><p>Anaesthesiologists proficient in both languages were recruited as experts in this study. Ten frequently asked questions in anaesthesia were selected and translated for evaluation. Three non-sequential responses from ChatGPT were assessed for content quality (accuracy, comprehensiveness, and safety) and communication quality (understanding, empathy/tone, and ethics) by expert evaluators.</p></div><div><h3>Results</h3><p>Eight anaesthesiologists evaluated English and Japanese LLM responses. The overall quality for all questions combined was higher in English compared with Japanese responses. Content and communication quality were significantly higher in English compared with Japanese LLMs responses (both <em>P</em>&lt;0.001) in all three responses. Comprehensiveness, safety, and understanding were higher scores in English LLM responses. In all three responses, more than half of the evaluators marked overall English responses as better than Japanese responses.</p></div><div><h3>Conclusions</h3><p>English LLM responses to anaesthesia-related frequently asked questions were superior in quality to Japanese responses when assessed by bilingual anaesthesia experts in this report. This study highlights the potential for language-related disparities in healthcare information and the need to improve the quality of AI responses in underrepresented languages. Future studies are needed to explore these disparities in other commonly spoken languages and to compare the performance of different LLMs.</p></div>\",\"PeriodicalId\":72418,\"journal\":{\"name\":\"BJA open\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-06-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.sciencedirect.com/science/article/pii/S2772609624000406/pdfft?md5=17018b4a959c51babd6313efa948146d&pid=1-s2.0-S2772609624000406-main.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"BJA open\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S2772609624000406\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"BJA open","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2772609624000406","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract


Background

The expansion of artificial intelligence (AI) within large language models (LLMs) has the potential to streamline healthcare delivery. Despite the increased use of LLMs, disparities in their performance, particularly across different languages, remain underexplored. This study examines the quality of ChatGPT responses in English and Japanese, specifically to questions related to anaesthesiology.

Methods

Anaesthesiologists proficient in both languages were recruited as experts in this study. Ten frequently asked questions in anaesthesia were selected and translated for evaluation. Three non-sequential responses from ChatGPT were assessed for content quality (accuracy, comprehensiveness, and safety) and communication quality (understanding, empathy/tone, and ethics) by expert evaluators.
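For illustration only, the sketch below shows one way such non-sequential responses could be collected programmatically. The abstract does not state how ChatGPT was accessed, which model version was queried, or the exact prompts, so the model name, example questions, and use of the OpenAI Python client are assumptions rather than the authors' method; each question is sent as three independent requests with no conversation history to approximate non-sequential responses generated in separate sessions.

```python
# Illustrative sketch only: model name, questions, and API access are assumptions,
# not the study's actual materials or procedure.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
MODEL = "gpt-3.5-turbo"  # placeholder; the ChatGPT version used in the study is not given here

questions = {
    "en": ["What are the common side effects of general anaesthesia?"],  # example FAQ
    "ja": ["全身麻酔の一般的な副作用は何ですか？"],                        # its translation
}

responses = {}
for lang, faqs in questions.items():
    for q in faqs:
        # Three independent requests, each without conversation history,
        # to mirror "non-sequential" responses from separate sessions.
        responses[(lang, q)] = [
            client.chat.completions.create(
                model=MODEL,
                messages=[{"role": "user", "content": q}],
            ).choices[0].message.content
            for _ in range(3)
        ]
```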

Results

Eight anaesthesiologists evaluated English and Japanese LLM responses. The overall quality for all questions combined was higher for English than for Japanese responses. Content and communication quality were significantly higher for English than for Japanese LLM responses (both P<0.001) in all three responses. Comprehensiveness, safety, and understanding scores were higher in English LLM responses. For all three responses, more than half of the evaluators rated the overall English response as better than the Japanese response.
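The abstract reports P<0.001 for the English-versus-Japanese comparisons but does not name the statistical test used. As a hedged illustration only, a paired non-parametric comparison such as the Wilcoxon signed-rank test on per-evaluator scores might look like the sketch below; the score values are invented for the example and are not study data.

```python
# Minimal sketch, not the authors' analysis: the test choice and the scores below
# are assumptions made purely for illustration.
from scipy.stats import wilcoxon

# Hypothetical mean content-quality scores from 8 bilingual evaluators.
english_scores = [4.5, 4.2, 4.8, 4.0, 4.6, 4.3, 4.7, 4.4]
japanese_scores = [3.8, 3.5, 4.0, 3.2, 3.9, 3.6, 4.1, 3.4]

stat, p_value = wilcoxon(english_scores, japanese_scores)
print(f"Wilcoxon signed-rank statistic={stat:.2f}, P={p_value:.4f}")
```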

Conclusions

English LLM responses to anaesthesia-related frequently asked questions were superior in quality to Japanese responses when assessed by bilingual anaesthesia experts in this report. This study highlights the potential for language-related disparities in healthcare information and the need to improve the quality of AI responses in underrepresented languages. Future studies are needed to explore these disparities in other commonly spoken languages and to compare the performance of different LLMs.

Source journal
BJA Open (Anesthesiology and Pain Medicine)
CiteScore: 0.60
Self-citation rate: 0.00%
Articles published: 0
Review time: 83 days