GPT Versus ERNIE for National Traditional Chinese Medicine Licensing Examination: Does Cultural Background Matter?

Erfan Ghanad, Christel Weiß, Hui Gao, Christoph Reißfelder, Kamal Hummedah, Lei Han, Leihui Tong, Chengpeng Li, Cui Yang

Journal of Integrative and Complementary Medicine, published 2025-07-02. DOI: 10.1089/jicm.2024.0902
Abstract
Purpose: This study evaluates the performance of large language models (LLMs) on the Chinese National Traditional Chinese Medicine Licensing Examination (TCMLE). Materials and Methods: We compared the performance of different versions of Generative Pre-trained Transformer (GPT) and Enhanced Representation through Knowledge Integration (ERNIE) on historical TCMLE questions. Results: ERNIE-4.0 outperformed all other models with an accuracy of 81.7%, followed by ERNIE-3.5 (75.2%), GPT-4o (74.8%), and GPT-4 Turbo (50.7%). On questions related to Western internal medicine, all models achieved accuracies above 86.7%. Conclusion: The study highlights the influence of the cultural context of training data on the performance of LLMs in specific medical examinations.