Erfan Ghanad, Christel Weiß, Hui Gao, Christoph Reißfelder, Kamal Hummedah, Lei Han, Leihui Tong, Chengpeng Li, Cui Yang
Journal of Integrative and Complementary Medicine. DOI: 10.1089/jicm.2024.0902. Published 2025-07-02.
GPT Versus ERNIE for National Traditional Chinese Medicine Licensing Examination: Does Cultural Background Matter?
Purpose: This study evaluates the performance of large language models (LLMs) on the Chinese National Traditional Chinese Medicine Licensing Examination (TCMLE). Materials and Methods: We compared the performance of different versions of Generative Pre-trained Transformer (GPT) and Enhanced Representation through Knowledge Integration (ERNIE) on historical TCMLE questions. Results: ERNIE-4.0 outperformed all other models with an accuracy of 81.7%, followed by ERNIE-3.5 (75.2%), GPT-4o (74.8%), and GPT-4 Turbo (50.7%). On questions related to Western internal medicine, all models achieved high accuracy (above 86.7%). Conclusion: The study highlights the significance of cultural context in training data, which influences the performance of LLMs on specific medical examinations.
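The evaluation the abstract describes — scoring each model's answers to multiple-choice TCMLE questions against an official answer key and ranking models by accuracy — can be sketched as below. This is a minimal illustration, not the authors' actual pipeline; the model names come from the abstract, but the question IDs, answer key, and responses are hypothetical.

```python
# Sketch of a multiple-choice exam evaluation: compare each model's answers
# against an answer key and report per-model accuracy, best first.
# All question/answer data here is hypothetical, for illustration only.

def accuracy(model_answers, answer_key):
    """Fraction of key questions the model answered correctly."""
    correct = sum(1 for qid, gold in answer_key.items()
                  if model_answers.get(qid) == gold)
    return correct / len(answer_key)

answer_key = {"q1": "A", "q2": "C", "q3": "B", "q4": "D"}  # hypothetical key
responses = {  # hypothetical model outputs
    "ERNIE-4.0": {"q1": "A", "q2": "C", "q3": "B", "q4": "A"},
    "GPT-4o":    {"q1": "A", "q2": "B", "q3": "B", "q4": "A"},
}

# Rank models by accuracy, highest first.
for model, answers in sorted(responses.items(),
                             key=lambda kv: -accuracy(kv[1], answer_key)):
    print(f"{model}: {accuracy(answers, answer_key):.1%}")
```

On this toy data the sketch prints ERNIE-4.0 at 75.0% and GPT-4o at 50.0%; the study's reported figures (81.7% vs. 74.8%) come from the full historical question set.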