Hong Zhou, Hong-Lin Wang, Yu-Yu Duan, Zi-Neng Yan, Rui Luo, Xiang-Xin Lv, Yi Xie, Jia-Yao Zhang, Jia-Ming Yang, Ming-di Xue, Ying Fang, Lin Lu, Peng-Ran Liu, Zhe-Wei Ye
Current Medical Science, pp. 1001-1005. Journal Article. Published 2024-10-01 (Epub 2024-10-05). DOI: 10.1007/s11596-024-2929-4 (https://doi.org/10.1007/s11596-024-2929-4)
Enhancing Orthopedic Knowledge Assessments: The Performance of Specialized Generative Language Model Optimization.
Objective: This study aimed to evaluate and compare the effectiveness of knowledge base-optimized and unoptimized large language models (LLMs) in the field of orthopedics to explore optimization strategies for the application of LLMs in specific fields.
Methods: This research constructed a specialized knowledge base using clinical guidelines from the American Academy of Orthopaedic Surgeons (AAOS) and authoritative orthopedic publications. A total of 30 orthopedic-related questions covering aspects such as anatomical knowledge, disease diagnosis, fracture classification, treatment options, and surgical techniques were input into both the knowledge base-optimized and unoptimized versions of the GPT-4, ChatGLM, and Spark LLM, with their generated responses recorded. The overall quality, accuracy, and comprehensiveness of these responses were evaluated by 3 experienced orthopedic surgeons.
Results: Compared with its unoptimized counterpart, the optimized version of GPT-4 showed improvements of 15.3% in overall quality, 12.5% in accuracy, and 12.8% in comprehensiveness; ChatGLM showed improvements of 24.8%, 16.1%, and 19.6%, respectively; and Spark LLM showed improvements of 6.5%, 14.5%, and 24.7%, respectively.
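The abstract does not state how the percentage improvements were derived. One plausible reading is a relative change in the mean rater score between the unoptimized and optimized versions of a model. The Python sketch below illustrates that reading only; the function name `relative_improvement` and the ratings are hypothetical and are not taken from the study.

```python
from statistics import mean


def relative_improvement(baseline_scores, optimized_scores):
    """Percent change in mean score from baseline to optimized.

    Assumption: improvement is reported relative to the baseline mean.
    The study may have used a different definition.
    """
    base = mean(baseline_scores)
    if base == 0:
        raise ValueError("baseline mean must be non-zero")
    return (mean(optimized_scores) - base) / base * 100.0


# Hypothetical ratings for illustration only (not data from the study).
baseline = [3, 4, 3, 4]    # mean 3.5
optimized = [4, 4, 4, 5]   # mean 4.25

print(round(relative_improvement(baseline, optimized), 1))  # prints 21.4
```

Under this definition, a model with a low baseline score can show a large percentage gain from a modest absolute gain, which is worth keeping in mind when comparing the figures across the three models.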
Conclusion: The optimization of knowledge bases significantly enhances the quality, accuracy, and comprehensiveness of the responses provided by the 3 models in the orthopedic field. Therefore, knowledge base optimization is an effective method for improving the performance of LLMs in specific fields.
About the journal:
Current Medical Science provides a forum for peer-reviewed papers in the medical sciences, promoting academic exchange between Chinese researchers and doctors and their foreign counterparts. The journal covers biomedical subjects such as physiology, biochemistry, molecular biology, pharmacology, pathology, and pathophysiology, as well as clinical research in fields such as surgery, internal medicine, obstetrics and gynecology, pediatrics, and otorhinolaryngology. Its articles are mainly in English, with a very small number in German as a tribute to the journal's German founder. It is the only medical periodical in Western languages sponsored by an educational institution located in central China.