TCMLCM: an intelligent question-answering model for traditional Chinese medicine lung cancer based on the KG2TRAG method

Q3 Medicine
Zhou Chunfang, Gong Qingyue, Zhan Wendong, Zhu Jinyang, Luan Huidan
Journal: Digital Chinese Medicine, volume 8, issue 1, pages 36-45
Publication date: 2025-03-01
Publication type: Journal Article
DOI: 10.1016/j.dcmed.2025.03.011
URL: https://www.sciencedirect.com/science/article/pii/S2589377725000291

Abstract

Objective

To improve the accuracy and professionalism of a question-answering (QA) model for traditional Chinese medicine (TCM) lung cancer by integrating large language models with structured knowledge graphs, using the knowledge graph (KG) to text-enhanced retrieval-augmented generation (KG2TRAG) method.

Methods

The TCM lung cancer model (TCMLCM) was constructed by fine-tuning ChatGLM2-6B on the specialized datasets Tianchi TCM, HuangDi, and ShenNong-TCM-Dataset, as well as on a TCM lung cancer KG. The KG2TRAG method was applied to enhance knowledge retrieval: it converts KG triples into natural language text via ChatGPT-aided linearization and leverages large language models (LLMs) for context-aware reasoning. For a comprehensive comparison, MedicalGPT, HuatuoGPT, and BenTsao were selected as baseline models. Performance was evaluated using bilingual evaluation understudy (BLEU), recall-oriented understudy for gisting evaluation (ROUGE), accuracy, and the domain-specific TCM-LCEval metrics, and was validated by TCM oncology experts who assessed answer accuracy, professionalism, and usability.
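The triple-to-text and retrieval steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the triples, relation names, and templates are hypothetical placeholders, a fixed template lookup stands in for the ChatGPT-aided linearization, and a simple token-overlap score stands in for whatever retriever the paper uses.

```python
import re
from collections import Counter

# Hypothetical triples for illustration only; not taken from the paper's KG.
TRIPLES = [
    ("Syndrome A", "is_treated_by", "Formula X"),
    ("Formula X", "contains_herb", "Herb P"),
    ("Syndrome B", "has_symptom", "dry cough"),
]

# Assumption: fixed templates replace the ChatGPT-aided linearization step.
TEMPLATES = {
    "is_treated_by": "{head} is treated by {tail}.",
    "contains_herb": "{head} contains the herb {tail}.",
    "has_symptom": "{head} presents with the symptom {tail}.",
}


def linearize(triple):
    """Turn one (head, relation, tail) triple into a natural-language sentence."""
    head, relation, tail = triple
    template = TEMPLATES.get(relation, "{head} {relation} {tail}.")
    return template.format(head=head, relation=relation.replace("_", " "), tail=tail)


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(question, passages, top_k=2):
    """Rank passages by token overlap with the question; drop zero-overlap ones."""
    question_counts = Counter(tokenize(question))
    scored = []
    for passage in passages:
        overlap = sum((question_counts & Counter(tokenize(passage))).values())
        scored.append((overlap, passage))
    scored.sort(key=lambda item: item[0], reverse=True)  # stable for ties
    return [passage for score, passage in scored[:top_k] if score > 0]


def build_prompt(question, passages):
    """Assemble the retrieved sentences and the question into one LLM prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


PASSAGES = [linearize(t) for t in TRIPLES]

if __name__ == "__main__":
    question = "Which herb does Formula X contain?"
    print(build_prompt(question, retrieve(question, PASSAGES)))
```

The resulting prompt would then be passed to the fine-tuned model. Linearizing triples first means the retriever and the generator both operate on plain sentences rather than on graph structure.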

Results

TCMLCM achieved the best performance across all metrics, including a BLEU score of 32.15%, a ROUGE-L score of 59.08%, and an accuracy of 79.68%. Notably, on the TCM-specific TCM-LCEval assessment, its performance was 3% to 12% higher than that of the baseline models. Expert evaluations highlighted superior performance in accuracy and professionalism.
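For readers unfamiliar with ROUGE-L, the metric is based on the longest common subsequence (LCS) between a generated answer and a reference answer. The sketch below shows one common formulation, the balanced F-measure over whitespace tokens. The abstract does not state which tokenization or F-measure weighting was used, so both are assumptions here, and the function names are illustrative.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """Balanced ROUGE-L F-measure over whitespace tokens (an assumed variant)."""
    cand_tokens = candidate.split()
    ref_tokens = reference.split()
    if not cand_tokens or not ref_tokens:
        return 0.0
    lcs = lcs_length(cand_tokens, ref_tokens)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand_tokens)
    recall = lcs / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Chinese answers would need a word or character segmenter in place of `str.split`, which can change the score noticeably.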

Conclusion

TCMLCM offers an innovative solution for TCM lung cancer QA and demonstrates the feasibility of integrating structured KGs with LLMs. This work advances intelligent TCM healthcare tools and lays a foundation for future AI-driven applications in traditional medicine.
Source journal: Digital Chinese Medicine (category: Medicine, Complementary and Alternative Medicine)
CiteScore: 1.80
Self-citation rate: 0.00%
Articles published: 126
Review time: 63 days