{"title":"骨肉瘤知识图谱问答系统:基于深度学习的知识图谱与大语言模型融合","authors":"Lulu Zhang , Weisong Zhao , Zhiwei Cheng , Yafei Jiang , Kai Tian , Jia Shi , Zhenyu Jiang , Yingqi Hua","doi":"10.1016/j.imed.2024.12.001","DOIUrl":null,"url":null,"abstract":"<div><h3>Objective</h3><div>Osteosarcoma is a prevalent primary malignant bone tumor in children and adolescents, accounting for approximately 5 % of childhood malignancies. Because of its rarity and biological complexity, treatment breakthroughs for osteosarcoma have been limited. To advance research in this field, we aimed to construct the first comprehensive osteosarcoma knowledge graph (OSKG) using the PubMed database.</div></div><div><h3>Methods</h3><div>A systematic search of PubMed (2003–2023) using the keyword “osteosarcoma” yielded 25,415 abstracts. Leveraging BioBERT, pretrained on biomedical corpora and fine-tuned with osteosarcoma-specific manual annotations, we identified 16 entity types and 17 biological relationships. The extracted elements were synthesized to create the OSKG, resulting in a deep learning-based knowledge base to explore osteosarcoma pathogenesis and molecular mechanisms. We then developed a specialized question-answering system (knowledge graph question answering (KGQA)) powered by ChatGLM3. This system employs advanced natural language processing and incorporates the OSKG to ensure optimal response quality and accuracy.</div></div><div><h3>Results</h3><div>The pretrained BioBERT averaged > 92 % accuracy in entity and relationship training. Evaluation using 100 pairs of gold-standard quizzes showed that the final quiz system outperformed other large language models in accuracy and robustness.</div></div><div><h3>Conclusion</h3><div>The system is designed to provide accurate disease-related queries and answers, effectively facilitating knowledge acquisition and reasoning in medical research and clinical practice. This project offers a robust tool for osteosarcoma research and promotes the deep integration of knowledge graphs and artificial intelligence technologies in the medical field.</div></div>","PeriodicalId":73400,"journal":{"name":"Intelligent medicine","volume":"5 2","pages":"Pages 99-110"},"PeriodicalIF":6.9000,"publicationDate":"2025-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Osteosarcoma knowledge graph question answering system: deep learning-based knowledge graph and large language model fusion\",\"authors\":\"Lulu Zhang , Weisong Zhao , Zhiwei Cheng , Yafei Jiang , Kai Tian , Jia Shi , Zhenyu Jiang , Yingqi Hua\",\"doi\":\"10.1016/j.imed.2024.12.001\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><h3>Objective</h3><div>Osteosarcoma is a prevalent primary malignant bone tumor in children and adolescents, accounting for approximately 5 % of childhood malignancies. Because of its rarity and biological complexity, treatment breakthroughs for osteosarcoma have been limited. To advance research in this field, we aimed to construct the first comprehensive osteosarcoma knowledge graph (OSKG) using the PubMed database.</div></div><div><h3>Methods</h3><div>A systematic search of PubMed (2003–2023) using the keyword “osteosarcoma” yielded 25,415 abstracts. Leveraging BioBERT, pretrained on biomedical corpora and fine-tuned with osteosarcoma-specific manual annotations, we identified 16 entity types and 17 biological relationships. The extracted elements were synthesized to create the OSKG, resulting in a deep learning-based knowledge base to explore osteosarcoma pathogenesis and molecular mechanisms. We then developed a specialized question-answering system (knowledge graph question answering (KGQA)) powered by ChatGLM3. This system employs advanced natural language processing and incorporates the OSKG to ensure optimal response quality and accuracy.</div></div><div><h3>Results</h3><div>The pretrained BioBERT averaged > 92 % accuracy in entity and relationship training. Evaluation using 100 pairs of gold-standard quizzes showed that the final quiz system outperformed other large language models in accuracy and robustness.</div></div><div><h3>Conclusion</h3><div>The system is designed to provide accurate disease-related queries and answers, effectively facilitating knowledge acquisition and reasoning in medical research and clinical practice. This project offers a robust tool for osteosarcoma research and promotes the deep integration of knowledge graphs and artificial intelligence technologies in the medical field.</div></div>\",\"PeriodicalId\":73400,\"journal\":{\"name\":\"Intelligent medicine\",\"volume\":\"5 2\",\"pages\":\"Pages 99-110\"},\"PeriodicalIF\":6.9000,\"publicationDate\":\"2025-05-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Intelligent medicine\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S2667102625000269\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Intelligent medicine","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2667102625000269","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
Osteosarcoma knowledge graph question answering system: deep learning-based knowledge graph and large language model fusion
Objective
Osteosarcoma is a prevalent primary malignant bone tumor in children and adolescents, accounting for approximately 5 % of childhood malignancies. Because of its rarity and biological complexity, treatment breakthroughs for osteosarcoma have been limited. To advance research in this field, we aimed to construct the first comprehensive osteosarcoma knowledge graph (OSKG) using the PubMed database.
Methods
A systematic search of PubMed (2003–2023) using the keyword “osteosarcoma” yielded 25,415 abstracts. Leveraging BioBERT, pretrained on biomedical corpora and fine-tuned with osteosarcoma-specific manual annotations, we identified 16 entity types and 17 biological relationships. The extracted elements were synthesized to create the OSKG, resulting in a deep learning-based knowledge base to explore osteosarcoma pathogenesis and molecular mechanisms. We then developed a specialized question-answering system (knowledge graph question answering (KGQA)) powered by ChatGLM3. This system employs advanced natural language processing and incorporates the OSKG to ensure optimal response quality and accuracy.
Results
The pretrained BioBERT averaged > 92 % accuracy in entity and relationship training. Evaluation using 100 pairs of gold-standard quizzes showed that the final quiz system outperformed other large language models in accuracy and robustness.
Conclusion
The system is designed to provide accurate disease-related queries and answers, effectively facilitating knowledge acquisition and reasoning in medical research and clinical practice. This project offers a robust tool for osteosarcoma research and promotes the deep integration of knowledge graphs and artificial intelligence technologies in the medical field.