Authors: Junfeng YAN, Zhihua WEN, Beiji ZOU (Professor)
Journal: Digital Chinese Medicine, volume 5, issue 4, pages 419-428 (JCR Q3, Medicine)
Publication date: 2022-12-01
Publication type: Journal Article
DOI: 10.1016/j.dcmed.2022.12.007
URL: https://www.sciencedirect.com/science/article/pii/S2589377722000775
Citations: 1
Heterogeneous graph construction and node representation learning method of Treatise on Febrile Diseases based on graph convolutional network
Objective
To construct a symptom-formula-herb heterogeneous graph dataset from Treatise on Febrile Diseases (Shang Han Lun,《伤寒论》) and to explore an optimal method for learning node attribute representations based on a graph convolutional network (GCN).
Methods
Clauses containing symptoms, formulas, and herbs were extracted from Treatise on Febrile Diseases to construct symptom-formula-herb heterogeneous graphs. On these graphs, a GCN-based node representation learning method was proposed: the Traditional Chinese Medicine Graph Convolution Network (TCM-GCN). The symptom-formula, symptom-herb, and formula-herb heterogeneous graphs were processed with the TCM-GCN to perform high-order message passing and neighbor aggregation, yielding new node representations. The resulting sum-aggregations of the symptom, formula, and herb nodes lay a foundation for the downstream tasks of the prediction models.
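To make the idea of high-order message passing with sum-aggregation concrete, the following is a minimal sketch on a toy symptom-formula bipartite graph. It is not the TCM-GCN itself: the abstract does not give the layer equations, so the sketch assumes plain sum-aggregation with a self-connection and no learned weights or nonlinearity. The node names (`s0`, `f0`, ...), the edge list, and the helper functions are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical toy edges (symptom, formula); not data from the paper.
EDGES = [("s0", "f0"), ("s1", "f0"), ("s1", "f1"), ("s2", "f1")]
NODES = ["s0", "s1", "s2", "f0", "f1"]


def build_neighbors(edges):
    """Build an undirected adjacency list for the bipartite graph."""
    nbrs = defaultdict(list)
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)
    return nbrs


def add_vectors(a, b):
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]


def propagate(nodes, edges, hops=2):
    """Run `hops` rounds of sum-aggregation with a self-connection.

    Assumption: no weight matrices or activation functions, so this only
    shows how information spreads across hops.
    """
    nbrs = build_neighbors(edges)
    # One-hot initial features: node i gets a 1 in dimension i.
    feats = {
        n: [1 if i == j else 0 for j in range(len(nodes))]
        for i, n in enumerate(nodes)
    }
    for _ in range(hops):
        updated = {}
        for n in nodes:
            h = feats[n]  # self-connection keeps the node's own features
            for m in nbrs[n]:
                h = add_vectors(h, feats[m])  # sum over direct neighbors
            updated[n] = h
        feats = updated
    return feats


embeddings = propagate(NODES, EDGES, hops=2)
print(embeddings["s0"])
```

After two hops, the vector for `s0` has a nonzero entry in the dimension of `s1`, even though the two symptoms are not directly connected: they share the formula `f0`, which is the kind of second-order relation that repeated aggregation exposes.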
Results
Comparisons among node representations with multi-hot encoding, non-fusion encoding, and fusion encoding showed that the Precision@10, Recall@10, and F1-score@10 of the fusion encoding were 9.77%, 6.65%, and 8.30% higher, respectively, than those of the non-fusion encoding in the model's prediction experiments.
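For readers unfamiliar with the top-k metrics reported above, here is a minimal sketch of how Precision@k, Recall@k, and F1-score@k are commonly computed for a single ranked prediction list. The abstract does not state the exact definitions or averaging scheme used in the paper, so this follows the usual convention as an assumption. The function name `precision_recall_f1_at_k` and the item labels are hypothetical.

```python
def precision_recall_f1_at_k(ranked, relevant, k=10):
    """Compute Precision@k, Recall@k, and F1@k for one ranked list.

    ranked:   predicted items, best first
    relevant: set of ground-truth items
    Assumption: precision divides by k, recall by the number of relevant items.
    """
    top_k = ranked[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall > 0:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical example: 12 ranked herbs, 4 of which are truly prescribed.
ranked_herbs = ["h%d" % i for i in range(1, 13)]
true_herbs = {"h1", "h3", "h5", "h11"}

p, r, f1 = precision_recall_f1_at_k(ranked_herbs, true_herbs, k=10)
print(round(p, 3), round(r, 3), round(f1, 3))
```

In this toy case three of the four true herbs fall within the top 10, so precision is 3/10 and recall is 3/4. In an evaluation over many clauses, such per-list scores would typically be averaged.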
Conclusion
Node representations obtained by fusion encoding achieved comparatively good results, indicating that the TCM-GCN is effective at learning node-level representations of the heterogeneous graph dataset built from Treatise on Febrile Diseases and can improve the performance of the diagnosis model's downstream tasks.