基于 BERTopic 的中药处方用药规则数据挖掘算法研究

Medicine Advances Pub Date : 2023-11-27 DOI:10.1002/med4.39
Hongchen Li, Xinyi Lu, Yujia Wu, Jie Luo
{"title":"基于 BERTopic 的中药处方用药规则数据挖掘算法研究","authors":"Hongchen Li, Xinyi Lu, Yujia Wu, Jie Luo","doi":"10.1002/med4.39","DOIUrl":null,"url":null,"abstract":"A data mining algorithm is proposed based on BERTopic to provide new insights into the analysis of medication rules in Traditional Chinese Medicine (TCM) prescriptions.Using the BERTopic algorithm, collected TCM prescriptions for corneal diseases are converted to embeddings through a transformer based on the Bidirectional Encoder Representations from Transformers pre‐trained model. Then, Uniform Manifold Approximation and Projection is applied to perform dimensionality reduction in prescription embeddings. Subsequently, Hierarchical Density‐Based Spatial Clustering of Applications with Noise is used for clustering. Finally, class‐based term frequency–inverse document frequency is used to generate several main drug combinations from the clustered results.The highest frequency of drugs used included Buddleja officinalis, Bidens pilosa, Angelica sinensis, Eriocaulon buergerianum, and Raw Rehmannia glutinosa. The most frequent drug combinations were “Eriocaulon buergerianum, Raw Rehmannia glutinosa, Prunella vulgaris, Notopterygium incisum” “Lycii Fructus, Bidens pilosa, Buddleja officinalis” and “Kochiae Fructus, Cortex Dictamni.”The proposed data mining algorithm based on BERTopic demonstrated promising outcomes in the analysis of TCM prescription medication rules. This method exhibited simplicity and efficiency, thereby offering a novel avenue for analysis.","PeriodicalId":502918,"journal":{"name":"Medicine Advances","volume":"5 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Research on a data mining algorithm based on BERTopic for medication rules in Traditional Chinese Medicine prescriptions\",\"authors\":\"Hongchen Li, Xinyi Lu, Yujia Wu, Jie Luo\",\"doi\":\"10.1002/med4.39\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"A data mining algorithm is proposed based on BERTopic to provide new insights into the analysis of medication rules in Traditional Chinese Medicine (TCM) prescriptions.Using the BERTopic algorithm, collected TCM prescriptions for corneal diseases are converted to embeddings through a transformer based on the Bidirectional Encoder Representations from Transformers pre‐trained model. Then, Uniform Manifold Approximation and Projection is applied to perform dimensionality reduction in prescription embeddings. Subsequently, Hierarchical Density‐Based Spatial Clustering of Applications with Noise is used for clustering. Finally, class‐based term frequency–inverse document frequency is used to generate several main drug combinations from the clustered results.The highest frequency of drugs used included Buddleja officinalis, Bidens pilosa, Angelica sinensis, Eriocaulon buergerianum, and Raw Rehmannia glutinosa. The most frequent drug combinations were “Eriocaulon buergerianum, Raw Rehmannia glutinosa, Prunella vulgaris, Notopterygium incisum” “Lycii Fructus, Bidens pilosa, Buddleja officinalis” and “Kochiae Fructus, Cortex Dictamni.”The proposed data mining algorithm based on BERTopic demonstrated promising outcomes in the analysis of TCM prescription medication rules. This method exhibited simplicity and efficiency, thereby offering a novel avenue for analysis.\",\"PeriodicalId\":502918,\"journal\":{\"name\":\"Medicine Advances\",\"volume\":\"5 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-11-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Medicine Advances\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1002/med4.39\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Medicine Advances","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1002/med4.39","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

利用 BERTopic 算法,收集到的角膜病中医处方通过基于转换器预训练模型的双向编码器表示的转换器转换为嵌入。然后,应用统一曲面逼近和投影技术对处方嵌入进行降维处理。然后,使用基于密度的分层空间聚类(Hierarchical Density-Based Spatial Clustering of Applications with Noise)进行聚类。使用频率最高的药物包括百部、白头翁、当归、桔梗和生地黄。最常见的药物组合为 "桔梗、生地黄、防风、白术""枸杞子、牛膝、巴戟天 "和 "鸡血藤、独活"。该方法简单高效,为分析提供了一条新途径。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Research on a data mining algorithm based on BERTopic for medication rules in Traditional Chinese Medicine prescriptions
A data mining algorithm is proposed based on BERTopic to provide new insights into the analysis of medication rules in Traditional Chinese Medicine (TCM) prescriptions.Using the BERTopic algorithm, collected TCM prescriptions for corneal diseases are converted to embeddings through a transformer based on the Bidirectional Encoder Representations from Transformers pre‐trained model. Then, Uniform Manifold Approximation and Projection is applied to perform dimensionality reduction in prescription embeddings. Subsequently, Hierarchical Density‐Based Spatial Clustering of Applications with Noise is used for clustering. Finally, class‐based term frequency–inverse document frequency is used to generate several main drug combinations from the clustered results.The highest frequency of drugs used included Buddleja officinalis, Bidens pilosa, Angelica sinensis, Eriocaulon buergerianum, and Raw Rehmannia glutinosa. The most frequent drug combinations were “Eriocaulon buergerianum, Raw Rehmannia glutinosa, Prunella vulgaris, Notopterygium incisum” “Lycii Fructus, Bidens pilosa, Buddleja officinalis” and “Kochiae Fructus, Cortex Dictamni.”The proposed data mining algorithm based on BERTopic demonstrated promising outcomes in the analysis of TCM prescription medication rules. This method exhibited simplicity and efficiency, thereby offering a novel avenue for analysis.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信