A Clustering Algorithm of Four Character Medicine Effect Phrases in TCM Patents

Na Deng, Song Lin, Caiquan Xiong, Desheng Li
Published in: 2018 8th International Conference on Electronics Information and Emergency Communication (ICEIEC), 2018-06-01
DOI: 10.1109/ICEIEC.2018.8473529 (https://doi.org/10.1109/ICEIEC.2018.8473529)
Citations: 2

Abstract

In the era of big data, data analysis and data mining are important decision-support tools. Patent retrieval is a critical step: its accuracy and comprehensiveness directly affect the results of patent analysis and mining. Almost all mainstream patent retrieval systems currently match on retrieval words, so they miss many similar patents. To improve the recall of Chinese patent retrieval and enable semantic retrieval, this paper exploits the word-formation and part-of-speech combination characteristics of four-character medicine-effect phrases, proposes a method for calculating the similarity between such phrases, and presents a K-centroid clustering algorithm for them. Experimental results show the effectiveness of the method.
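The abstract does not give the similarity formula or the clustering steps, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that "K-centroid" means each cluster is represented by one of its own phrases (a medoid-style centre), and it substitutes a hypothetical positional character-match score for the paper's similarity based on word formation and part of speech. The function names `char_similarity` and `k_medoids` and the sample phrases are illustrative and do not come from the paper.

```python
from typing import Dict, List


def char_similarity(a: str, b: str) -> float:
    """Hypothetical stand-in similarity: share of positions holding the same character.

    The paper's measure uses word-formation and part-of-speech features instead.
    """
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def k_medoids(phrases: List[str], k: int, max_iter: int = 20) -> Dict[str, List[str]]:
    """Group phrases around k centre phrases chosen from the data itself."""
    phrases = list(dict.fromkeys(phrases))  # drop duplicates, keep input order
    if not 1 <= k <= len(phrases):
        raise ValueError("k must be between 1 and the number of distinct phrases")
    medoids = phrases[:k]  # naive deterministic initialisation (an assumption)
    clusters: Dict[str, List[str]] = {}
    for _ in range(max_iter):
        # Assignment step: each phrase joins the centre it is most similar to.
        clusters = {m: [] for m in medoids}
        for p in phrases:
            best = max(medoids, key=lambda m: char_similarity(p, m))
            clusters[best].append(p)
        # Update step: the new centre of a cluster is the member with the
        # highest total similarity to all members of that cluster.
        new_medoids = [
            max(members, key=lambda c: sum(char_similarity(c, o) for o in members))
            for members in clusters.values()
        ]
        if new_medoids == medoids:
            break  # centres are stable, so the assignment will not change
        medoids = new_medoids
    return clusters


if __name__ == "__main__":
    # Illustrative four-character effect phrases, not data from the paper.
    sample = ["清热解毒", "清热利湿", "清热泻火", "活血化瘀", "活血止痛", "活血通络"]
    for centre, members in k_medoids(sample, k=2).items():
        print(centre, members)
```

Choosing centres from the data avoids having to average strings, which is why a medoid-style update is used here. Swapping in a semantic similarity function would only require replacing `char_similarity`.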