基于组合的多信息源语义相似度度量

Hoa A. Nguyen, H. Al-Mubaid
{"title":"基于组合的多信息源语义相似度度量","authors":"Hoa A. Nguyen, H. Al-Mubaid","doi":"10.1109/IRI.2006.252484","DOIUrl":null,"url":null,"abstract":"The semantic similarity techniques are interested in determining how much two concepts, or terms, are similar according to a given ontology. This paper proposes a method for measuring semantic similarity/distance between terms. The measure combines strengths and complements weaknesses of existing measures that use ontology as primary source. The proposed measure uses a new feature of common specificity (CSpec) besides the path length feature. The CSpec feature is derived from (1) information content of concepts, and (2) information content of the ontology given a corpus. We evaluated the proposed measure with benchmark test set of term pairs scored for similarity by human experts. The experimental results demonstrated that our similarity measure is effective and outperforms the existing measures. The proposed semantic similarity measure gives the best correlation (0.874) with human scores in the benchmark test set compared to the existing measures","PeriodicalId":402255,"journal":{"name":"2006 IEEE International Conference on Information Reuse & Integration","volume":"63 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2006-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"21","resultStr":"{\"title\":\"A Combination-based Semantic Similarity Measure using Multiple Information Sources\",\"authors\":\"Hoa A. Nguyen, H. Al-Mubaid\",\"doi\":\"10.1109/IRI.2006.252484\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The semantic similarity techniques are interested in determining how much two concepts, or terms, are similar according to a given ontology. This paper proposes a method for measuring semantic similarity/distance between terms. The measure combines strengths and complements weaknesses of existing measures that use ontology as primary source. The proposed measure uses a new feature of common specificity (CSpec) besides the path length feature. The CSpec feature is derived from (1) information content of concepts, and (2) information content of the ontology given a corpus. We evaluated the proposed measure with benchmark test set of term pairs scored for similarity by human experts. The experimental results demonstrated that our similarity measure is effective and outperforms the existing measures. The proposed semantic similarity measure gives the best correlation (0.874) with human scores in the benchmark test set compared to the existing measures\",\"PeriodicalId\":402255,\"journal\":{\"name\":\"2006 IEEE International Conference on Information Reuse & Integration\",\"volume\":\"63 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2006-12-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"21\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2006 IEEE International Conference on Information Reuse & Integration\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IRI.2006.252484\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2006 IEEE International Conference on Information Reuse & Integration","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IRI.2006.252484","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 21

摘要

语义相似技术感兴趣的是根据给定的本体确定两个概念或术语的相似程度。本文提出了一种测量词间语义相似度/距离的方法。该度量结合了使用本体作为主要来源的现有度量的优点并补充了缺点。除了路径长度特征外,该方法还使用了一种新的公共特异性特征(CSpec)。CSpec特性来源于(1)概念的信息内容,(2)给定语料库的本体的信息内容。我们使用人类专家对相似度评分的术语对的基准测试集来评估所提出的度量。实验结果表明,我们的相似度度量是有效的,并且优于现有的度量。与现有度量相比,所提出的语义相似性度量与基准测试集中的人类分数具有最佳相关性(0.874)
本文章由计算机程序翻译,如有差异,请以英文原文为准。
A Combination-based Semantic Similarity Measure using Multiple Information Sources
The semantic similarity techniques are interested in determining how much two concepts, or terms, are similar according to a given ontology. This paper proposes a method for measuring semantic similarity/distance between terms. The measure combines strengths and complements weaknesses of existing measures that use ontology as primary source. The proposed measure uses a new feature of common specificity (CSpec) besides the path length feature. The CSpec feature is derived from (1) information content of concepts, and (2) information content of the ontology given a corpus. We evaluated the proposed measure with benchmark test set of term pairs scored for similarity by human experts. The experimental results demonstrated that our similarity measure is effective and outperforms the existing measures. The proposed semantic similarity measure gives the best correlation (0.874) with human scores in the benchmark test set compared to the existing measures
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信