Modified Path Measure to Assess Sentence Similarity

2018 Conference on Information and Communication Technology (CICT) Pub Date : 2018-10-01 DOI:10.1109/INFOCOMTECH.2018.8722419

M. K. Prasad, Poonam Sharma



Abstract

Sentence similarity can be calculated with various measures, but measures that use semantic information between words perform better than those that do not. Among these, corpus-based and knowledge-based measures are the most significant for sentence similarity. Applications that use knowledge-based measures tend to give more accurate results that agree closely with human similarity judgments. These measures derive the similarity between words from the path length between concepts or from the information content between the words. Some semantic similarity measures generate synonym sets of the words to evaluate similarity, but they focus mainly on noun or verb synonym sets; this can be improved by generating all the synonym sets. In this paper, a metric, PathM, is proposed that calculates word-pair similarity by generating all the synonym sets of the words, and it is extended to calculate sentence similarity. The measure is compared with the path measure, and the results obtained are better than those of other measures.
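The abstract does not give the PathM formula, so the following is only a minimal sketch of the general idea, not the authors' method. It uses the standard path measure, 1 / (1 + shortest path length between two concepts), takes the maximum over every sense of each word rather than only noun or verb senses, and averages best word matches to score a sentence pair. The toy taxonomy `EDGES`, the sense inventory `SENSES`, and the averaging scheme in `sentence_similarity` are all assumptions made for illustration.

```python
from collections import deque
from itertools import product

# Hypothetical toy taxonomy: undirected is-a links between concepts.
EDGES = [
    ("entity", "animal"), ("entity", "artifact"),
    ("animal", "dog"), ("animal", "cat"),
    ("artifact", "car"), ("artifact", "bicycle"),
]

# Hypothetical sense inventory: every word maps to ALL of its concepts.
SENSES = {
    "dog": ["dog"], "puppy": ["dog"], "cat": ["cat"],
    "car": ["car"], "automobile": ["car"], "bike": ["bicycle"],
    "jaguar": ["cat", "car"],  # ambiguous word with two senses
}


def build_graph(edges):
    """Build an adjacency map from a list of undirected edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


GRAPH = build_graph(EDGES)


def shortest_path_length(src, dst):
    """Breadth-first search; returns the edge count, or None if unreachable."""
    if src == dst:
        return 0
    seen, queue = {src}, deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in GRAPH.get(node, ()):
            if nxt == dst:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def concept_similarity(c1, c2):
    """Standard path measure: 1 / (1 + shortest path length)."""
    dist = shortest_path_length(c1, c2)
    return 0.0 if dist is None else 1.0 / (1.0 + dist)


def word_similarity(w1, w2):
    """Best path score over all sense pairs of the two words."""
    pairs = product(SENSES.get(w1, []), SENSES.get(w2, []))
    return max((concept_similarity(a, b) for a, b in pairs), default=0.0)


def sentence_similarity(s1, s2):
    """Assumed aggregation: symmetric average of best word-to-word matches."""
    t1, t2 = s1.lower().split(), s2.lower().split()
    if not t1 or not t2:
        return 0.0

    def directed(src, dst):
        return sum(max(word_similarity(w, v) for v in dst) for w in src) / len(src)

    return (directed(t1, t2) + directed(t2, t1)) / 2
```

In this sketch, `word_similarity("jaguar", "automobile")` picks the vehicle sense of "jaguar" because the maximum runs over all of its senses, which is the effect the abstract attributes to generating all synonym sets. A real implementation would replace the toy taxonomy with a lexical database such as WordNet.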

