A new approach for finding semantic similar scientific articles

Journal of Advanced Computer Science and Technology, Pub Date: 2015-02-16, DOI: 10.14419/JACST.V4I1.4012

Masumeh Islami Nasab, R. Javidan


Citations: 5

Abstract

Calculating article similarities enables users to find similar articles and documents in a collection. Identifying similar documents is useful for text applications such as document-to-document similarity search, plagiarism checking, text mining for repetition, and text filtering. This paper proposes a new method for calculating the semantic similarity of articles. WordNet is used to find semantic associations between words. The proposed technique first compares the parts of the two articles pairwise and then computes the final score as a weighted mean of the part-level similarities. The results are compared with human-assigned scores using Pearson's correlation coefficient. The proposed system achieves a correlation coefficient above 87 percent, which indicates that it identifies similarities accurately.
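The pipeline outlined in the abstract (part-by-part comparison, a weighted mean over parts, and evaluation against human scores with Pearson's correlation) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the part names, the weights, and the exact-match `word_similarity` stand-in (used here in place of a WordNet-based measure) are all assumptions, since the abstract does not specify them.

```python
from math import sqrt

# Hypothetical part names and weights; the abstract does not give them.
PART_WEIGHTS = {"title": 0.3, "abstract": 0.3, "keywords": 0.2, "body": 0.2}


def word_similarity(a: str, b: str) -> float:
    """Stand-in for a WordNet-based measure: 1.0 for identical words, else 0.0."""
    return 1.0 if a.lower() == b.lower() else 0.0


def part_similarity(words_a: list, words_b: list) -> float:
    """Average, in both directions, of each word's best match in the other part."""
    if not words_a or not words_b:
        return 0.0
    best_a = sum(max(word_similarity(a, b) for b in words_b) for a in words_a) / len(words_a)
    best_b = sum(max(word_similarity(a, b) for a in words_a) for b in words_b) / len(words_b)
    return (best_a + best_b) / 2


def article_similarity(art_a: dict, art_b: dict, weights: dict = PART_WEIGHTS) -> float:
    """Weighted mean of the part-by-part similarities of two articles."""
    total = sum(weights.values())
    score = sum(
        w * part_similarity(art_a.get(part, []), art_b.get(part, []))
        for part, w in weights.items()
    )
    return score / total


def pearson(xs: list, ys: list) -> float:
    """Pearson correlation between system scores and human scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)
```

In a real system, `word_similarity` would be replaced by a measure derived from WordNet, and the weights would be tuned so that `pearson` between system scores and human judgments is as high as possible.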
