Automatic discovery and ranking of synonyms for search keywords in the web

K. C. Srikantaiah, M. Roopa, N. K. Kumar, K. Venugopal, L. Patnaik
Journal: Int. J. Web Sci.
Published: 2015-07-18
Type: Journal Article
DOI: 10.1504/IJWS.2014.070668 (https://doi.org/10.1504/IJWS.2014.070668)
Citations: 3

Abstract

Search engines are an indispensable part of a web user's life. The vast majority of web users nevertheless run into difficulties with keyword-based search engines, such as inaccurate results for queries and irrelevant URLs being returned even though they contain the given keyword. Relevant URLs may also be missed because they contain a synonym of the keyword rather than the keyword itself. This condition is known as the polysemy problem. To alleviate these problems, we propose an algorithm called automatic discovery and ranking of synonyms for search keywords in the web (ADRS). The proposed method generates a list of candidate synonyms for each keyword by using the relevance factor of the URLs associated with the synonyms. These candidate synonyms are then ranked using co-occurrence frequencies and various page count-based measures. A major advantage of the algorithm is that it is highly scalable, which makes it applicable to online data on the dynamic, domain-independent and unstructured World Wide Web. Experimental results show that the best results are obtained when the proposed algorithm is used with WebJaccard.
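The abstract does not spell out how the page count-based ranking is computed, so the following is only a minimal sketch of how candidate synonyms could be ranked with WebJaccard. It assumes the commonly used definition of WebJaccard as the Jaccard coefficient over search-engine page counts, with small co-occurrence counts treated as noise. The function names, the threshold default and the page counts are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple


def web_jaccard(count_p: int, count_q: int, count_pq: int, threshold: int = 5) -> float:
    """Jaccard coefficient computed from page counts.

    count_p, count_q: page counts for each term queried alone.
    count_pq: page count for the conjunctive query "P AND Q".
    threshold: co-occurrence counts at or below this value are treated
               as noise and scored 0 (assumed value, not from the paper).
    """
    if count_pq <= threshold:
        return 0.0
    denominator = count_p + count_q - count_pq
    if denominator <= 0:
        return 0.0
    return count_pq / denominator


def rank_candidates(
    keyword_count: int,
    candidates: Dict[str, Tuple[int, int]],
) -> List[Tuple[str, float]]:
    """Rank candidate synonyms by WebJaccard score, highest first.

    candidates maps each candidate term to a pair
    (page count of the candidate alone, page count of keyword AND candidate).
    """
    scored = [
        (term, web_jaccard(keyword_count, alone, together))
        for term, (alone, together) in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Made-up page counts for illustration only.
    hypothetical_candidates = {
        "automobile": (400, 200),
        "vehicle": (800, 300),
        "banana": (500, 3),
    }
    for term, score in rank_candidates(1000, hypothetical_candidates):
        print(f"{term}: {score:.4f}")
```

In a real setting the page counts would come from a search engine, and this score would be only one input to the ranking alongside the co-occurrence frequencies the abstract mentions.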