Construction of discriminant model of web documents suitability as search results

Proceedings of the 18th International Conference on Information Integration and Web-based Applications and Services, Pub Date: 2016-11-28, DOI: 10.1145/3011141.3011204

Hikari Suganuma, Takamitsu Shioi, K. Hatano



Abstract

In Web search engine research, the most important challenge is to extract more information from the queries issued to search engines. These queries tend to contain only a few words, however, which makes it difficult to extract information from them. Some researchers have therefore focused on techniques, such as Web spam detection methods, that identify Web documents that do not constitute satisfactory search results. In this paper, we propose a method for constructing a discriminant model that determines whether a Web document is a suitable or unsuitable search result for a Web search engine. In contrast to current Web spam detection techniques, our method analyzes the characteristics of Web documents quantitatively and eliminates the documents estimated to be unsuitable search results. Our experimental results show that, compared with current Web spam detection techniques, our discriminant model can help improve both the effectiveness of Web search engines and the efficiency of Web document discriminators.
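The abstract does not say which document characteristics or which discriminant technique the authors use. As a rough illustration of the general idea only, the sketch below fits Fisher's linear discriminant (a stand-in, not necessarily the paper's method) on two hypothetical features, `text_ratio` and `link_density`, using made-up training values, and then drops documents estimated to be unsuitable.

```python
# Minimal sketch of a two-class linear discriminant over document features.
# Assumptions (not from the paper): the two features, the training values,
# and the use of Fisher's linear discriminant are all illustrative.

def mean(rows):
    """Component-wise mean of a list of 2-D feature vectors."""
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(2)]

def scatter(rows, m):
    """2x2 scatter matrix of rows around the mean m."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - m[0], r[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fit_fisher(suitable, unsuitable):
    """Return (w, threshold) so that w.x > threshold means 'suitable'."""
    m1, m0 = mean(suitable), mean(unsuitable)
    s1, s0 = scatter(suitable, m1), scatter(unsuitable, m0)
    # Pooled within-class scatter matrix.
    sw = [[s1[i][j] + s0[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    # Discriminant direction: w = Sw^-1 (m1 - m0).
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Decision threshold: projection of the midpoint between class means.
    mid = [(m1[0] + m0[0]) / 2, (m1[1] + m0[1]) / 2]
    threshold = w[0] * mid[0] + w[1] * mid[1]
    return w, threshold

def is_suitable(doc, w, threshold):
    """Classify one (text_ratio, link_density) vector."""
    return w[0] * doc[0] + w[1] * doc[1] > threshold

# Hypothetical labeled documents: (text_ratio, link_density).
SUITABLE = [(0.8, 0.1), (0.7, 0.2), (0.9, 0.15), (0.75, 0.05)]
UNSUITABLE = [(0.2, 0.8), (0.3, 0.7), (0.1, 0.9), (0.25, 0.6)]

w, threshold = fit_fisher(SUITABLE, UNSUITABLE)

# Filter a hypothetical result list, keeping only documents judged suitable.
results = {"doc_a": (0.85, 0.1), "doc_b": (0.15, 0.85)}
kept = [name for name, feats in results.items()
        if is_suitable(feats, w, threshold)]
```

With these illustrative values, `kept` contains only `doc_a`. A real system would use many more features and labeled documents, and would evaluate the model on held-out data.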
