Construction of discriminant model of web documents suitability as search results
Hikari Suganuma, Takamitsu Shioi, K. Hatano
Proceedings of the 18th International Conference on Information Integration and Web-based Applications and Services, 2016-11-28
DOI: 10.1145/3011141.3011204 (https://doi.org/10.1145/3011141.3011204)
Citation count: 0
Abstract
In Web search engine development, the most important challenge is to extract more information from the queries issued to search engines. However, these queries tend to contain only a few words, which makes it difficult to extract information from them. Therefore, some researchers have focused on techniques, such as Web spam detection methods, that identify Web documents that do not constitute satisfactory search results. In this paper, we propose a method for constructing a discriminant model that determines whether a Web document is a suitable or unsuitable search result for a Web search engine. In contrast to current Web spam detection techniques, our method analyzes the characteristics of Web documents quantitatively and eliminates the documents estimated to be unsuitable search results. Our experimental results show that, compared with current Web spam detection techniques, our discriminant model can help improve both the effectiveness of Web search engines and the efficiency of Web document discriminators.
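The abstract does not say which features or which discriminant function the authors use. As a rough illustration of the general idea only, the sketch below fits a two-class Fisher linear discriminant to numeric document features and uses it to filter a result list. Everything specific here is an assumption made for illustration: the two features (text-to-markup ratio and fraction of outbound links), the toy training values, and the function names `fit_fisher_lda` and `is_suitable` are hypothetical and are not taken from the paper.

```python
from statistics import mean


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def fit_fisher_lda(suitable, unsuitable):
    """Fit a two-class Fisher linear discriminant on 2-feature vectors.

    Returns (w, threshold). A document x is scored as dot(w, x) and is
    treated as suitable when its score exceeds the threshold.
    """
    def centroid(rows):
        return [mean(r[i] for r in rows) for i in range(2)]

    def scatter(rows, mu):
        # Sum of outer products of deviations from the class centroid.
        s = [[0.0, 0.0], [0.0, 0.0]]
        for r in rows:
            d = [r[0] - mu[0], r[1] - mu[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
        return s

    mu_s, mu_u = centroid(suitable), centroid(unsuitable)
    s_s, s_u = scatter(suitable, mu_s), scatter(unsuitable, mu_u)

    # Within-class scatter matrix and its 2x2 inverse.
    sw = [[s_s[i][j] + s_u[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]

    # Discriminant direction w = Sw^-1 (mu_s - mu_u).
    diff = [mu_s[0] - mu_u[0], mu_s[1] - mu_u[1]]
    w = [dot(inv[0], diff), dot(inv[1], diff)]

    # Threshold at the midpoint of the projected class centroids.
    threshold = 0.5 * (dot(w, mu_s) + dot(w, mu_u))
    return w, threshold


def is_suitable(x, w, threshold):
    """Return True when document features x fall on the 'suitable' side."""
    return dot(w, x) > threshold


# Invented toy data: (text-to-markup ratio, fraction of outbound links).
SUITABLE = [(0.80, 0.10), (0.70, 0.20), (0.90, 0.15), (0.75, 0.05)]
UNSUITABLE = [(0.20, 0.70), (0.30, 0.80), (0.10, 0.60), (0.25, 0.90)]

w, threshold = fit_fisher_lda(SUITABLE, UNSUITABLE)

# Filter a hypothetical result list, keeping only documents judged suitable.
results = {"doc_a": (0.85, 0.10), "doc_b": (0.15, 0.75)}
kept = [name for name, feats in results.items()
        if is_suitable(feats, w, threshold)]
```

In this sketch the filter runs after retrieval, so it removes documents from a result list rather than changing how queries are interpreted. A real system would need features and training labels drawn from actual judged search results.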