{"title":"一种提高Web检索准确度的新方法","authors":"V. Klyuev, V. Oleshchuk","doi":"10.1109/FUTURETECH.2010.5482671","DOIUrl":null,"url":null,"abstract":"General purpose search engines utilize a very simple view on text documents: They consider them as bags of words. It results that after indexing, the semantics of documents is lost. In this paper, we introduce a novel approach to improve the accuracy of Web retrieval. We utilize the WordNet and WordNet SenseRelate All Words Software as main tools to preserve the semantics of the sentences of documents and user queries. Nouns and verbs in the WordNet are organized in the tree hierarchies. The word meanings are presented by numbers that reference to the nodes on the semantic tree. The meaning of each word in the sentence is calculated when the sentence is analyzed. The goal is to put each noun and verb of the sentence on the right place on the tree. Taking this information into account, it is possible to solve the ambiguity problem for the query keywords and create the indicative summaries taking into account query words, and semantically related hypernyms and synonyms.","PeriodicalId":380192,"journal":{"name":"2010 5th International Conference on Future Information Technology","volume":"15 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"A Novel Approach to Improve the Accuracy of Web Retrieval\",\"authors\":\"V. Klyuev, V. Oleshchuk\",\"doi\":\"10.1109/FUTURETECH.2010.5482671\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"General purpose search engines utilize a very simple view on text documents: They consider them as bags of words. It results that after indexing, the semantics of documents is lost. In this paper, we introduce a novel approach to improve the accuracy of Web retrieval. We utilize the WordNet and WordNet SenseRelate All Words Software as main tools to preserve the semantics of the sentences of documents and user queries. Nouns and verbs in the WordNet are organized in the tree hierarchies. The word meanings are presented by numbers that reference to the nodes on the semantic tree. The meaning of each word in the sentence is calculated when the sentence is analyzed. The goal is to put each noun and verb of the sentence on the right place on the tree. Taking this information into account, it is possible to solve the ambiguity problem for the query keywords and create the indicative summaries taking into account query words, and semantically related hypernyms and synonyms.\",\"PeriodicalId\":380192,\"journal\":{\"name\":\"2010 5th International Conference on Future Information Technology\",\"volume\":\"15 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2010-05-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2010 5th International Conference on Future Information Technology\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/FUTURETECH.2010.5482671\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2010 5th International Conference on Future Information Technology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/FUTURETECH.2010.5482671","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2
摘要
通用搜索引擎对文本文档使用一种非常简单的视图:它们将文本文档视为单词包。这导致在索引之后,文档的语义丢失。本文介绍了一种提高Web检索精度的新方法。我们利用WordNet和WordNet SenseRelate All Words软件作为主要工具来保存文档句子和用户查询的语义。WordNet中的名词和动词以树状层次结构组织。单词的含义由数字表示,这些数字引用语义树上的节点。分析句子时,计算出句子中每个单词的意思。目标是把句子的每个名词和动词放在树的正确位置上。考虑到这些信息,就有可能解决查询关键字的歧义问题,并创建考虑查询词以及语义相关的上义词和同义词的指示性摘要。
A Novel Approach to Improve the Accuracy of Web Retrieval
General purpose search engines utilize a very simple view on text documents: They consider them as bags of words. It results that after indexing, the semantics of documents is lost. In this paper, we introduce a novel approach to improve the accuracy of Web retrieval. We utilize the WordNet and WordNet SenseRelate All Words Software as main tools to preserve the semantics of the sentences of documents and user queries. Nouns and verbs in the WordNet are organized in the tree hierarchies. The word meanings are presented by numbers that reference to the nodes on the semantic tree. The meaning of each word in the sentence is calculated when the sentence is analyzed. The goal is to put each noun and verb of the sentence on the right place on the tree. Taking this information into account, it is possible to solve the ambiguity problem for the query keywords and create the indicative summaries taking into account query words, and semantically related hypernyms and synonyms.