An extended fuzzy linguistic approach to generalize Boolean information retrieval

Information Sciences - Applications. Pub Date: 1994-11-01. DOI: 10.1016/1069-0115(94)90032-9

Donald H. Kraft, Gloria Bordogna, Gabriella Pasi

Information Sciences - Applications, Volume 2, Issue 3, Pages 119-134. Journal Article.

Citations: 97

Abstract

The generalization of Boolean information retrieval systems is still of interest to scholars. In spite of the fact that commercial systems use Boolean retrieval mechanisms, such systems still have some limitations. One of the main problems is that such systems lack the ability to deal well with imprecision and subjectivity. Previous efforts have led to the introduction of numeric weights to improve both document representations (term weights) and query languages (query weights). However, the use of weights requires a clear knowledge of the semantics of the query in order to translate a fuzzy concept into a precise numeric value. Moreover, it is difficult to model the matching of queries to documents in a way that will preserve the semantics of user queries.

A linguistic extension has been generated, starting from an existing Boolean weighted retrieval model and formalized within fuzzy set theory, in which numeric query weights are replaced by linguistic descriptors that specify the degree of importance of the terms.

In the past, query weights were seen as measures of the importance of a specific term in representing the query or as a threshold to aid in matching a specific document to the query. The linguistic extension was originally modeled to view the query weights as a description of the ideal document, so that deviations would be rejected whether a given document had term weights that were too high or too low. This paper looks at an extension to the linguistic model that is not symmetric in that documents with a term weight below the query weight are treated differently than documents with a term weight above the query weight.
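The contrast between the symmetric "ideal document" reading and the asymmetric extension can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the descriptor labels and their numeric values in `DESCRIPTORS`, the linear matching functions, the `over_penalty` parameter, and the use of `min` as the fuzzy AND. The paper's actual membership functions may differ.

```python
from typing import Callable, Dict

# Hypothetical mapping from linguistic descriptors to query weights in [0, 1].
# The labels and values are illustrative assumptions, not the paper's definitions.
DESCRIPTORS: Dict[str, float] = {
    "very important": 1.0,
    "important": 0.75,
    "fairly important": 0.5,
}


def symmetric_match(term_weight: float, query_weight: float) -> float:
    """Ideal-document reading: the score drops equally whether the
    document's term weight is above or below the query weight."""
    return 1.0 - abs(term_weight - query_weight)


def asymmetric_match(term_weight: float, query_weight: float,
                     over_penalty: float = 0.25) -> float:
    """Assumed asymmetric reading: a term weight below the query weight is
    penalized fully, one above it only by the fraction `over_penalty`."""
    diff = term_weight - query_weight
    if diff >= 0:
        return 1.0 - over_penalty * diff
    return 1.0 + diff  # diff is negative here, so this lowers the score


def score_and(doc: Dict[str, float], query: Dict[str, str],
              match: Callable[[float, float], float]) -> float:
    """Score a conjunctive query, taking the fuzzy AND as the minimum
    of the per-term matching degrees."""
    return min(
        match(doc.get(term, 0.0), DESCRIPTORS[label])
        for term, label in query.items()
    )


doc = {"fuzzy": 0.9, "retrieval": 0.4}
query = {"fuzzy": "important", "retrieval": "fairly important"}
sym_score = score_and(doc, query, symmetric_match)
asym_score = score_and(doc, query, asymmetric_match)
```

In this sketch the document exceeds the query weight for "fuzzy", so the symmetric function penalizes it for that term while the asymmetric one barely does; the two functions agree on "retrieval", where the document falls short of the query weight.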

