An extended fuzzy linguistic approach to generalize Boolean information retrieval

Donald H. Kraft, Gloria Bordogna, Gabriella Pasi
Information Sciences - Applications, Volume 2, Issue 3, Pages 119-134. Published 1994-11-01. DOI: 10.1016/1069-0115(94)90032-9
https://www.sciencedirect.com/science/article/pii/1069011594900329
Citations: 97

Abstract

The generalization of Boolean information retrieval systems is still of interest to scholars. In spite of the fact that commercial systems use Boolean retrieval mechanisms, such systems still have some limitations. One of the main problems is that such systems lack the ability to deal well with imprecision and subjectivity. Previous efforts have led to the introduction of numeric weights to improve both document representations (term weights) and query languages (query weights). However, the use of weights requires a clear knowledge of the semantics of the query in order to translate a fuzzy concept into a precise numeric value. Moreover, it is difficult to model the matching of queries to documents in a way that will preserve the semantics of user queries.

A linguistic extension has been generated, starting from an existing Boolean weighted retrieval model and formalized within fuzzy set theory, in which numeric query weights are replaced by linguistic descriptors that specify the degree of importance of the terms.
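
As a rough illustration of replacing a numeric query weight with a linguistic descriptor, the sketch below treats each descriptor as a fuzzy set over document term weights in [0, 1]. The labels ("low", "medium", "high"), the trapezoidal shape, and the breakpoints are all assumptions chosen for illustration; the abstract does not specify the descriptors or their membership functions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trapezoid:
    """Trapezoidal fuzzy set with support [a, d] and core [b, c]."""
    a: float
    b: float
    c: float
    d: float

    def membership(self, x: float) -> float:
        # Outside the support the membership degree is zero.
        if x < self.a or x > self.d:
            return 0.0
        # Inside the core the membership degree is one.
        if self.b <= x <= self.c:
            return 1.0
        # Rising edge between a and b.
        if x < self.b:
            return (x - self.a) / (self.b - self.a)
        # Falling edge between c and d.
        return (self.d - x) / (self.d - self.c)


# Hypothetical descriptors of term importance (not taken from the paper).
DESCRIPTORS = {
    "low": Trapezoid(0.0, 0.0, 0.2, 0.4),
    "medium": Trapezoid(0.2, 0.4, 0.6, 0.8),
    "high": Trapezoid(0.6, 0.8, 1.0, 1.0),
}


def degree(label: str, term_weight: float) -> float:
    """Degree to which a document's term weight satisfies the descriptor."""
    return DESCRIPTORS[label].membership(term_weight)
```

With this kind of mapping, a user writes a query term qualified by a label such as "high" instead of choosing a number such as 0.85, and the system evaluates each document's term weight against the label's fuzzy set.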

In the past, query weights were seen as measures of the importance of a specific term in representing the query or as a threshold to aid in matching a specific document to the query. The linguistic extension was originally modeled to view the query weights as a description of the ideal document, so that deviations would be rejected whether a given document had term weights that were too high or too low. This paper looks at an extension to the linguistic model that is not symmetric in that documents with a term weight below the query weight are treated differently than documents with a term weight above the query weight.
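
The contrast between the symmetric "ideal document" reading and the asymmetric extension can be sketched as follows. This is a minimal illustration, not the paper's matching function: the linear decay, the tolerance parameters, and the choice to penalize term weights below the query weight more heavily than those above it are all assumptions.

```python
def symmetric_match(term_weight: float, query_weight: float,
                    tolerance: float = 0.5) -> float:
    """Ideal-document reading: deviations in either direction are
    penalized equally (hypothetical linear decay)."""
    deviation = abs(term_weight - query_weight)
    return max(0.0, 1.0 - deviation / tolerance)


def asymmetric_match(term_weight: float, query_weight: float,
                     tol_below: float = 0.25,
                     tol_above: float = 1.0) -> float:
    """Asymmetric reading: a term weight below the query weight is
    treated differently from one above it. Here, as an assumption,
    undershooting is penalized more steeply than overshooting."""
    if term_weight < query_weight:
        return max(0.0, 1.0 - (query_weight - term_weight) / tol_below)
    return max(0.0, 1.0 - (term_weight - query_weight) / tol_above)
```

For a query weight of 0.5, `symmetric_match` scores documents with term weights 0.3 and 0.7 the same, whereas `asymmetric_match` scores the document at 0.7 higher than the one at 0.3.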
