Fuzzy document retrieval for Portuguese language
B. H. Storb, R. Wazlawick
18th International Conference of the North American Fuzzy Information Processing Society - NAFIPS (Cat. No.99TH8397), 1999-06-10
DOI: 10.1109/NAFIPS.1999.781720 (https://doi.org/10.1109/NAFIPS.1999.781720)
Citations: 3
Abstract
This paper reports a document-retrieval model for the Portuguese language, developed from the Miyamoto document-retrieval model. The Miyamoto model detects semantic similarity between descriptors from their co-occurrences. The proposed model can be considered an extension of the Miyamoto model because it also takes lexical similarity and expression similarity into account. Accordingly, descriptors and queries are treated as expressions, i.e. sequences of words and connectors (prepositions, etc.). Similarity between words is determined by comparing their possible radicals (word stems), so that words with identical or similar meanings can be detected. Expression similarity is determined by comparing words and connectors, using an adaptation of a model by Bruza and van der Weide (1991). The proposed extension covers two things: the construction of a fuzzy thesaurus through a fuzzy index, achieved by means of lexical similarities between descriptors; and the possibility of using a non-controlled vocabulary, achieved by determining similarities between document descriptors and query expressions. A sample document base was created for the comparison between the models. The results show that the proposed model is useful for document retrieval in Portuguese.
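The abstract gives no formulas, so the following is only a rough sketch of the two kinds of similarity it mentions, not the authors' method. It assumes a Jaccard-style co-occurrence measure for the semantic part, uses a shared-prefix ratio as a crude stand-in for the comparison of radicals, and combines the two with a fuzzy OR (maximum). All function names and the toy index are hypothetical.

```python
from os.path import commonprefix


def cooccurrence_similarity(docs_a: set, docs_b: set) -> float:
    """Jaccard-style similarity between two descriptors (assumed measure).

    docs_a and docs_b are the sets of document ids indexed by each descriptor.
    """
    union = docs_a | docs_b
    if not union:
        return 0.0
    return len(docs_a & docs_b) / len(union)


def lexical_similarity(word_a: str, word_b: str) -> float:
    """Shared-prefix ratio, a crude stand-in for comparing radicals."""
    a, b = word_a.lower(), word_b.lower()
    if not a or not b:
        return 0.0
    shared = len(commonprefix([a, b]))
    return shared / max(len(a), len(b))


def descriptor_similarity(word_a: str, word_b: str, index: dict) -> float:
    """Combine both signals with a fuzzy OR (max); an assumption, see above."""
    semantic = cooccurrence_similarity(
        index.get(word_a, set()), index.get(word_b, set())
    )
    return max(semantic, lexical_similarity(word_a, word_b))


# Hypothetical toy index: descriptor -> ids of the documents it describes.
index = {
    "documento": {1, 2, 3},
    "documentos": {2, 3},
    "recuperação": {1, 2, 3, 4},
}

print(descriptor_similarity("documento", "documentos", index))   # 0.9
print(descriptor_similarity("documento", "recuperação", index))  # 0.75
```

In this toy example the singular and plural forms score highly through the lexical term even though they co-occur in only two of three documents, while the unrelated word pair is linked only through co-occurrence. A real system for Portuguese would need proper handling of suffixes and connectors, which the prefix ratio does not attempt.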