Effectiveness of Rich Document Representation in XML Retrieval

F. Raja, Mostafa Keikha, M. Rahgozar, F. Oroumchian
RIAO Conference, published 2007-05-30
DOI: 10.5555/1931390.1931413 (https://doi.org/10.5555/1931390.1931413)
Citations: 2

Abstract

Information Retrieval (IR) systems are built with different goals in mind. Some target high precision, that is, placing more relevant documents on the first page of results. Others target high recall, that is, finding as many references as possible. In this paper we present a document representation method called RDR for building XML retrieval engines with high specificity, that is, engines that find more relevant documents that are mostly about the query topic. The Rich Document Representation (RDR) represents the content of a document with logical terms and statements. The conjecture is that, since RDR is a better representation of the document content, it will produce higher precision. In our implementation, we used the Vector Space model to compute the similarity between XML elements and queries. Our experiments were conducted on the INEX 2004 test collection. The results indicate that using richer features such as logical terms or statements for XML retrieval tends to produce more focused retrieval. RDR is therefore a suitable document representation when users need only a few, more specific references and are more interested in precision than recall.
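The abstract states that similarity between XML elements and queries is computed with the Vector Space model, but it gives no weighting details. As a minimal sketch only, assuming plain single-term TF-IDF weights and cosine similarity (not the RDR logical terms or statements, whose extraction is not described here), the ranking step might look like the following. The function names, element paths, and sample texts are hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tfidf_vectors(texts):
    """Return one sparse TF-IDF vector (term -> weight) per text, plus the IDF table."""
    token_lists = [tokenize(t) for t in texts]
    n = len(token_lists)
    # Document frequency: number of texts that contain each term.
    df = Counter(term for tokens in token_lists for term in set(tokens))
    # Smoothed IDF, so terms present in every text keep a positive weight.
    idf = {term: math.log((1 + n) / (1 + d)) + 1.0 for term, d in df.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for tokens in token_lists
    ]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_elements(elements, query):
    """Rank XML elements (hypothetical path -> text) by similarity to the query."""
    paths = list(elements)
    vectors, idf = tfidf_vectors([elements[p] for p in paths])
    # Query terms that never occur in the collection are ignored.
    q_vec = {
        term: tf * idf[term]
        for term, tf in Counter(tokenize(query)).items()
        if term in idf
    }
    scored = [(cosine(q_vec, vec), path) for path, vec in zip(paths, vectors)]
    return sorted(scored, reverse=True)


# Hypothetical elements: a section, a paragraph nested in it, and an unrelated section.
elements = {
    "/article[1]/sec[1]": "xml retrieval with vector space model",
    "/article[1]/sec[1]/p[1]": "vector space model",
    "/article[1]/sec[2]": "cooking pasta at home",
}
ranking = rank_elements(elements, "vector space model")
for score, path in ranking:
    print(f"{score:.3f}  {path}")
```

In this toy collection the short paragraph that is entirely about the query outranks its longer parent section, which loosely mirrors the preference for focused, specific elements that the abstract describes. A richer representation such as RDR would replace the single-term vectors with other features; that part is not sketched here.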