A Survey on XML Focussed Component Retrieval

K. Pinel-Sauvagnat, M. Boughanem
RIAO Conference, published 2007-05-30. DOI: 10.5555/1931390.1931430 (https://doi.org/10.5555/1931390.1931430)
Citations: 1

Abstract

Focussed XML component retrieval is one of the most important challenges in the XML IR field. The aim of the focussed retrieval strategy is to find the most exhaustive and specific element in a path, i.e. to retrieve elements that focus on the user need, without nested elements. In this paper, we introduce a relevance propagation method for focussed XML component retrieval. We carry out extensive experiments with the INEX 2005 test suite to identify the main characteristics of relevant elements in focussed retrieval and to compare them with those of relevant elements in thorough retrieval (where the aim is to find all relevant elements in the collection). Our main findings are the following. First, a term weighting scheme that takes into account the importance of terms within elements, within the collection of elements, and within the collection of documents is useful. Second, introducing component length, either as a threshold on results or in a weighted propagation function, significantly improves the results. Third, contextual relevance does not seem to be useful, which contradicts results obtained by state-of-the-art methods for non-focussed retrieval. Finally, the use of structural hints improves performance by up to 50% compared with the results we obtained using queries composed only of simple keyword terms.
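
The "without nested elements" constraint of focussed retrieval can be illustrated with a small sketch. It assumes that elements are identified by XPath-like paths and have already been scored by some ranking function. The greedy best-first filter, the function names, and the sample scores are illustrative assumptions; this is not the relevance propagation method introduced in the paper.

```python
def is_ancestor(a, b):
    """Return True if path a is a strict ancestor of path b."""
    # Appending "/" avoids treating "sec[1]" as an ancestor of "sec[10]".
    return b.startswith(a + "/")


def overlaps(a, b):
    """Two elements overlap if they are equal or one is nested in the other."""
    return a == b or is_ancestor(a, b) or is_ancestor(b, a)


def focussed_filter(scored):
    """Keep the best-scored elements such that no two kept elements are nested.

    scored: list of (path, score) pairs (hypothetical input format).
    Returns the kept pairs, ordered by decreasing score.
    """
    kept = []
    for path, score in sorted(scored, key=lambda ps: ps[1], reverse=True):
        if not any(overlaps(path, kept_path) for kept_path, _ in kept):
            kept.append((path, score))
    return kept


# Made-up scores, for illustration only.
results = [
    ("/article[1]", 0.4),
    ("/article[1]/sec[1]", 0.9),
    ("/article[1]/sec[1]/p[2]", 0.7),
    ("/article[1]/sec[2]", 0.5),
    ("/article[2]/sec[1]/p[1]", 0.6),
]

for path, score in focussed_filter(results):
    print(path, score)
```

In this example the paragraph nested in `/article[1]/sec[1]` and the enclosing `/article[1]` are both dropped because a better-scored element on the same path is already kept. A thorough retrieval run would instead return all five elements.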