A Survey on XML Focussed Component Retrieval
K. Pinel-Sauvagnat, M. Boughanem
RIAO Conference, 2007-05-30
DOI: 10.5555/1931390.1931430 (https://doi.org/10.5555/1931390.1931430)
Citation count: 1
Abstract
Focussed XML component retrieval is one of the most important challenges in the XML IR field. The aim of the focussed retrieval strategy is to find the most exhaustive and specific element in a path, i.e. to retrieve elements that focus on the user need, without nested elements. In this paper, we introduce a relevance propagation method for focussed XML component retrieval. We carry out extensive experiments with the INEX 2005 test suite to identify the main characteristics of relevant elements in focussed retrieval and to compare them with those of relevant elements in thorough retrieval (where the aim is to find all relevant elements in the collection). Our main findings are the following. First, a term weighting scheme that takes into account the importance of terms within elements, as well as in both the collection of elements and the collection of documents, is useful. Second, introducing component length, either as a threshold on results or in a weighted propagation function, significantly improves the results. Third, contextual relevance does not seem to be useful, which contradicts results obtained by state-of-the-art methods for non-focussed retrieval. Finally, the use of structural hints improves performance by up to 50% compared with queries composed only of simple keyword terms.
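To make the "without nested elements" constraint concrete, the following is a minimal sketch of one common way to turn a thorough ranking into a focussed one: walk the ranking by decreasing score and keep an element only if no ancestor or descendant has already been kept. It is an illustration under stated assumptions, not the relevance propagation method of the paper. The function names, the XPath-like path strings, and the scores are hypothetical.

```python
from typing import List, Tuple

# An element is identified by an XPath-like path and carries a relevance score.
# Both the path format and the scores below are illustrative assumptions.
Element = Tuple[str, float]


def is_nested(a: str, b: str) -> bool:
    """Return True if the paths are equal or one is an ancestor of the other."""
    return a == b or a.startswith(b + "/") or b.startswith(a + "/")


def remove_overlap(ranked: List[Element]) -> List[Element]:
    """Keep, on each path, only the highest-scoring element (no nesting)."""
    kept: List[Element] = []
    # Visit elements from highest to lowest score.
    for path, score in sorted(ranked, key=lambda e: e[1], reverse=True):
        # Skip the element if it overlaps with one that was already kept.
        if not any(is_nested(path, kept_path) for kept_path, _ in kept):
            kept.append((path, score))
    return kept


# A small, made-up thorough result list.
results = [
    ("/article[1]", 0.40),
    ("/article[1]/sec[2]", 0.75),
    ("/article[1]/sec[2]/p[1]", 0.60),
    ("/article[1]/sec[3]", 0.55),
    ("/article[2]/sec[1]/p[4]", 0.30),
]
focussed = remove_overlap(results)
for path, score in focussed:
    print(f"{score:.2f}  {path}")
```

In this example the paragraph `/article[1]/sec[2]/p[1]` and the whole `/article[1]` are dropped because they overlap with the higher-scoring `/article[1]/sec[2]`, while `/article[1]/sec[3]` survives because it is a sibling rather than an ancestor or descendant.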