结合语言索引提高信息检索系统性能:一种基于机器学习的解决方案

RIAO Conference Pub Date : 2007-05-30 DOI:10.5555/1931390.1931427

Fabienne Moreau, V. Claveau, P. Sébillot

{"title":"结合语言索引提高信息检索系统性能:一种基于机器学习的解决方案","authors":"Fabienne Moreau, V. Claveau, P. Sébillot","doi":"10.5555/1931390.1931427","DOIUrl":null,"url":null,"abstract":"Taking into account in one same information retrieval system several linguistic indexes encoding morphological, syntactic, and semantic information seems a good idea to better grasp the semantic contents of large unstructured text collections and thus to increase performances of such a system. Therefore the problem raised is of knowing how to automatically and efficiently combine those different information in order to optimize their exploitations. To this end, we propose an original machine learning based method that is able to determine relevant documents in a collection for a given query, from their positions within the result lists obtained from each individual linguistic index, while automatically adapting its behavior to the characteristics of the query. The different experiments that are presented here prove the interest of our fusion method that merges the result lists, which offers more balanced precision-recall compromises and consequently obtains more stable results than those got by the better individual index.","PeriodicalId":120472,"journal":{"name":"RIAO Conference","volume":"85 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2007-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Combining linguistic indexes to improve the performances of information retrieval systems: a machine learning based solution\",\"authors\":\"Fabienne Moreau, V. Claveau, P. Sébillot\",\"doi\":\"10.5555/1931390.1931427\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Taking into account in one same information retrieval system several linguistic indexes encoding morphological, syntactic, and semantic information seems a good idea to better grasp the semantic contents of large unstructured text collections and thus to increase performances of such a system. Therefore the problem raised is of knowing how to automatically and efficiently combine those different information in order to optimize their exploitations. To this end, we propose an original machine learning based method that is able to determine relevant documents in a collection for a given query, from their positions within the result lists obtained from each individual linguistic index, while automatically adapting its behavior to the characteristics of the query. The different experiments that are presented here prove the interest of our fusion method that merges the result lists, which offers more balanced precision-recall compromises and consequently obtains more stable results than those got by the better individual index.\",\"PeriodicalId\":120472,\"journal\":{\"name\":\"RIAO Conference\",\"volume\":\"85 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2007-05-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"RIAO Conference\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.5555/1931390.1931427\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"RIAO Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5555/1931390.1931427","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 2

摘要

在同一个信息检索系统中考虑编码形态、句法和语义信息的几种语言索引，这似乎是一个好主意，可以更好地掌握大型非结构化文本集合的语义内容，从而提高系统的性能。因此，如何将这些不同的信息自动有效地组合起来，以优化它们的利用就成为一个问题。为此，我们提出了一种基于原始机器学习的方法，该方法能够从每个单独的语言索引获得的结果列表中的位置确定给定查询集合中的相关文档，同时自动调整其行为以适应查询的特征。本文给出的不同的实验证明了我们的融合方法的好处，即合并结果列表，提供了更平衡的查准率和查全率妥协，从而获得比更好的单个指标更稳定的结果。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Combining linguistic indexes to improve the performances of information retrieval systems: a machine learning based solution

Taking into account in one same information retrieval system several linguistic indexes encoding morphological, syntactic, and semantic information seems a good idea to better grasp the semantic contents of large unstructured text collections and thus to increase performances of such a system. Therefore the problem raised is of knowing how to automatically and efficiently combine those different information in order to optimize their exploitations. To this end, we propose an original machine learning based method that is able to determine relevant documents in a collection for a given query, from their positions within the result lists obtained from each individual linguistic index, while automatically adapting its behavior to the characteristics of the query. The different experiments that are presented here prove the interest of our fusion method that merges the result lists, which offers more balanced precision-recall compromises and consequently obtains more stable results than those got by the better individual index.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

RIAO Conference

自引率

0.00%

发文量