Combining linguistic indexes to improve the performances of information retrieval systems: a machine learning based solution

Fabienne Moreau, V. Claveau, P. Sébillot
RIAO Conference, 2007-05-30
DOI: 10.5555/1931390.1931427 (https://doi.org/10.5555/1931390.1931427)
Citations: 2

Abstract

Combining several linguistic indexes that encode morphological, syntactic, and semantic information within a single information retrieval system seems a promising way to better capture the semantic content of large unstructured text collections, and thus to improve the system's performance. The problem is then how to combine these different kinds of information automatically and efficiently so as to make the best use of them. To this end, we propose an original machine-learning-based method that identifies the documents in a collection that are relevant to a given query from their positions in the result lists returned by each individual linguistic index, while automatically adapting its behavior to the characteristics of the query. The experiments presented here show the value of our fusion method, which merges the result lists: it offers more balanced precision-recall trade-offs and consequently obtains more stable results than the best individual index.
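
The idea of learning relevance from a document's positions in several result lists can be illustrated with a small sketch. The abstract does not say which learner or features the authors use, so everything below is an assumption made for illustration: reciprocal-rank features, a logistic regression trained by plain gradient descent, the function names, and the toy document ids. The query-dependent adaptation mentioned in the abstract is not modeled here.

```python
import math

def rank_features(doc, result_lists):
    """One feature per index: reciprocal rank of `doc`, or 0.0 if it is absent."""
    return [1.0 / (lst.index(doc) + 1) if doc in lst else 0.0
            for lst in result_lists]

def train(samples, labels, lr=0.5, epochs=200):
    """Fit a logistic regression with stochastic gradient descent (assumed learner)."""
    weights, bias = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            grad = 1.0 / (1.0 + math.exp(-z)) - y  # predicted probability minus label
            weights = [w - lr * grad * xi for w, xi in zip(weights, x)]
            bias -= lr * grad
    return weights, bias

def fuse(result_lists, weights, bias):
    """Merge the per-index lists into one ranking, best learned score first."""
    def score(doc):
        feats = rank_features(doc, result_lists)
        return bias + sum(w * f for w, f in zip(weights, feats))
    docs = set().union(*result_lists)
    return sorted(docs, key=lambda d: (-score(d), d))  # ties broken by doc id

# Hypothetical training query: one result list per linguistic index
# (morphological, syntactic, semantic) plus relevance judgments.
train_lists = [["d1", "d2", "d3", "d4"],
               ["d2", "d1", "d4", "d3"],
               ["d2", "d1", "d3", "d4"]]
relevant = {"d1", "d2"}
docs = sorted(set().union(*train_lists))
samples = [rank_features(d, train_lists) for d in docs]
labels = [1.0 if d in relevant else 0.0 for d in docs]
weights, bias = train(samples, labels)

# Fuse the lists returned for a new (hypothetical) query.
new_lists = [["a", "b", "c"], ["a", "c", "b"], ["a", "b", "c"]]
print(fuse(new_lists, weights, bias))
```

A learned combination of this kind differs from fixed fusion rules in that the weight given to each index comes from judged training queries rather than being set by hand.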