Neural Text Embeddings for Information Retrieval

Bhaskar Mitra, Nick Craswell
{"title":"Neural Text Embeddings for Information Retrieval","authors":"Bhaskar Mitra, Nick Craswell","doi":"10.1145/3018661.3022755","DOIUrl":null,"url":null,"abstract":"In the last few years, neural representation learning approaches have achieved very good performance on many natural language processing tasks, such as language modelling and machine translation. This suggests that neural models will also achieve good performance on information retrieval (IR) tasks, such as relevance ranking, addressing the query-document vocabulary mismatch problem by using a semantic rather than lexical matching. Although initial iterations of neural models do not outperform traditional lexical-matching baselines, the level of interest and effort in this area is increasing, potentially leading to a breakthrough. The popularity of the recent SIGIR 2016 workshop on Neural Information Retrieval provides evidence to the growing interest in neural models for IR. While recent tutorials have covered some aspects of deep learning for retrieval tasks, there is a significant scope for organizing a tutorial that focuses on the fundamentals of representation learning for text retrieval. The goal of this tutorial will be to introduce state-of-the-art neural embedding models and bridge the gap between these neural models with early representation learning approaches in IR (e.g., LSA). We will discuss some of the key challenges and insights in making these models work in practice, and demonstrate one of the toolsets available to researchers interested in this area.","PeriodicalId":344017,"journal":{"name":"Proceedings of the Tenth ACM International Conference on Web Search and Data Mining","volume":"40 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"61","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Tenth ACM International Conference on Web Search and Data Mining","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3018661.3022755","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 61

Abstract

In the last few years, neural representation learning approaches have achieved very good performance on many natural language processing tasks, such as language modelling and machine translation. This suggests that neural models will also achieve good performance on information retrieval (IR) tasks, such as relevance ranking, addressing the query-document vocabulary mismatch problem by using a semantic rather than lexical matching. Although initial iterations of neural models do not outperform traditional lexical-matching baselines, the level of interest and effort in this area is increasing, potentially leading to a breakthrough. The popularity of the recent SIGIR 2016 workshop on Neural Information Retrieval provides evidence to the growing interest in neural models for IR. While recent tutorials have covered some aspects of deep learning for retrieval tasks, there is a significant scope for organizing a tutorial that focuses on the fundamentals of representation learning for text retrieval. The goal of this tutorial will be to introduce state-of-the-art neural embedding models and bridge the gap between these neural models with early representation learning approaches in IR (e.g., LSA). We will discuss some of the key challenges and insights in making these models work in practice, and demonstrate one of the toolsets available to researchers interested in this area.
用于信息检索的神经文本嵌入
在过去的几年里,神经表示学习方法在许多自然语言处理任务上取得了很好的表现,例如语言建模和机器翻译。这表明,神经模型在信息检索(IR)任务上也能取得良好的性能,比如相关度排序,通过使用语义匹配而不是词汇匹配来解决查询文档词汇不匹配问题。尽管神经模型的初始迭代并没有超越传统的词汇匹配基线,但人们对这一领域的兴趣和努力正在增加,这可能会带来突破。最近的SIGIR 2016神经信息检索研讨会的流行提供了对IR神经模型日益增长的兴趣的证据。虽然最近的教程已经涵盖了检索任务的深度学习的一些方面,但是组织一个专注于文本检索的表示学习基础的教程还有很大的空间。本教程的目标将是介绍最先进的神经嵌入模型,并弥合这些神经模型与早期表征学习方法在IR(例如,LSA)之间的差距。我们将讨论使这些模型在实践中工作的一些关键挑战和见解,并展示对该领域感兴趣的研究人员可用的工具集之一。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信