A Short Survey of Pre-trained Language Models for Conversational AI-A New Age in NLP

Munazza Zaib, Quan Z. Sheng, W. Zhang
{"title":"A Short Survey of Pre-trained Language Models for Conversational AI-A New Age in NLP","authors":"Munazza Zaib, Quan Z. Sheng, W. Zhang","doi":"10.1145/3373017.3373028","DOIUrl":null,"url":null,"abstract":"Building a dialogue system that can communicate naturally with humans is a challenging yet interesting problem of agent-based computing. The rapid growth in this area is usually hindered by the long-standing problem of data scarcity as these systems are expected to learn syntax, grammar, decision making, and reasoning from insufficient amounts of task-specific dataset. The recently introduced pre-trained language models have the potential to address the issue of data scarcity and bring considerable advantages by generating contextualized word embeddings. These models are considered counterpart of ImageNet in NLP and have demonstrated to capture different facets of language such as hierarchical relations, long-term dependency, and sentiment. In this short survey paper, we discuss the recent progress made in the field of pre-trained language models. We also deliberate that how the strengths of these language models can be leveraged in designing more engaging and more eloquent conversational agents. This paper, therefore, intends to establish whether these pre-trained models can overcome the challenges pertinent to dialogue systems, and how their architecture could be exploited in order to overcome these challenges. Open challenges in the field of dialogue systems have also been deliberated.","PeriodicalId":297760,"journal":{"name":"Proceedings of the Australasian Computer Science Week Multiconference","volume":"25 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"50","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Australasian Computer Science Week Multiconference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3373017.3373028","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 50

Abstract

Building a dialogue system that can communicate naturally with humans is a challenging yet interesting problem of agent-based computing. Rapid growth in this area is usually hindered by the long-standing problem of data scarcity, as these systems are expected to learn syntax, grammar, decision making, and reasoning from insufficient amounts of task-specific data. The recently introduced pre-trained language models have the potential to address the issue of data scarcity and bring considerable advantages by generating contextualized word embeddings. These models are considered the counterpart of ImageNet in NLP and have been shown to capture different facets of language such as hierarchical relations, long-term dependencies, and sentiment. In this short survey paper, we discuss the recent progress made in the field of pre-trained language models. We also deliberate on how the strengths of these language models can be leveraged in designing more engaging and more eloquent conversational agents. This paper, therefore, intends to establish whether these pre-trained models can overcome the challenges pertinent to dialogue systems, and how their architecture could be exploited in order to overcome them. Open challenges in the field of dialogue systems are also deliberated.
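As a hedged illustration of the "contextualized word embeddings" the abstract refers to (this sketch is not from the paper; it assumes the Hugging Face transformers library and the public bert-base-uncased checkpoint), the snippet below shows how a pre-trained language model assigns different vectors to the same surface word depending on its sentence context, unlike static embeddings such as word2vec.

```python
# Minimal sketch: contextualized embeddings from a pre-trained language model.
# Assumes `transformers` and `torch` are installed and the bert-base-uncased
# checkpoint is available; identifiers here are illustrative, not from the paper.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

sentences = [
    "She deposited the cheque at the bank.",   # financial sense of "bank"
    "They had a picnic on the river bank.",    # geographic sense of "bank"
]

embeddings = []
for text in sentences:
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]   # (seq_len, hidden_size)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    embeddings.append(hidden[tokens.index("bank")])     # vector for "bank" in this context

# The two vectors differ because each token's representation is conditioned
# on the whole sentence; a static embedding would be identical in both cases.
similarity = torch.cosine_similarity(embeddings[0], embeddings[1], dim=0)
print(f"Cosine similarity between the two 'bank' vectors: {similarity.item():.3f}")
```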