Learning Sentence-Level Representations with Predictive Coding

Impact Factor: 4.0 · JCR Q2 (Computer Science, Artificial Intelligence)
Vladimir Araujo, M. Moens, Álvaro Soto
{"title":"Learning Sentence-Level Representations with Predictive Coding","authors":"Vladimir Araujo, M. Moens, Álvaro Soto","doi":"10.3390/make5010005","DOIUrl":null,"url":null,"abstract":"Learning sentence representations is an essential and challenging topic in the deep learning and natural language processing communities. Recent methods pre-train big models on a massive text corpus, focusing mainly on learning the representation of contextualized words. As a result, these models cannot generate informative sentence embeddings since they do not explicitly exploit the structure and discourse relationships existing in contiguous sentences. Drawing inspiration from human language processing, this work explores how to improve sentence-level representations of pre-trained models by borrowing ideas from predictive coding theory. Specifically, we extend BERT-style models with bottom-up and top-down computation to predict future sentences in latent space at each intermediate layer in the networks. We conduct extensive experimentation with various benchmarks for the English and Spanish languages, designed to assess sentence- and discourse-level representations and pragmatics-focused assessments. Our results show that our approach improves sentence representations consistently for both languages. Furthermore, the experiments also indicate that our models capture discourse and pragmatics knowledge. In addition, to validate the proposed method, we carried out an ablation study and a qualitative study with which we verified that the predictive mechanism helps to improve the quality of the representations.","PeriodicalId":93033,"journal":{"name":"Machine learning and knowledge extraction","volume":"12 1","pages":"59-77"},"PeriodicalIF":4.0000,"publicationDate":"2023-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Machine learning and knowledge extraction","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/make5010005","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 1

Abstract

Learning sentence representations is an essential and challenging topic in the deep learning and natural language processing communities. Recent methods pre-train large models on massive text corpora, focusing mainly on learning representations of contextualized words. As a result, these models cannot generate informative sentence embeddings, since they do not explicitly exploit the structure and discourse relationships that exist between contiguous sentences. Drawing inspiration from human language processing, this work explores how to improve the sentence-level representations of pre-trained models by borrowing ideas from predictive coding theory. Specifically, we extend BERT-style models with bottom-up and top-down computation to predict future sentences in latent space at each intermediate layer of the network. We conduct extensive experiments on a variety of English and Spanish benchmarks designed to assess sentence- and discourse-level representations as well as pragmatics-focused knowledge. Our results show that our approach consistently improves sentence representations for both languages. Furthermore, the experiments indicate that our models capture discourse and pragmatics knowledge. Finally, to validate the proposed method, we carried out an ablation study and a qualitative study, which verified that the predictive mechanism helps to improve the quality of the representations.
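The abstract describes the predictive mechanism only at a high level. The sketch below illustrates one plausible reading of it, assuming a HuggingFace BERT encoder, mean-pooled per-layer sentence representations, and an InfoNCE-style contrastive loss with in-batch negatives; the class name, pooling scheme, temperature, and the reduction of the paper's top-down pathway to one linear predictor per layer are all illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of layer-wise latent prediction of the next sentence,
# in the spirit of predictive coding. All design choices here (mean
# pooling, linear predictors, InfoNCE with in-batch negatives, the 0.07
# temperature) are assumptions for illustration, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AutoModel

class PredictiveCodingBert(nn.Module):  # hypothetical name
    def __init__(self, model_name="bert-base-uncased"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name, output_hidden_states=True)
        hidden = self.encoder.config.hidden_size
        layers = self.encoder.config.num_hidden_layers
        # One predictor per intermediate layer: from the current sentence's
        # layer-l representation, predict the next sentence's at that layer.
        self.predictors = nn.ModuleList([nn.Linear(hidden, hidden) for _ in range(layers)])

    def sentence_reps(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        mask = attention_mask.unsqueeze(-1).float()
        # Mean-pool tokens at every layer (index 0 is the embedding layer).
        return [(h * mask).sum(1) / mask.sum(1) for h in out.hidden_states[1:]]

    def forward(self, cur_batch, next_batch):
        cur = self.sentence_reps(**cur_batch)    # sentence t, one rep per layer
        nxt = self.sentence_reps(**next_batch)   # sentence t+1, one rep per layer
        loss = 0.0
        for predictor, c, n in zip(self.predictors, cur, nxt):
            z = F.normalize(predictor(c), dim=-1)   # predicted future latent
            t = F.normalize(n, dim=-1)              # actual future latent
            logits = z @ t.T / 0.07                 # other batch items act as negatives
            labels = torch.arange(z.size(0), device=z.device)
            loss = loss + F.cross_entropy(logits, labels)   # InfoNCE, per layer
        return loss / len(self.predictors)
```

Predicting in latent space with a contrastive objective, rather than decoding the next sentence token by token, is what lets each intermediate layer carry a sentence-level training signal; in a setup like this, the per-layer loss would typically be added to the standard masked-language-modeling objective during pre-training.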
Source Journal
CiteScore: 6.30 · Self-citation rate: 0.00% · Review time: 7 weeks