Title: Self supervised learning and the poverty of the stimulus
Authors: Csaba Veres, Jennifer Sampson
DOI: 10.1016/j.datak.2023.102208
Journal: Data & Knowledge Engineering (JCR Q3, Computer Science, Artificial Intelligence; impact factor 2.7)
Publication date: 2023-09-01
Publication type: Journal Article
URL: https://www.sciencedirect.com/science/article/pii/S0169023X2300068X
Citations: 0
Abstract
Diathesis alternations are the possible expressions of the arguments of verbs in different, systematically related subcategorization frames. Semantically similar verbs such as spill and spray can behave differently with respect to the alternations they participate in. For example, one can “spill/spray water on the plant”, and one can “spray the plant with water”, but it is odd to say “spill the plant with water”: “spray” can alternate between syntactic frames, while “spill” cannot. How human speakers learn the difference between such verbs is not well understood, because the primary linguistic data (PLD) they receive does not appear sufficient to infer the knowledge required for adult competence. More generally, the poverty of the stimulus (POS) hypothesis states that the PLD is insufficient for a learner to infer full adult linguistic competence; that is, learning relies on prior constraints introduced by the language faculty. We tested state-of-the-art machine learning models trained by self-supervision and found some evidence that they could in fact learn the correct pattern of acceptability judgements in the locative alternation. However, we argue that this was partially a result of fine-tuning, which introduced negative evidence into the learning data and thereby facilitated shortcut learning. Large language models (LLMs) cannot learn some linguistic facts from normal language data, but they can compensate to some extent by learning spuriously correlated features when negative feedback is introduced during the training cycle.
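The spill/spray contrast in the locative alternation can be made concrete with a minimal sketch. The verb classes below are illustrative only: spill and spray come from the abstract's own example, and the frame templates are a simplification of the actual subcategorization frames studied in the paper.

```python
# Minimal sketch of the locative alternation described in the abstract.
# Only spill/spray are taken from the text; the frame templates are a
# simplified illustration, not the paper's stimuli.

ALTERNATING = {"spray"}      # verbs acceptable in both syntactic frames
NON_ALTERNATING = {"spill"}  # verbs acceptable only in the on-variant frame

def acceptable_frames(verb, theme="water", goal="the plant"):
    """Return the frames judged acceptable for `verb` under this toy grammar."""
    on_variant = f"{verb} {theme} on {goal}"      # "spray water on the plant"
    with_variant = f"{verb} {goal} with {theme}"  # "spray the plant with water"
    if verb in ALTERNATING:
        return [on_variant, with_variant]
    return [on_variant]  # "spill the plant with water" is excluded

print(acceptable_frames("spray"))  # both frames
print(acceptable_frames("spill"))  # only the on-variant
```

The learning problem the abstract describes is that the PLD gives a child (or a self-supervised model) only positive examples like these, with no explicit signal that the with-variant of "spill" is excluded, which is why the authors' fine-tuning setup counts as negative evidence.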
About the journal
Data & Knowledge Engineering (DKE) stimulates the exchange of ideas and interaction between these two related fields of interest. DKE reaches a worldwide audience of researchers, designers, managers and users. The major aim of the journal is to identify, investigate and analyze the underlying principles in the design and effective use of these systems.