Psycholinguistic Diagnosis of Language Models' Commonsense Reasoning

Yan Cong
{"title":"语言模型常识性推理的心理语言学诊断","authors":"Yan Cong","doi":"10.18653/v1/2022.csrr-1.3","DOIUrl":null,"url":null,"abstract":"Neural language models have attracted a lot of attention in the past few years. More and more researchers are getting intrigued by how language models encode commonsense, specifically what kind of commonsense they understand, and why they do. This paper analyzed neural language models’ understanding of commonsense pragmatics (i.e., implied meanings) through human behavioral and neurophysiological data. These psycholinguistic tests are designed to draw conclusions based on predictive responses in context, making them very well suited to test word-prediction models such as BERT in natural settings. They can provide the appropriate prompts and tasks to answer questions about linguistic mechanisms underlying predictive responses. This paper adopted psycholinguistic datasets to probe language models’ commonsense reasoning. Findings suggest that GPT-3’s performance was mostly at chance in the psycholinguistic tasks. We also showed that DistillBERT had some understanding of the (implied) intent that’s shared among most people. Such intent is implicitly reflected in the usage of conversational implicatures and presuppositions. Whether or not fine-tuning improved its performance to human-level depends on the type of commonsense reasoning.","PeriodicalId":166496,"journal":{"name":"Proceedings of the First Workshop on Commonsense Representation and Reasoning (CSRR 2022)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":"{\"title\":\"Psycholinguistic Diagnosis of Language Models’ Commonsense Reasoning\",\"authors\":\"Yan Cong\",\"doi\":\"10.18653/v1/2022.csrr-1.3\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Neural language models have attracted a lot of attention in the past few years. More and more researchers are getting intrigued by how language models encode commonsense, specifically what kind of commonsense they understand, and why they do. This paper analyzed neural language models’ understanding of commonsense pragmatics (i.e., implied meanings) through human behavioral and neurophysiological data. These psycholinguistic tests are designed to draw conclusions based on predictive responses in context, making them very well suited to test word-prediction models such as BERT in natural settings. They can provide the appropriate prompts and tasks to answer questions about linguistic mechanisms underlying predictive responses. This paper adopted psycholinguistic datasets to probe language models’ commonsense reasoning. Findings suggest that GPT-3’s performance was mostly at chance in the psycholinguistic tasks. We also showed that DistillBERT had some understanding of the (implied) intent that’s shared among most people. Such intent is implicitly reflected in the usage of conversational implicatures and presuppositions. 
Whether or not fine-tuning improved its performance to human-level depends on the type of commonsense reasoning.\",\"PeriodicalId\":166496,\"journal\":{\"name\":\"Proceedings of the First Workshop on Commonsense Representation and Reasoning (CSRR 2022)\",\"volume\":\"34 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1900-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"6\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the First Workshop on Commonsense Representation and Reasoning (CSRR 2022)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.18653/v1/2022.csrr-1.3\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the First Workshop on Commonsense Representation and Reasoning (CSRR 2022)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.18653/v1/2022.csrr-1.3","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 6

Abstract

Neural language models have attracted a great deal of attention in the past few years. More and more researchers are intrigued by how language models encode commonsense: specifically, what kind of commonsense they understand, and why. This paper analyzed neural language models' understanding of commonsense pragmatics (i.e., implied meanings) through human behavioral and neurophysiological data. These psycholinguistic tests are designed to draw conclusions from predictive responses in context, which makes them well suited to testing word-prediction models such as BERT in natural settings: they provide appropriate prompts and tasks for answering questions about the linguistic mechanisms underlying predictive responses. This paper adopted psycholinguistic datasets to probe language models' commonsense reasoning. Findings suggest that GPT-3's performance was mostly at chance on the psycholinguistic tasks. We also showed that DistilBERT had some understanding of the (implied) intent shared among most people; such intent is implicitly reflected in the usage of conversational implicatures and presuppositions. Whether fine-tuning improved its performance to human level depends on the type of commonsense reasoning.
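To make the probing setup concrete, here is a minimal sketch of cloze-style probing with Hugging Face's fill-mask pipeline. It is an illustration only, not the paper's actual code: the checkpoint name (distilbert-base-uncased), the prompt, and the candidate words are assumptions rather than the paper's materials.

```python
# A minimal, illustrative sketch of cloze-style probing with a masked LM,
# in the spirit of the psycholinguistic tests described above. NOT the
# paper's actual code: the checkpoint ("distilbert-base-uncased"), the
# prompt, and the candidate words are assumptions for illustration.
from transformers import pipeline

fill = pipeline("fill-mask", model="distilbert-base-uncased")

# Scalar-implicature context: "some" pragmatically implies "not all",
# so a pragmatically competent predictor should favor "all" over "any".
prompt = "John ate some of the cookies. In fact, he did not eat [MASK] of them."
candidates = ["all", "any"]

# Restrict scoring to the two candidates and compare their probabilities.
for result in fill(prompt, targets=candidates):
    print(f"{result['token_str']}: {result['score']:.4f}")
```

A higher probability for "all" than for "any" would be one rough sign that the model's predictive responses track the implicature, which is the kind of context-driven prediction the paper's psycholinguistic diagnostics are designed to measure.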