人类归纳推理与大型语言模型

IF 2.1 3区心理学 Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Cognitive Systems Research Pub Date : 2023-08-09 DOI:10.1016/j.cogsys.2023.101155

Simon Jerome Han, Keith J. Ransom, Andrew Perfors, Charles Kemp

{"title":"人类归纳推理与大型语言模型","authors":"Simon Jerome Han, Keith J. Ransom, Andrew Perfors, Charles Kemp","doi":"10.1016/j.cogsys.2023.101155","DOIUrl":null,"url":null,"abstract":"<div><p>The impressive recent performance of large language models has led many to wonder to what extent they can serve as models of general intelligence or are similar to human cognition. We address this issue by applying GPT-3.5 and GPT-4 to a classic problem in human inductive reasoning known as property induction. Over two experiments, we elicit human judgments on a range of property induction tasks spanning multiple domains. Although GPT-3.5 struggles to capture many aspects of human behavior, GPT-4 is much more successful: for the most part, its performance qualitatively matches that of humans, and the only notable exception is its failure to capture the phenomenon of premise non-monotonicity. Our work demonstrates that property induction allows for interesting comparisons between human and machine intelligence and provides two large datasets that can serve as benchmarks for future work in this vein.</p></div>","PeriodicalId":55242,"journal":{"name":"Cognitive Systems Research","volume":null,"pages":null},"PeriodicalIF":2.1000,"publicationDate":"2023-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"Inductive reasoning in humans and large language models\",\"authors\":\"Simon Jerome Han, Keith J. Ransom, Andrew Perfors, Charles Kemp\",\"doi\":\"10.1016/j.cogsys.2023.101155\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>The impressive recent performance of large language models has led many to wonder to what extent they can serve as models of general intelligence or are similar to human cognition. We address this issue by applying GPT-3.5 and GPT-4 to a classic problem in human inductive reasoning known as property induction. Over two experiments, we elicit human judgments on a range of property induction tasks spanning multiple domains. Although GPT-3.5 struggles to capture many aspects of human behavior, GPT-4 is much more successful: for the most part, its performance qualitatively matches that of humans, and the only notable exception is its failure to capture the phenomenon of premise non-monotonicity. Our work demonstrates that property induction allows for interesting comparisons between human and machine intelligence and provides two large datasets that can serve as benchmarks for future work in this vein.</p></div>\",\"PeriodicalId\":55242,\"journal\":{\"name\":\"Cognitive Systems Research\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":2.1000,\"publicationDate\":\"2023-08-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Cognitive Systems Research\",\"FirstCategoryId\":\"102\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1389041723000839\",\"RegionNum\":3,\"RegionCategory\":\"心理学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Cognitive Systems Research","FirstCategoryId":"102","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1389041723000839","RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 4

摘要

最近大型语言模型令人印象深刻的表现让许多人想知道它们在多大程度上可以作为通用智能模型，或者与人类认知相似。我们通过将GPT-3.5和GPT-4应用于人类归纳推理中的经典问题(称为属性归纳法)来解决这个问题。在两个实验中，我们引出了人类对跨越多个领域的一系列属性归纳任务的判断。尽管GPT-3.5努力捕捉人类行为的许多方面，但GPT-4要成功得多:在大多数情况下，它的性能在质量上与人类相匹配，唯一值得注意的例外是它未能捕捉前提非单调性现象。我们的工作表明，属性归纳允许在人类和机器智能之间进行有趣的比较，并提供两个大型数据集，可以作为这方面未来工作的基准。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Inductive reasoning in humans and large language models

The impressive recent performance of large language models has led many to wonder to what extent they can serve as models of general intelligence or are similar to human cognition. We address this issue by applying GPT-3.5 and GPT-4 to a classic problem in human inductive reasoning known as property induction. Over two experiments, we elicit human judgments on a range of property induction tasks spanning multiple domains. Although GPT-3.5 struggles to capture many aspects of human behavior, GPT-4 is much more successful: for the most part, its performance qualitatively matches that of humans, and the only notable exception is its failure to capture the phenomenon of premise non-monotonicity. Our work demonstrates that property induction allows for interesting comparisons between human and machine intelligence and provides two large datasets that can serve as benchmarks for future work in this vein.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Cognitive Systems Research 工程技术-计算机：人工智能

CiteScore

9.40

自引率

5.10%

发文量

审稿时长

>12 weeks

期刊介绍： Cognitive Systems Research is dedicated to the study of human-level cognition. As such, it welcomes papers which advance the understanding, design and applications of cognitive and intelligent systems, both natural and artificial. The journal brings together a broad community studying cognition in its many facets in vivo and in silico, across the developmental spectrum, focusing on individual capacities or on entire architectures. It aims to foster debate and integrate ideas, concepts, constructs, theories, models and techniques from across different disciplines and different perspectives on human-level cognition. The scope of interest includes the study of cognitive capacities and architectures - both brain-inspired and non-brain-inspired - and the application of cognitive systems to real-world problems as far as it offers insights relevant for the understanding of cognition. Cognitive Systems Research therefore welcomes mature and cutting-edge research approaching cognition from a systems-oriented perspective, both theoretical and empirically-informed, in the form of original manuscripts, short communications, opinion articles, systematic reviews, and topical survey articles from the fields of Cognitive Science (including Philosophy of Cognitive Science), Artificial Intelligence/Computer Science, Cognitive Robotics, Developmental Science, Psychology, and Neuroscience and Neuromorphic Engineering. Empirical studies will be considered if they are supplemented by theoretical analyses and contributions to theory development and/or computational modelling studies.