Assessing the Strengths and Weaknesses of Large Language Models

Impact Factor: 0.7 · CAS Tier 3 (Mathematics) · JCR Q4 (Computer Science, Artificial Intelligence)
Shalom Lappin
{"title":"Assessing the Strengths and Weaknesses of Large Language Models","authors":"Shalom Lappin","doi":"10.1007/s10849-023-09409-x","DOIUrl":null,"url":null,"abstract":"Abstract The transformers that drive chatbots and other AI systems constitute large language models (LLMs). These are currently the focus of a lively discussion in both the scientific literature and the popular media. This discussion ranges from hyperbolic claims that attribute general intelligence and sentience to LLMs, to the skeptical view that these devices are no more than “stochastic parrots”. I present an overview of some of the weak arguments that have been presented against LLMs, and I consider several of the more compelling criticisms of these devices. The former significantly underestimate the capacity of transformers to achieve subtle inductive inferences required for high levels of performance on complex, cognitively significant tasks. In some instances, these arguments misconstrue the nature of deep learning. The latter criticisms identify significant limitations in the way in which transformers learn and represent patterns in data. They also point out important differences between the procedures through which deep neural networks and humans acquire knowledge of natural language. It is necessary to look carefully at both sets of arguments in order to achieve a balanced assessment of the potential and the limitations of LLMs.","PeriodicalId":48732,"journal":{"name":"Journal of Logic Language and Information","volume":"6 2","pages":"0"},"PeriodicalIF":0.7000,"publicationDate":"2023-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Logic Language and Information","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s10849-023-09409-x","RegionNum":3,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

Abstract

The transformers that drive chatbots and other AI systems constitute large language models (LLMs). These are currently the focus of a lively discussion in both the scientific literature and the popular media. This discussion ranges from hyperbolic claims that attribute general intelligence and sentience to LLMs, to the skeptical view that these devices are no more than “stochastic parrots”. I present an overview of some of the weak arguments that have been presented against LLMs, and I consider several of the more compelling criticisms of these devices. The former significantly underestimate the capacity of transformers to achieve subtle inductive inferences required for high levels of performance on complex, cognitively significant tasks. In some instances, these arguments misconstrue the nature of deep learning. The latter criticisms identify significant limitations in the way in which transformers learn and represent patterns in data. They also point out important differences between the procedures through which deep neural networks and humans acquire knowledge of natural language. It is necessary to look carefully at both sets of arguments in order to achieve a balanced assessment of the potential and the limitations of LLMs.


Source journal
Journal of Logic Language and Information
Categories: Computer Science, Artificial Intelligence · Logic
CiteScore: 1.70
Self-citation rate: 12.50%
Articles published: 40
Aims and scope: The scope of the journal is the logical and computational foundations of natural, formal, and programming languages, as well as the different forms of human and mechanized inference. It covers the logical, linguistic, and information-theoretic parts of the cognitive sciences. Examples of main subareas are Intensional Logics including Dynamic Logic; Nonmonotonic Logic and Belief Revision; Constructive Logics; Complexity Issues in Logic and Linguistics; Theoretical Problems of Logic Programming and Resolution; Categorial Grammar and Type Theory; Generalized Quantification; Information-Oriented Theories of Semantic Structure like Situation Semantics, Discourse Representation Theory, and Dynamic Semantics; Connectionist Models of Logical and Linguistic Structures. The emphasis is on the theoretical aspects of these areas.