Dynamics of scope ambiguities: comparative analysis of human and large language model performance in Korean

Impact factor: 1.3 | Zone 3 (Literature) | LANGUAGE & LINGUISTICS

Lingua Pub Date : 2025-06-20 DOI:10.1016/j.lingua.2025.103998

Myung Hye Yoo , Sanghoun Song

Lingua, Volume 324, Article 103998. Full text: https://www.sciencedirect.com/science/article/pii/S0024384125001238

Citations: 0

Abstract

This study investigates how native Korean speakers and large language models (LLMs) resolve scope ambiguities and integrate them with discourse information, focusing on interactions between negation and quantificational phrases (QPs). The objectives were twofold: (i) to determine whether the general preference for surface scope interpretations and integration with discourse information persists in complex syntactic constructions in Korean, which require refined processing, and (ii) to assess how well LLMs comprehend and integrate semantic structures compared with human performance. Korean speakers showed a preference for surface scope, but this preference did not rigidly hold against the inverse scope and was particularly influenced by object QPs or long-form negation, even when contexts favor an inverse scope. LLMs developed by OpenAI (GPT-3.5 Turbo, GPT-4 Turbo, and GPT-4o) align with human judgments, mainly favoring surface scope interpretations when contexts favor the inverse scope. However, when the context supports an inverse scope, discrepancies in the handling of syntactic nuances are evident. The models tend to overgeneralize the inverse scope in specific configurations in which humans typically find the inverse scope more accessible. These findings highlight the challenges of mimicking human linguistic processing and the need for further refinement of language models to improve their interpretive accuracy.

Source journal: Lingua

CiteScore: 2.50

Self-citation rate: 9.10%

Review time: 24 weeks

Journal description: Lingua publishes papers of any length, if justified, as well as review articles surveying developments in the various fields of linguistics, and occasional discussions. A considerable number of pages in each issue are devoted to critical book reviews. Lingua also publishes Lingua Franca articles, consisting of provocative exchanges expressing strong opinions on central topics in linguistics; The Decade In articles, which are educational articles offering the nonspecialist linguist an overview of a given area of study; and Taking up the Gauntlet special issues, composed of a set number of papers examining one set of data and exploring whose theory offers the most insight with a minimal set of assumptions and a maximum of arguments.