Karina Tachihara, Madison Barker, Beverly Cotter, Taylor Hayes, John Henderson, Adrian Zhou, Fernanda Ferreira
Planning to be incremental: Scene descriptions reveal meaningful clustering in language production
Cognition, volume 266, Article 106330. Published 2025-09-22. DOI: 10.1016/j.cognition.2025.106330
https://www.sciencedirect.com/science/article/pii/S0010027725002719
Journal impact factor: 2.8. JCR: Q1 (Psychology, Experimental).
Citation count: 0
Abstract
How do speakers plan complex descriptions and then execute those plans? In this work, we attempt to answer this question by asking subjects to describe complex visual scenes. We posit that speakers begin planning by organizing the scene into meaningful clusters or groupings of objects. Speakers describe the scene cluster by cluster, allowing for some planning time between clusters. To test these ideas, in a preregistered study 30 participants described 30 indoor and outdoor scenes while their speech was recorded. Physical distance was calculated by identifying the centroid point of each object and then computing the Euclidean distance between centroid points for every object pair. Semantic distance was calculated using ConceptNet Numberbatch to obtain the semantic similarity between object labels. A clustering algorithm was then applied to establish the appropriate number of clusters per scene and to assign objects to each cluster. We observed that, consistent with our hypothesis, objects separated by shorter physical distances and objects that are semantically more similar were discussed in closer temporal proximity in the verbal descriptions. In addition, word productions that involved jumping from one cluster to another took longer to initiate than those associated with the same cluster. We conclude that speakers address the linearization problem by establishing clusters of objects and using them to facilitate incremental planning. This approach treats multiutterance language production as a type of foraging behavior, where people balance exploration and exploitation.
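The two distance measures described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The object names, outline coordinates, and three-dimensional vectors are made-up toy values; real ConceptNet Numberbatch embeddings would have to be loaded from its distribution files. Taking the centroid as the mean of an object's outline points and taking semantic distance as one minus cosine similarity are both assumptions, since the abstract does not specify either.

```python
import math
from itertools import combinations

def centroid(points):
    """Mean of an object's (x, y) outline points (assumed centroid definition)."""
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def physical_distances(objects):
    """Euclidean distance between centroids for every object pair."""
    cents = {name: centroid(pts) for name, pts in objects.items()}
    return {
        (a, b): math.dist(cents[a], cents[b])
        for a, b in combinations(sorted(cents), 2)
    }

def cosine_distance(u, v):
    """1 - cosine similarity between two embedding vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / norm

# Hypothetical scene: each object is a list of outline points in pixels.
objects = {
    "cup": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "plate": [(3, 0), (5, 0), (5, 2), (3, 2)],
    "tree": [(10, 10), (12, 10), (12, 12), (10, 12)],
}

# Hypothetical stand-ins for label embeddings (not real Numberbatch vectors).
vectors = {
    "cup": [1.0, 0.0, 0.0],
    "plate": [0.8, 0.6, 0.0],
    "tree": [0.0, 0.0, 1.0],
}

phys = physical_distances(objects)
sem = {
    (a, b): cosine_distance(vectors[a], vectors[b])
    for a, b in combinations(sorted(vectors), 2)
}

for pair in phys:
    print(pair, round(phys[pair], 3), round(sem[pair], 3))
```

In this toy scene the cup and plate are close on both measures, while the tree is far from both. Two pairwise matrices of this kind are what a clustering step would take as input; the abstract does not name the clustering algorithm, so none is sketched here.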
About the journal:
Cognition is an international journal that publishes theoretical and experimental papers on the study of the mind. It covers a wide variety of subjects concerning all the different aspects of cognition, ranging from biological and experimental studies to formal analysis. Contributions from the fields of psychology, neuroscience, linguistics, computer science, mathematics, ethology and philosophy are welcome in this journal provided that they have some bearing on the functioning of the mind. In addition, the journal serves as a forum for discussion of social and political aspects of cognitive science.