Simulating lexical decision times with large language models to supplement megastudies and crowdsourcing

Gonzalo Martínez, Javier Conde, Pedro Reviriego, Marc Brysbaert

DOI: 10.3758/s13428-025-02829-6
Journal: Behavior Research Methods, 57(10), 294
Published: 2025-09-23
Impact factor: 3.9 (JCR Q1, Psychology, Experimental)
Citations: 0
Abstract
Megastudies and crowdsourcing studies are a rich source of information for word recognition research because they provide processing times for thousands of words. However, the high cost makes it impossible to include all words of interest and all relevant participant groups. This study explores the potential of fine-tuned large language models (LLMs) to generate lexical decision times (RTs) similar to those of humans. Building on recent findings that LLMs can accurately estimate word features, we fine-tuned GPT-4o mini on 3000 words from a megastudy. We then gave the model the task of generating RT estimates for the remaining words in the dataset. Our findings showed a high correlation between AI-generated and observed RTs. We discuss three applications: (1) estimating missing RT data, where AI can fill in gaps for words missing in some megastudies, (2) verifying results of virtual experiments, where AI-generated data can provide an additional layer of validation, and (3) optimizing human data collection, as researchers can run simulations before conducting studies with humans. While AI-generated RTs are not a replacement for human data, they have the potential to increase the flexibility and efficiency of megastudy research.
Journal description:
Behavior Research Methods publishes articles concerned with the methods, techniques, and instrumentation of research in experimental psychology. The journal focuses particularly on the use of computer technology in psychological research. An annual special issue is devoted to this field.