{"title":"Comparing automated gaze classifiers in infant looking studies: Accuracy and vulnerability to environmental factors.","authors":"Hiromichi Hagihara, Nanako Kimura, Lorijn Zaadnoordijk, Rei Yasuda, Rhodri Cusack, Sho Tsuji","doi":"10.3758/s13428-026-03040-x","DOIUrl":"https://doi.org/10.3758/s13428-026-03040-x","url":null,"abstract":"<p><p>We evaluated the performance and environmental robustness of three state-of-the-art gaze classification algorithms designed for infant looking-time research: iCatcher+, OWLET, and an Amazon Rekognition-based model. Gaze classifications for each algorithm were compared to human-coded data using a novel dataset (N = 47), and iCatcher+ demonstrated the highest agreement (78.4-85.4%). We then investigated the effect of environmental factors commonly encountered in webcam-based home experiments with infants. We quantified six factors: distance to the camera; infants' left-right offset; facial rotation; facial movement; facial brightness; and spatial variability in facial brightness. Suboptimal recording conditions led to performance degradation for all algorithms. Even iCatcher+, while the most accurate overall, was susceptible, particularly when facial illumination was uneven (i.e., strong brightness variability) or when the head position moved substantially. These findings provide practical insights into the selection and deployment of gaze classification tools for infant research, and can be used to optimize instructions for participants in home webcam experiments. 
This study contributes to improving methodological transparency and reliability in remote infant eye-tracking research.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"58 6","pages":""},"PeriodicalIF":3.9,"publicationDate":"2026-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147855851","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
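The human-vs-algorithm agreement scores reported in the abstract above (78.4-85.4%) are frame-wise percent agreement; a minimal sketch of that metric in Python, assuming hypothetical gaze labels and function names (this is not the authors' pipeline):

```python
def frame_agreement(coder_a, coder_b):
    """Percentage of frames on which two gaze codings agree.

    A simplified stand-in for comparing an automated classifier's
    output against human-coded frames; the labels are illustrative.
    """
    if len(coder_a) != len(coder_b):
        raise ValueError("codings must cover the same frames")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)

# Hypothetical frame-by-frame labels from a human coder and a model:
human = ["left", "right", "left", "away"]
model = ["left", "right", "right", "away"]
print(frame_agreement(human, model))  # → 75.0
```

Metrics such as Cohen's kappa would additionally correct for chance agreement; raw percent agreement is shown here because that is the unit the abstract reports.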
{"title":"dynConfiR: An R package for sequential sampling models of decision confidence.","authors":"Sebastian Hellmann, Michael Zehetleitner, Manuel Rausch","doi":"10.3758/s13428-026-03013-0","DOIUrl":"https://doi.org/10.3758/s13428-026-03013-0","url":null,"abstract":"<p><p>The modeling of response times using sequential sampling models has a long history. Because choices, confidence judgments, and reaction times are closely linked in perceptual decisions, it seems only natural to simultaneously model these three outcome variables of a decision. In the package dynConfiR, we implemented various sequential sampling models of choice, response time, and decision confidence in R. This paper gives an overview of the package, which provides probability density functions as well as high-level functions for fitting parameters to empirical data, prediction of reaction time and response distributions, and simulation of artificial data sets. We describe the mathematical specifications of the implemented models and provide detailed descriptions of the implemented likelihood functions. In addition, we outline the workflow for applying the model to empirical data step-by-step: data preprocessing, model fitting, model prediction, quantitative model comparison, and visual assessment of model predictions. Finally, we present results from parameter and model recovery analyses and assess the precision of probability density calculations, illustrating the robustness of the implemented computations. 
Offering intuitive usability and high flexibility, the package is targeted at researchers in the fields of decision-making and confidence and does not require expert-level programming skills.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"58 6","pages":""},"PeriodicalIF":3.9,"publicationDate":"2026-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147855878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
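The core idea behind the sequential sampling models the abstract above describes is noisy evidence accumulating toward a boundary, which jointly yields a choice and a response time. A toy two-boundary accumulator in Python sketches this (generic parameters and names of my choosing; not dynConfiR's actual models, which are implemented in R and also model confidence):

```python
import random

def simulate_trial(drift=0.8, threshold=1.0, noise=1.0, dt=0.001, max_t=5.0):
    """One trial of a generic two-boundary evidence accumulator.

    Evidence starts at 0 and drifts with Gaussian noise until it
    crosses +threshold (choice 1) or -threshold (choice 0); the
    elapsed time is the simulated response time.  All parameter
    values here are arbitrary illustrations.
    """
    x, t = 0.0, 0.0
    while abs(x) < threshold and t < max_t:
        # Euler step of the diffusion: deterministic drift + scaled noise.
        x += drift * dt + noise * random.gauss(0.0, 1.0) * dt ** 0.5
        t += dt
    return (1 if x >= threshold else 0), t
```

With a positive drift rate most simulated trials end at the upper boundary, and harder trials (smaller drift) take longer, reproducing the choice-RT linkage that motivates modeling both outcomes together.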
{"title":"The Lucerne Groove Library: An audio stimulus corpus of 444 short Western popular music drum and bass patterns with behavioural, structural, and audio measurements.","authors":"Toni A Bechtold, Rafael Jerjen, Florian Hoesl, Lorenz Kilchenmann, Olivier Senn","doi":"10.3758/s13428-026-02989-z","DOIUrl":"https://doi.org/10.3758/s13428-026-02989-z","url":null,"abstract":"<p><p>This study presents an audio stimulus corpus consisting of 444 Western popular music drum and bass patterns called the Lucerne Groove Library. Common requirements for stimuli in music psychological research, particularly studies on the experience of groove, are outlined, followed by a description of how these criteria are addressed in the presented corpus. For example, the corpus is designed to combine ecological validity with high manipulability, facilitating the use of the corpus in a variety of experimental settings. The methods section provides a detailed account of material selection, audio creation, and measures. Ground-truth behavioural data for the stimuli were obtained through a listening experiment (e.g., participant's ratings on the urge to move in response to the stimuli, and style assignments ). Several structural data (e.g., tempo and event density) and audio features (e.g., low-frequency sub-band flux) were measured for each stimulus, and an overview over all data is provided. Potential applications of the corpus in future studies are discussed, including variables along which stimuli can be selected or manipulated. 
The corpus is published in different formats and with a range of accompanying data, all of which can be found in the online repository.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"58 6","pages":""},"PeriodicalIF":3.9,"publicationDate":"2026-05-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147833134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CLAP: A battery of measures to estimate language proficiency in native Chinese speakers.","authors":"Anna H S Heng, Melvin J Yap","doi":"10.3758/s13428-026-03028-7","DOIUrl":"https://doi.org/10.3758/s13428-026-03028-7","url":null,"abstract":"<p><p>Due to increasing interest in investigating the effects of individual differences in language proficiency on word recognition, there is a growing need for objective, reliable, and valid proficiency measures for adult native speakers. While such measures are available for English (Andrews & Hersch, 2010), few comparable instruments are available for simplified Chinese, hindering rigorous investigation into individual differences in word recognition. To address this, we developed Chinese Language Assessments of Proficiency (CLAP). CLAP can be completed within 30 min and comprises: (1) a dictation test with 20 short-answer questions, (2) a vocabulary knowledge test with 30 multiple-choice questions, and (3) a reading comprehension test with 24 multiple-choice questions. Test development proceeded in two phases. In Phase 1, test items were developed and pilot-tested with 50 native Chinese participants. Item response theory analyses were performed to select items for the final tests. In Phase 2, the final tests were validated among 200 participants. The three tests demonstrated good reliability (McDonald's ω and Cronbach's α > 0.8), acceptable test homogeneity (average interitem correlations: .17 - .29), and good convergent validity, evidenced by significant, positive correlations with scores on two validation measures: (1) LEXTALE-CH, a character-based proficiency test (Chan & Chang, 2018); and (2) Hanyu Shuiping Kaoshi, a test assessing grammar, vocabulary and reading skills (HSK official website). Exploratory Factor Analyses demonstrated that CLAP tests and validation measures all loaded onto a single factor. 
CLAP provides a practical tool for studying individual differences in language proficiency or screening participants based on proficiency.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"58 6","pages":""},"PeriodicalIF":3.9,"publicationDate":"2026-05-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147833087","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
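The reliability criterion cited in the abstract above (Cronbach's α > 0.8) follows a standard formula: α = k/(k-1) · (1 - Σ item variances / variance of totals). A minimal Python sketch with illustrative data (these are not CLAP's items or scores):

```python
def cronbach_alpha(item_scores):
    """Cronbach's alpha from per-item score lists.

    item_scores: one inner list per item, one entry per participant.
    Implements k/(k-1) * (1 - sum(item variances) / var(total scores)).
    """
    k = len(item_scores)
    n = len(item_scores[0])

    def var(xs):  # sample variance (n - 1 denominator)
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    total_item_var = sum(var(item) for item in item_scores)
    totals = [sum(item[p] for item in item_scores) for p in range(n)]
    return k / (k - 1) * (1 - total_item_var / var(totals))
```

Identical score patterns across items drive α to 1.0; independent items drive it toward 0, which is why α is read as internal consistency.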
{"title":"Development and validation of an AI-generated real-world object stimuli set.","authors":"Gerard Campbell, Graeme Nicholls, Rebecca Hart, Richard J Allen, Claudia C von Bastian, Melanie R Burke, Mario Parra Rodriguez, Louise A Brown Nicholls","doi":"10.3758/s13428-026-03021-0","DOIUrl":"https://doi.org/10.3758/s13428-026-03021-0","url":null,"abstract":"<p><p>The availability of real-world object stimuli that meet researchers' requirements is an ongoing challenge in visual cognition research. While numerous manually curated object stimulus sets exist, stimulus features such as size, color, and orientation tend to vary widely within a given set and may not be suitable for studies with specific requirements regarding these parameters. However, recent advances in artificial intelligence (AI) can facilitate the generation of highly realistic, custom-made stimuli. Building on these developments, the present study aimed to share a set of 200 AI-generated images of everyday objects for research use. The objects were oriented as though 'placed' on a flat surface, such that they could be naturally embedded in virtual scenes. Moreover, they were created in greyscale and suitable for rendering in different colors. Here, we report the method used to efficiently generate the stimuli, as well as the results from a validation study in which we assessed the nameability, perceived realism and familiarity of the stimuli in a sample of 45 younger (18-35) and 45 older (65-85) adults. As anticipated, the majority of the stimuli were rated highly across all three measures, and no significant age differences were observed. The results thus validated most of the stimuli for future research. The stimuli, each in seven colors, and the corresponding validation scores are openly available for future use. 
Low-level image statistics of mean brightness and contrast for each image are also included in the dataset.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"58 6","pages":""},"PeriodicalIF":3.9,"publicationDate":"2026-05-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13144173/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147833125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A highly sensitive famous face recognition paradigm for prosopagnosia screening.","authors":"Sarah Bate, Emma Portch, Olivia Dark, Rachel Bennetts","doi":"10.3758/s13428-026-03033-w","DOIUrl":"https://doi.org/10.3758/s13428-026-03033-w","url":null,"abstract":"<p><p>Famous face recognition tasks have traditionally been used to diagnose prosopagnosia, offering striking examples of the inability to recognise highly familiar faces. Yet, their popularity has dwindled with the development of standardised unfamiliar face recognition tasks that are less cumbersome to administer and can readily be implemented online. Here, we argue that there is a danger of omitting measures of familiar face recognition from prosopagnosia screening: not only may this challenge the very definition of the condition, but, with some adjustments, famous face recognition tasks can continue to offer highly sensitive measures of everyday face recognition ability. Thus, we developed and evaluated an online, automated famous face recognition paradigm that can readily be implemented into large-scale screening programmes. This task improves on previous designs by (a) eliminating extrinsic cues to identity by including distractor as well as familiar faces, (b) supporting the use of unseen rather than \"iconic\" images of celebrities, and (c) offering a method for automated scoring. Multiple versions of the task were found to have high sensitivity in the detection of developmental prosopagnosia. When required, sub-scores collected from the same paradigm can be used to assess performance at different stages of recognition and identification, helping to probe more precise loci of impairment. 
The latter is important to guide the diagnosis of more complex cases and, potentially, their remediation.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"58 6","pages":""},"PeriodicalIF":3.9,"publicationDate":"2026-05-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13144195/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147833132","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Why the binary latent growth model is not a special case of the ordinal latent growth model: Theoretical arguments and empirical evidence.","authors":"Kyungmin Lim, Su-Young Kim","doi":"10.3758/s13428-026-03043-8","DOIUrl":"https://doi.org/10.3758/s13428-026-03043-8","url":null,"abstract":"<p><p>In the structural equation modeling framework, binary variable models are generally considered a special case of ordinal variable models, as both involve similar scale assignment processes. However, the scaling processes of the two model types differ, with these differences becoming increasingly pronounced in the context of latent growth models (LGMs). To define scale units, the two types of LGMs-specifically, one with ordinal variables and the other with binary variables-depend on different observed scale references, such as thresholds and standard deviations, which are derived from observed categorical variables. Applying distinct observed scale references to binary and ordinal LGMs results in systematic differences in the scale units of their corresponding latent response variables. Consequently, in binary LGMs, the transformed latent response variables used for model estimation may fail to accurately reflect the corresponding population information, and as a result, their parameter estimates are more likely to be systematically biased than those obtained from ordinal LGMs. 
This study investigates the impact of these differences on estimating ordinal and binary LGMs and underscores potential estimation concerns in binary LGMs from both theoretical and empirical perspectives.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"58 6","pages":""},"PeriodicalIF":3.9,"publicationDate":"2026-05-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13144242/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147832495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel bias-free approach for robust perceptual threshold estimation.","authors":"Luca Tarasi, Margherita Covelli, Caterina Bertini, Vincenzo Romei","doi":"10.3758/s13428-026-03038-5","DOIUrl":"https://doi.org/10.3758/s13428-026-03038-5","url":null,"abstract":"<p><p>Perceptual threshold estimation stands as a fundamental variable manipulation in psychology and neuroscience, fine-tuning sensory input intensity to align perceptual precision across individuals. Current methods, such as constant stimuli and staircase procedures, inaccurately assume that perceptual thresholds are influenced solely by sensitivity, overlooking the role of response biases and consequently leading to unreliable estimates. The underlying reason is that classical methods traditionally focus on hit rates-the correct detection of stimuli-while neglecting false alarms, where participants incorrectly report a stimulus when none is present. According to signal detection theory, hit rates can arise from genuine sensitivity or a liberal response criterion, making it crucial to account for both measures. To overcome this confound, we developed bias-free versions of these methods by including target-absent trials to factor in decisional bias during threshold evaluation. Across 74 participants, we robustly demonstrated that widely used classical procedures overestimate perceptual thresholds due to uncontrolled interindividual variability in participant criterion. Conversely, the bias-free staircase procedure achieved reliable sensitivity thresholds, effectively mitigating the influence of decisional bias. We strongly advocate for the adoption of this straightforward, bias-free approach. 
Its effectiveness, simplicity, and quick execution make it highly feasible for widespread use in enhancing the reliability of threshold estimation and reducing variability in future scientific investigations across various fields.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"58 6","pages":""},"PeriodicalIF":3.9,"publicationDate":"2026-05-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13144207/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147833120","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
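The signal detection quantities the abstract above turns on, sensitivity (d') and criterion (c), have standard formulas: d' = z(hit rate) - z(false-alarm rate) and c = -[z(hit rate) + z(false-alarm rate)]/2. A minimal sketch of how target-absent trials separate the two, using illustrative counts rather than the authors' actual procedure:

```python
from statistics import NormalDist

def sdt_measures(hits, misses, false_alarms, correct_rejections):
    """Sensitivity (d') and criterion (c) from raw trial counts.

    Hit rate alone conflates sensitivity with response bias; the
    false-alarm rate from target-absent trials disentangles them.
    A log-linear correction keeps the rates away from 0 and 1.
    """
    z = NormalDist().inv_cdf  # probit (inverse standard normal CDF)
    hr = (hits + 0.5) / (hits + misses + 1)
    far = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    d_prime = z(hr) - z(far)
    criterion = -0.5 * (z(hr) + z(far))
    return d_prime, criterion
```

An observer with many hits but also many false alarms receives a negative (liberal) criterion rather than inflated sensitivity, which is exactly the confound that including target-absent trials in a staircase is meant to control.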
{"title":"Measuring intellectual humility through situated behavior: An alternative to dispositional self-reports.","authors":"Maksim Rudnev, Ana Lucia Rodriguez de la Rosa, Ryan Barrett, Nicholas A Christakis, Igor Grossmann","doi":"10.3758/s13428-026-03016-x","DOIUrl":"https://doi.org/10.3758/s13428-026-03016-x","url":null,"abstract":"<p><p>Although psychological theory views behavior as an interaction between person and situation, the measurement of metacognition often relies on situation-free, abstract self-views. To align measurement with interactionist frameworks, we proposed a situated approach, using intellectual humility (IH)-recognizing one's limits and fallibility-as a proof-of-concept. We validated this approach across diverse populations: English- and Spanish-speaking North Americans (N = 633) and adults from 136 rural Honduran villages (N = 2567). Rather than rating abstract tendencies, participants reconstructed three recent disagreements (freely chosen, wrong, and right) and reported specific behaviors via branching binary probes. This method demonstrated cross-cultural coherence of the IH construct while capturing substantial situational variability. Notably, 74-81% of variance occurred within persons: IH expression fluctuated significantly based on epistemic context (being wrong vs. right) and social dynamics (partner status). These situational effects also fully explained gender differences in the Honduran sample. The situated approach showed efficiency outside Western, educated contexts and helps overcome the humility paradox-wherein the least intellectually humble overclaim their humility. 
We discuss four principles for aligning measurement with theory-contextual specificity, sampling from actual experiences via event reconstruction, accessibility across diverse populations via branching probes, and modeling within-person variability-offering a framework for assessing metacognitive and self-regulatory constructs beyond the constraints of static dispositional measures.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"58 6","pages":""},"PeriodicalIF":3.9,"publicationDate":"2026-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147833099","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"From theory to practice: A comprehensive toolkit for Q-matrix validation in cognitive diagnosis.","authors":"Haijiang Qin, Enhao Bao, Lei Guo","doi":"10.3758/s13428-026-02991-5","DOIUrl":"https://doi.org/10.3758/s13428-026-02991-5","url":null,"abstract":"<p><p>Cognitive diagnostic assessment (CDA) enables direct exploration of participants' cognitive structures or psychological latent traits (referred to as attributes), offering unique advantages within psychological methodologies. The Q-matrix, which delineates the relationship between items and attributes in CDA, is crucial for accurate diagnosis. However, ensuring the accuracy of the Q-matrix in practical applications is often challenging. Constructing a Q-matrix typically requires extensive calibration efforts from both test developers and domain experts, and even then, issues of accuracy and subjectivity remain. Although various Q-matrix validation methods have been developed to improve its quality, their implementation often presents a steep technical barrier for typical psychological researchers. These challenges have limited the broader application of CDA in psychological research. This paper provides a systematic review of Q-matrix validation methods under saturated cognitive diagnosis models (CDMs) and introduces Qval, a user-friendly and powerful R package that offers a one-stop solution for implementing a wide range of state-of-the-art validation procedures, including parameter estimation, validation methods, iterative procedures, and search algorithms. The Qval package leverages C++ code and parallel computing to improve computational efficiency. 
Additionally, this paper provides detailed guidance on how to implement Q-matrix validation procedures effectively.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"58 6","pages":""},"PeriodicalIF":3.9,"publicationDate":"2026-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147833059","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}