{"title":"Catching up with iCatcher: Comparing analyses of infant eye tracking based on trained human coders and iCatcher+ automated gaze coding software.","authors":"Elena Luchkina, Leah R Simon, Sandra R Waxman","doi":"10.3758/s13428-025-02683-6","DOIUrl":"10.3758/s13428-025-02683-6","url":null,"abstract":"<p><p>Eye-tracking measures, which provide crucial insight into the processes underlying human language cognition, perception, and social behavior, are particularly important in research with preverbal infants. Until recently, infant eye-gaze analysis required either expensive corneal-reflection eye-tracking technology or labor-intensive manual annotation (coding). Fortunately, iCatcher+, a recently developed AI-based automated gaze annotation tool, promises to reduce these expenses. To adopt this tool as a mainstream tool for gaze annotation, it is key to determine how annotations produced by iCatcher+ compare to the annotations produced by trained human coders. Here, we provide such a comparison, using 288 videos from a word-learning experiment with 12-month-olds. We evaluate the agreement between these two annotation systems and the effects identified using each system. We find that (1) agreement between human-coded and iCatcher+-annotated video data is excellent (88%) and comparable to intercoder agreement among human coders (90%), and (2) both annotation systems yield the same patterns of effects. 
This provides strong assurances that iCatcher+ is a viable alternative to manual annotation of infant gaze, one that holds promise for increasing efficiency, reducing the costs, and broadening the empirical base in infant eye-tracking.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 6","pages":"158"},"PeriodicalIF":4.6,"publicationDate":"2025-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143975671","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Chinese character size test: Test development, validation, and standards-referenced norms for Chinese primary students.","authors":"Yan Li, Yi Wei, Hong Li","doi":"10.3758/s13428-025-02680-9","DOIUrl":"10.3758/s13428-025-02680-9","url":null,"abstract":"<p><p>Recognizing Chinese characters is a fundamental skill that is crucial for later reading development and basic educational achievement in China. However, there is currently a lack of published tests or norms that provide adequate feedback on children's character size development. To address this gap, we developed the Chinese character size test (CCST) for Mandarin Chinese students in grades 1-6. To address the need for resource-efficient assessment, longitudinal tracking precision, and curriculum-aligned interpretation, we applied rigorous test development processes, vertical test equating, and the item response theory framework to measure children's character size development throughout primary grades. A comprehensive evaluation of the CCST's psychometric properties yielded satisfying results, including item statistics, test reliability, inferential score reliability, criterion-related validity, and empirical validity. Normative data from a representative sample of 7,459 primary school students in Beijing were analyzed to construct the norm-referenced and criterion-referenced character size scores. The results indicated that the mean character sizes of primary students in grades 1-6 were 1,227, 1,898, 2,422, 2,722, 2,932, and 3,060 characters, respectively, and that approximately 5.1% of the students in grades 3-6 failed to achieve the required character size level by the national curriculum criterion. In conclusion, the CCST is a child-friendly, highly interpretable, and open-access instrument with strong psychometric quality and comprehensive scoring feedback. 
This work would interest a wide range of users, including researchers, educators, and practitioners.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 6","pages":"155"},"PeriodicalIF":4.6,"publicationDate":"2025-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143959421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Studying the influence of single social interactions on approach and avoidance behavior: A multimodal investigation in immersive virtual reality.","authors":"Sabrina Gado, Matthias Gamer","doi":"10.3758/s13428-025-02627-0","DOIUrl":"10.3758/s13428-025-02627-0","url":null,"abstract":"<p><p>When studying spontaneous or learned emotional responses to social stimuli, research has traditionally relied on simplified stimuli repeatedly presented on a computer screen in standardized laboratory environments. While these studies have provided important insights into social perception and cognition, their restricted ecological validity may impede the extrapolation of findings to everyday social contexts. Here, we developed a novel immersive virtual reality scenario that permits the examination of social approach and avoidance behavior under naturalistic circumstances while at the same time maintaining full experimental control. Using a combination of a social conditioning procedure with a social approach-avoidance test, we conducted two experiments (both with N = 48 female participants) to investigate how individuals differing in trait social anxiety adapt their behavior after a single encounter with an either friendly or unfriendly virtual agent. In addition to overt approach and avoidance behavior, we acquired subjective ratings, eye-tracking data, and autonomic responses. Overall, we observed significant effects of the social conditioning procedure on autonomic responses and participants' exploration behavior. After initially increased attention, participants exhibited avoidance of social threats as indicated by a higher interpersonal distance and decreased visual attention towards the negatively conditioned virtual agent in the test phase. We found no association between hypervigilance and trait social anxiety but observed higher fear ratings and enhanced avoidance of social threats in participants with elevated anxiety levels. 
Altogether, this study demonstrates the potential of immersive virtual environments for examining social learning processes under conditions resembling real-life social encounters.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 6","pages":"157"},"PeriodicalIF":4.6,"publicationDate":"2025-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12031922/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143962401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lexical decision times for nouns from the Croatian Psycholinguistic Database.","authors":"Denis Vlašiček, Francesca Dumančić, Mirjana Tonković","doi":"10.3758/s13428-025-02676-5","DOIUrl":"10.3758/s13428-025-02676-5","url":null,"abstract":"<p><p>Megastudies are one of the tools that researchers in psychology and linguistics use when investigating various language phenomena. This study presents an analysis of the association between word length, subjective frequency, concreteness and age of acquisition ratings, and reaction times in a visual lexical decision task. Using a megastudy paradigm, we collected data on 2614 Croatian nouns from a total of 92 participants. The results of the analyses conducted on the dataset mostly confirm the findings of previous research - shorter lexical decision times are associated with more frequent words, shorter words, and words acquired earlier in life. Findings on the association between lexical decision times and word concreteness are mixed. Both reaction-level and aggregate data related to the study have been made freely available to the public. The datasets contain lexical decision times as well as variables available from the Croatian Psycholinguistic Database.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 6","pages":"156"},"PeriodicalIF":4.6,"publicationDate":"2025-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143961435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The keyboards are (still) all right in response time experiments.","authors":"Pablo Gómez, Manuel Perea, Ana Baciero","doi":"10.3758/s13428-025-02637-y","DOIUrl":"10.3758/s13428-025-02637-y","url":null,"abstract":"<p><p>Response times (RTs) are a ubiquitous variable for assessing cognitive and motor processes. However, variability introduced by keyboards, especially in online experiments, has raised concerns among behavioral researchers. Here, we evaluate the impact of keyboard delays on RT measurements using linear mixed-effects models and grouped data t-tests through a series of simulations. The results showed that the impact of keyboard delays on statistical power is minimal in most cases. Keyboard-induced variability does not inflate type I error rates and has a negligible impact on power, except in rare scenarios of RT distribution shifts or in studies focused on individual differences with low signal-to-noise ratios. Thus, commercially available keyboards remain suitable for most RT experiments, including those conducted online.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 5","pages":"154"},"PeriodicalIF":4.6,"publicationDate":"2025-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143973304","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"English verbs semantic norms database: Concreteness, embodiment, imageability, valence and arousal ratings for 3,500 verbs.","authors":"Emiko J Muraki, Penny M Pexman","doi":"10.3758/s13428-025-02675-6","DOIUrl":"10.3758/s13428-025-02675-6","url":null,"abstract":"<p><p>Semantic ratings studies have resulted in significant methodological advances towards understanding the importance of experiential information to lexical-semantic processing. Yet, the existing norms are biased towards nouns, with fewer ratings available for verbs. In the present study, we collected new semantic rating norms for 3,512 verbs on the dimensions of concreteness, embodiment, imageability, valence, and arousal. The resulting ratings provide the largest database of verb-specific rating norms across multiple semantic dimensions. They show good reliability and validity on four of five dimensions, with some evident challenges in rating arousal for the verb stimuli. We demonstrate that the norms account for variance in response latencies and accuracy in a lexical decision task, word recognition task, and recognition memory, above and beyond dimensions such as length, frequency, orthographic Levenshtein distance, and age of acquisition, and thus that semantic richness effects are observed in verb processing. 
The norms described here should be a useful resource for researchers interested in verb lexical-semantic processing.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 5","pages":"153"},"PeriodicalIF":4.6,"publicationDate":"2025-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143972794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dictation and vocabulary knowledge tests for adult native Chinese readers.","authors":"Yiu-Kei Tsang","doi":"10.3758/s13428-025-02669-4","DOIUrl":"10.3758/s13428-025-02669-4","url":null,"abstract":"<p><p>To examine how individual differences in language skills affect language processing, it is essential to have good-quality tests that can assess such individual differences accurately. This study introduces a dictation test and a vocabulary knowledge test in Chinese, which aim to measure lexical expertise in proficient Chinese language users like university students. The psychometric properties of the two tests were examined with two groups of participants. In the first group, exploratory factor analyses confirmed that each of these tests was unidimensional, measuring a single underlying construct of lexical expertise. After removing some problematic items, the two tests also demonstrated satisfactory internal reliabilities. Although the test scores were only weakly correlated with self-reported measures of language proficiency, the correlation with word recognition performance was moderate. These results were successfully replicated with the second cross-validation group, confirming the reliability and convergent validity of the tests. An additional dataset further showed that the vocabulary test score was positively correlated with sentence comprehension performance. Taken together, the tests have acceptable psychometric quality and can serve as tools for examining individual differences in Chinese language processing. 
The tests are freely available online, and normative performance data are provided, facilitating their use in future research.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 5","pages":"151"},"PeriodicalIF":4.6,"publicationDate":"2025-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12014802/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143955743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Assessing multiple abilities through process data in computer-based assessments: The multidimensional sequential response model (MSRM).","authors":"Yuting Han, Feng Ji, Pujue Wang, Hongyun Liu","doi":"10.3758/s13428-025-02658-7","DOIUrl":"10.3758/s13428-025-02658-7","url":null,"abstract":"<p><p>With the advent of computer-based assessment (CBA), process data have assumed an increasingly pivotal role in estimating examinees' latent abilities by capturing detailed records of their response processes. This study introduces the Multidimensional sequential response model (MSRM), a novel model for assessing multiple abilities through process data in computer-based cognitive and psychological assessments. A Bayesian estimation method for the MSRM is proposed and examined through a Monte Carlo simulation study across varying conditions. The results suggest that the MSRM's parameter estimation demonstrates adequate accuracy and computational efficiency, with estimation quality improving as sample sizes and sequence lengths increase. We demonstrate the practical utility of MSRM through two empirical studies, showing that it can be effectively applied in various contexts. This methodology provides valuable insights for tailored instruction by offering detailed assessments of ability mastery across multiple dimensions, thereby supporting more targeted educational interventions.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 5","pages":"152"},"PeriodicalIF":4.6,"publicationDate":"2025-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143963812","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Interrelationships between sleep quality, circadian phase and rapid eye movement sleep: Deriving chronotype from sleep architecture.","authors":"Csenge G Horváth, Bence Schneider, Borbála Rozner, Míra Koczur, Róbert Bódizs","doi":"10.3758/s13428-025-02671-w","DOIUrl":"10.3758/s13428-025-02671-w","url":null,"abstract":"<p><p>The relationship between sleep quality, circadian rhythms, and REM sleep has not been deliberately investigated in previous scientific reports. Here, we aim to examine the associations between these factors by specifically focusing on the temporal dynamics of REM sleep in all night records, as well as to provide a new, objective, EEG-derived chronotype indicator. To achieve those aims, a wearable EEG headband recorded home sleep database was analyzed in terms of total sleep time (TST), REM dynamics, core body temperature, wrist actigraphy, Munich Chronotype Questionnaire, Pittsburgh Sleep Quality Index, subjective morning sleep quality, and Likert Sleepiness Scale. Furthermore, records from the Budapest-Munich database of polysomnography (PSG) were analyzed for REM sleep patterns, TST, arousal dynamics, and age. The results show that the timing of the crest of REM propensity (REM<sub>maxprop</sub>) reliably correlated with weekly average actigraphy sleep midpoints, subjective chronotype measures, and also tended to be associated with core body temperature. Additionally, REM<sub>maxprop</sub> emerged at earlier times in children and middle-aged participants as compared to teenagers and young adults. Subjective sleep quality exclusively reflected the shortening of headband-recorded sleep as compared to weekly average TST. REM percent negatively correlated with NREM arousal density. It can be concluded that the overnight REM sleep dynamic (REM<sub>maxprop</sub>) is a putative indicator of circadian phase/chronotype with potential relevance for home sleep studies. 
However, sleep quality indices are less conclusive in between-subjects designs, urging the need for longitudinal investigations allowing intraindividual analyses.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 5","pages":"150"},"PeriodicalIF":4.6,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12011970/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143961190","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"When raters generalize: Examining sources of halo effects with mixture Rasch facets models.","authors":"Kuan-Yu Jin, Thomas Eckes","doi":"10.3758/s13428-025-02667-6","DOIUrl":"10.3758/s13428-025-02667-6","url":null,"abstract":"<p><p>Halo effects are commonly considered a cognitive or judgmental bias leading to rating error when raters assign scores to persons or performances on multiple criteria. Though a long tradition of research has pointed to possible sources of halo effects, measurement models for identifying these sources and detecting halo have been lacking. In the present research, we propose a general mixture Rasch facets model for halo effects (MRFM-H) and derive two more specific models, each assuming a different psychological mechanism. According to the first model, MRFM-H(GI), persons evoke general impressions that guide raters when assigning scores on conceptually distinct criteria. The second model, MRFM-H(ID), assumes that raters fail to discriminate adequately between the criteria. We adopted a Bayesian inference approach to implement these models, conducting two simulation studies and a real-data analysis. In the simulation studies, we found that (a) the number of raters and criteria determined the accuracy of classifying persons as inducing or not inducing halo; (b) 90% classification accuracy was achieved when at least 25 ratings were available for each rater-person combination; (c) ignoring halo caused by either mechanism (general impressions or inadequate criterion discrimination) biased the criterion parameter estimates while having a negligible impact on person and rater estimates; (d) Bayesian data-model fit statistics (WAIC and WBIC) reliably identified the true, data-generating model. The real-data analysis highlighted the models' practical utility for examining the likely source of halo effects. 
The discussion focuses on the models' application in various assessment contexts and points to directions for future research.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 5","pages":"149"},"PeriodicalIF":4.6,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143955772","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}