{"title":"Computerized continuous scoring of the cognitive style figure test: Embedded figure test as an example.","authors":"Meng Ye, Jingyi Li","doi":"10.3758/s13428-024-02559-1","DOIUrl":"10.3758/s13428-024-02559-1","url":null,"abstract":"<p><p>Extensive research has shown that cognitive style is a non-negligible potential influencer of domains of human functioning, such as learning, creativity, and cooperation among individuals. However, the dichotomy of cognitive style is contradictory to the fact that cognitive style is a continuous variable, and the dichotomy loses information about the strength of people's performance between the poles of cognitive style. To solve this problem, this study developed a computerized continuous scoring system (CCS) based on Python's OpenCV library, and achieved continuous scoring of the test of cognitive style, with the Embedded Figure Test as an example. An empirical study was implemented to compare the performance of dichotomous scoring and CCS. The results show that CCS can accurately extract the traces of participants' responses and achieve continuous scoring, supplementing the information on the strength of people's cognitive styles between the two poles, and the performance of CCS-based tests such as discrimination, reliability, and validity are significantly improved compared with the dichotomous scoring. Given the high reproducibility of CCS, it is expected to be applied to scoring other continuity characteristics in the future.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 3","pages":"84"},"PeriodicalIF":4.6,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143121718","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Introducing the Sisu Voice Matching Test (SVMT): A novel tool for assessing voice discrimination in Chinese.","authors":"Tianze Xu, Xiaoming Jiang, Peng Zhang, Anni Wang","doi":"10.3758/s13428-025-02608-3","DOIUrl":"10.3758/s13428-025-02608-3","url":null,"abstract":"<p><p>Existing standardized tests for voice discrimination are based mainly on Indo-European languages, particularly English. However, voice identity perception is influenced by language familiarity, with listeners generally performing better in their native language than in a foreign one. To provide a more accurate and comprehensive assessment of voice discrimination, it is crucial to develop tests tailored to the native language of the test takers. In response, we developed the Sisu Voice Matching Test (SVMT), a pioneering tool designed specifically for Mandarin Chinese speakers. The SVMT was designed to model real-world communication since it includes both pseudo-word and pseudo-sentence stimuli and covers both the ability to categorize identical voices as the same and the ability to categorize distinct voices as different. Built on a neurally validated voice-space model and item response theory, the SVMT ensures high reliability, validity, appropriate difficulty, and strong discriminative power, while maintaining a concise test duration of approximately 10 min. Therefore, by taking into account the effects of language nativeness, the SVMT complements existing voice tests based on other languages' phonologies to provide a more accurate assessment of voice discrimination ability for Mandarin Chinese speakers. Future research can use the SVMT to deepen our understanding of the mechanisms underlying human voice identity perception, especially in special populations, and to examining the relationship between voice identity recognition and other cognitive processes.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 3","pages":"86"},"PeriodicalIF":4.6,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143121735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cross-validation and predictive metrics in psychological research: Do not leave out the leave-one-out.","authors":"Diego Iglesias, Miguel A Sorrel, Ricardo Olmos","doi":"10.3758/s13428-024-02588-w","DOIUrl":"10.3758/s13428-024-02588-w","url":null,"abstract":"<p><p>There is growing interest in integrating explanatory and predictive research practices in psychological research. For this integration to be successful, the psychologist's toolkit must incorporate standard procedures that enable a direct estimation of the prediction error, such as cross-validation (CV). Despite their apparent simplicity, CV methods are intricate, and thus it is crucial to adapt them to specific contexts and predictive metrics. This study delves into the performance of different CV methods in estimating the prediction error in the <math> <msup><mrow><mi>R</mi></mrow> <mn>2</mn></msup> </math> and <math><mtext>MSE</mtext></math> metrics in regression analysis, ubiquitous in psychological research. Current approaches, which rely on the 5- or 10-fold rule of thumb or on the squared correlation between predicted and observed values, present limitations when computing the prediction error in the <math> <msup><mrow><mi>R</mi></mrow> <mn>2</mn></msup> </math> metric, a widely used statistic in the behavioral sciences. We propose the use of an alternative method that overcomes these limitations and enables the computation of the leave-one-out (LOO) in the <math> <msup><mrow><mi>R</mi></mrow> <mn>2</mn></msup> </math> metric. Through two Monte Carlo simulation studies and the application of CV to the data from the Many Labs Replication Project, we show that the LOO consistently has the best performance. The CV methods discussed in the present study have been implemented in the R package OutR2.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 3","pages":"85"},"PeriodicalIF":4.6,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143121719","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Examination of nonlinear longitudinal processes with latent variables, latent processes, latent changes, and latent classes in the structural equation modeling framework: The R package nlpsem.","authors":"Jin Liu","doi":"10.3758/s13428-025-02596-4","DOIUrl":"10.3758/s13428-025-02596-4","url":null,"abstract":"<p><p>We introduce the R package nlpsem (Liu, 2023), a comprehensive toolkit for analyzing longitudinal processes within the structural equation modeling (SEM) framework, incorporating individual measurement occasions. This package emphasizes nonlinear longitudinal models, especially intrinsic ones, across four key scenarios: (1) univariate longitudinal processes with latent variables, optionally including covariates such as time-invariant covariates (TICs) and time-varying covariates (TVCs); (2) multivariate longitudinal analyses to explore correlations or unidirectional relationships between longitudinal variables; (3) multiple-group frameworks for comparing manifest classes in scenarios (1) and (2); and (4) mixture models for scenarios (1) and (2), accommodating latent class heterogeneity. Built on the OpenMx R package, nlpsem supports flexible model designs and uses the full information maximum likelihood method for parameter estimation. A notable feature is its algorithm for determining initial values directly from raw data, improving computational efficiency and convergence. Furthermore, nlpsem provides tools for goodness-of-fit tests, cluster analyses, visualization, derivation of p values and three types of confidence intervals, as well as model selection for nested models using likelihood-ratio tests and for non-nested models based on criteria such as Akaike information criterion and Bayesian information criterion. This article serves as a companion document to the nlpsem R package, providing a comprehensive guide to its modeling capabilities, estimation methods, implementation features, and application examples using synthetic intelligence growth data.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 3","pages":"87"},"PeriodicalIF":4.6,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143121723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Mandarin Chinese auditory emotions stimulus database: A validated corpus of monosyllabic Chinese characters.","authors":"Mengyuan Li, Na Li, Anqi Zhou, Huiru Yan, Qiuhong Li, Chifen Ma, Chao Wu","doi":"10.3758/s13428-025-02607-4","DOIUrl":"10.3758/s13428-025-02607-4","url":null,"abstract":"<p><p>Auditory emotional rhythm can be transmitted by simple syllables. This study aimed to establish and validate an auditory speech dataset containing Mandarin Chinese auditory emotional monosyllables (MCAE-Monosyllable), a resource that has not been previously available. A total of 422 Chinese monosyllables were recorded by six professional Mandarin actors, each expressing seven emotions: neutral, happy, angry, sad, fearful, disgusted, and surprised. Additionally, each neutral voice was recorded in four Chinese tones. After standardization and energy balance, the recordings were evaluated by 720 Chinese college students for emotional categories (forced to choose one out of seven emotions) and emotional intensity (rated on a scale of 1-9). The final dataset consists of 18,089 valid Chinese monosyllabic pronunciations (neutrality: 9425, sadness: 2453, anger: 2024; surprise: 1699, disgust: 1624, happiness: 590, fear: 274). On average, neutrality had the highest accuracy rate (79%), followed by anger (75%) and sadness (75%), surprise (74%), happiness (73%), disgust (72%), and finally fear (67%). We provided detailed validation results, acoustic information, and perceptual intensity rating values for each sound. The MCAE-Monosyllable database serves as a valuable resource for neural decoding of Chinese emotional speech, cross-cultural language research, and behavioral or clinical studies related to language and emotional disorders. The database can be obtained within the Open Science Framework ( https://osf.io/h3uem/?view_only=047dfd08dbb64ad0882410da340aa271 ).</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 3","pages":"89"},"PeriodicalIF":4.6,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143121739","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Are we capturing individual differences? Evaluating the test-retest reliability of experimental tasks used to measure social cognitive abilities.","authors":"Charlotte R Pennington, Kayley Birch-Hurst, Matthew Ploszajski, Kait Clark, Craig Hedge, Daniel J Shaw","doi":"10.3758/s13428-025-02606-5","DOIUrl":"10.3758/s13428-025-02606-5","url":null,"abstract":"<p><p>Social cognitive skills are crucial for positive interpersonal relationships, health, and wellbeing and encompass both automatic and reflexive processes. To assess this myriad of skills, researchers have developed numerous experimental tasks that measure automatic imitation, emotion recognition, empathy, perspective taking, and intergroup bias and have used these to reveal important individual differences in social cognition. However, the very reason these tasks produce robust experimental effects - low between-participant variability - can make their use as correlational tools problematic. We performed an evaluation of test-retest reliability for common experimental tasks that measure social cognition. One-hundred and fifty participants completed the race-Implicit Association Test (r-IAT), Stimulus-Response Compatibility (SRC) task, Emotional Go/No-Go (eGNG) task, Dot Perspective-Taking (DPT) task, and State Affective Empathy (SAE) task, as well as the Interpersonal Reactivity Index (IRI) and indices of Explicit Bias (EB) across two sessions within 3 weeks. Estimates of test-retest reliability varied considerably between tasks and their indices: the eGNG task had good reliability (ICC = 0.63-0.69); the SAE task had moderate-to-good reliability (ICC = 0.56-0.77); the r-IAT had moderate reliability (ICC = 0.49); the DPT task had poor-to-good reliability (ICC = 0.24-0.60); and the SRC task had poor reliability (ICC = 0.09-0.29). The IRI had good-to-excellent reliability (ICC = 0.76-0.83) and EB had good reliability (ICC = 0.70-0.77). Experimental tasks of social cognition are used routinely to assess individual differences, but their suitability for this is rarely evaluated. Researchers investigating individual differences must assess the test-retest reliability of their measures.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 2","pages":"82"},"PeriodicalIF":4.6,"publicationDate":"2025-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11785611/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143073572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Emotional valence, cloze probability, and entropy: Completion norms for 403 French sentences.","authors":"Jérémy Brunel, Emilie Dujardin, Sandrine Delord, Stéphanie Mathey","doi":"10.3758/s13428-025-02604-7","DOIUrl":"10.3758/s13428-025-02604-7","url":null,"abstract":"<p><p>Sentence-final completion norms are a useful way to select materials in the study of psycholinguistics, neurosciences, and language processing. In recent decades, the literature has focused on measuring cloze probability and sentence constraint indexes to account for various contextual expectation effects. However, the emotional content of target words is another factor that may affect word prediction and has not yet been examined. The purpose of the present study was to design a French corpus of sentence completion norms for final words varying in both valence and arousal. A total of 1322 young adults participated in an online written cloze procedure, in which they were asked to guess the final missing word in given sentences. At least 275 individuals evaluated each sentence. Cloze probability index was estimated for each sentence ending with a negative, neutral or positive word, as well as the level of sentence uncertainty through the calculation of sentence entropy. We also estimated the emotionality of the beginning of each sentence as complementary information with valence and arousal values of sentence-ending words. The final corpus of 403 French sentences offers a wide range of cloze predictability contexts for all emotional categories of final words. We hope that these norms may help to implement new research investigating the interplay between language and emotional processing. The collected data and norms are accessible through the Open Science Framework at the following depository link: https://osf.io/7pc46/?view_only=a1ec1c23e28a45b9951c7cecc073e1ac.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 2","pages":"81"},"PeriodicalIF":4.6,"publicationDate":"2025-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143073575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Validation of an online imitation-inhibition task.","authors":"Mareike Westfal, Emiel Cracco, Jan Crusius, Oliver Genschow","doi":"10.3758/s13428-024-02557-3","DOIUrl":"10.3758/s13428-024-02557-3","url":null,"abstract":"<p><p>People automatically imitate a wide range of different behaviors. One of the most commonly used measurement methods to assess imitative behavior is the imitation-inhibition task (Brass et al., 2000). A disadvantage of its original form is, however, that it was validated for laboratory settings-a time-consuming and costly procedure. Here, we present an approach for conducting the imitation-inhibition task in online settings. We programmed the online version of the imitation-inhibition task in JavaScript and implemented the task in online survey software (i.e., Qualtrics). We validated the task in four experiments. Experiment 1 (N = 88) showed that the typical automatic imitation effects can be detected with good psychometric properties. Going one step further, Experiment 2 (N = 182) directly compared the online version of the imitation-inhibition task with its laboratory version and demonstrated that the online version produces similar strong and reliable effects. In Experiments 3 and 4, we assessed typical moderator effects that were previously reported in laboratory settings: Experiment 3 (N = 93) demonstrated that automatic imitation can be reliably detected in online settings even when controlling for spatial compatibility. Experiment 4 (N = 104) found, in line with previous research, that individuals imitate hand movements executed by a robot less strongly than movements executed by a human. Taken together, the results show that the online version of the imitation-inhibition task offers an easy-to-use method that enables the measurement of automatic imitation with common online survey software tools in a reliable and valid fashion.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 2","pages":"80"},"PeriodicalIF":4.6,"publicationDate":"2025-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11782408/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143063452","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A pre-rule for the sequential probability ratio test in a between-item grid multidimensional computerized classification test.","authors":"Po-Hsien Hu, Ching-Lin Shih, Cheng-Te Chen","doi":"10.3758/s13428-025-02600-x","DOIUrl":"10.3758/s13428-025-02600-x","url":null,"abstract":"<p><p>The measurement efficiency of a grid multidimensional computerized classification test (grid MCCT), which makes a classification decision per dimension, can be improved by taking the correlations between the dimensions into account in the termination criterion. The higher the correlations, the better the improvement in measurement efficiency. However, a termination criterion utilizing inter-dimensional information (i.e., SPRT-C; Liu et al., 2022) was found to yield lower levels of correct classification rates than not utilizing it (i.e., SPRT-SF; Seitz & Frey, 2013) under the between-item grid MCCT when the cutoff was set at the mean of the latent trait distribution. This study proposes a pre-rule to determine whether the SPRT-SF or SPRT-C should be used during the process of classification test administration. Through a series of simulation studies, the results showed that our proposed method (called P-SPRT) can substantially improve upon the SPRT-C in terms of correct classification rates, while maintaining its high measurement efficiency in terms of test length. This paper concludes with a discussion of the findings and further applications.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 2","pages":"79"},"PeriodicalIF":4.6,"publicationDate":"2025-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11779759/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143063490","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"EmoAtlas: An emotional network analyzer of texts that merges psychological lexicons, artificial intelligence, and network science.","authors":"Alfonso Semeraro, Salvatore Vilella, Riccardo Improta, Edoardo Sebastiano De Duro, Saif M Mohammad, Giancarlo Ruffo, Massimo Stella","doi":"10.3758/s13428-024-02553-7","DOIUrl":"10.3758/s13428-024-02553-7","url":null,"abstract":"<p><p>We introduce EmoAtlas, a computational library/framework extracting emotions and syntactic/semantic word associations from texts. EmoAtlas combines interpretable artificial intelligence (AI) for syntactic parsing in 18 languages and psychologically validated lexicons for detecting the eight emotions in Plutchik's theory. We show that EmoAtlas can match or surpass transformer-based natural language processing techniques, BERT or large language models like ChatGPT 3.5 or LLaMAntino, in detecting emotions from Italian and English online posts and news articles (e.g., achieving 85.6 <math><mo>%</mo></math> accuracy in detecting anger in posts vs the 68.8 <math><mo>%</mo></math> value of ChatGPT and 89.9% value for BERT). EmoAtlas presents important advantages in terms of speed and absence of fine-tuning, e.g., it runs 12x faster than BERT on the same data. Testing EmoAtlas' and easily trainable transformers' relevance in a psychometric task like reproducing human creativity ratings for 1071 short texts, we find that EmoAtlas and BERT obtain equivalent predictive power (fourfold cross-validation, <math><mrow><mi>ρ</mi> <mo>≈</mo> <mn>0.495</mn></mrow> </math> , <math><mrow><mi>p</mi> <mo><</mo> <msup><mn>10</mn> <mrow><mo>-</mo> <mn>4</mn></mrow> </msup> </mrow> </math> ). Combining BERT's semantic features with EmoAtlas' emotional/syntactic networks of words gets substantially better at estimating creativity rates of stories ( <math><mrow><mi>ρ</mi> <mo>=</mo> <mn>0.628</mn></mrow> </math> , <math><mrow><mi>p</mi> <mo><</mo> <msup><mn>10</mn> <mrow><mo>-</mo> <mn>4</mn></mrow> </msup> </mrow> </math> ). This indicates an interplay between the creativity of narratives and their semantic, emotional, and syntactic structure. Via interpretable emotional profiles and syntactic networks, EmoAtlas can also quantify how emotions are channeled through specific words in texts, e.g., how did customers frame their ideas and emotions towards \"beds\" in hotel reviews? We release EmoAtlas as a standalone \"text as data\" computational tool and discuss its impact in extracting interpretable and reproducible insights from texts.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 2","pages":"77"},"PeriodicalIF":4.6,"publicationDate":"2025-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143051445","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}