Everything, altogether, all at once: Addressing data challenges when measuring speech intelligibility through entropy scores
Jose Manuel Rivera Espejo, Sven De Maeyer, Steven Gillis
Behavior Research Methods, October 2024. DOI: 10.3758/s13428-024-02457-6. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11362487/pdf/

Abstract: When investigating unobservable, complex traits, data collection and aggregation processes can introduce distinctive features into the data, such as boundedness, measurement error, clustering, outliers, and heteroscedasticity. Failure to address these features collectively can create statistical challenges that prevent the investigation of hypotheses about these traits. This study aimed to demonstrate the efficacy of the Bayesian beta-proportion generalized linear latent and mixed model (beta-proportion GLLAMM; Rabe-Hesketh et al., 2004a, 2004b, 2004c; Skrondal & Rabe-Hesketh, 2004) in handling these data features when exploring research hypotheses concerning speech intelligibility. To this end, the study reexamined transcriptions of spontaneous speech samples originally collected by Boonen et al. (2023), aggregating the data into entropy scores. The prediction accuracy of the beta-proportion GLLAMM was compared with that of the normal linear mixed model (LMM; Holmes et al., 2019), and its capacity to estimate a latent intelligibility from entropy scores was investigated. The study also illustrated how hypotheses concerning the impact of speaker-related factors on intelligibility can be explored with the proposed model. The beta-proportion GLLAMM was not free of challenges: its implementation required formulating assumptions about the data-generating process and knowledge of probabilistic programming languages, both central to Bayesian methods. Nevertheless, results indicated that the model was superior to the normal LMM in predicting the empirical phenomena and able to quantify a latent, potential intelligibility. Additionally, the proposed model facilitated the exploration of hypotheses concerning speaker-related factors and intelligibility. Ultimately, this research has implications for researchers and data analysts interested in quantitatively measuring intricate, unobservable constructs while accurately predicting empirical phenomena.
Reliability of a probabilistic knowledge structure
Debora de Chiusole, Umberto Granziol, Andrea Spoto, Luca Stefanutti
Behavior Research Methods, October 2024. DOI: 10.3758/s13428-024-02468-3. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11362261/pdf/

Abstract: Indexes for estimating the overall reliability of a test in the framework of knowledge space theory (KST) are proposed and analyzed. First, the possibility of applying existing classical test theory (CTT) methods in KST, based on the ratio of true score variance to the total variance of the measure, is explored. These methods turn out to be unsuitable because, in KST, error and true score are not independent. Therefore, two new indexes based on the concepts of entropy and conditional entropy are developed: one estimates the reliability of the response pattern given the knowledge state, while the other refers to the reliability of a person's estimated knowledge state. Theoretical considerations, simulations, and an empirical example on real data are provided to study the behavior of these indexes under a range of conditions.
{"title":"Investigating weight constraint methods for causal-formative indicator modeling.","authors":"Ruoxuan Li, Lijuan Wang","doi":"10.3758/s13428-024-02365-9","DOIUrl":"10.3758/s13428-024-02365-9","url":null,"abstract":"<p><p>Causal-formative indicators are often used in social science research. To achieve identification in causal-formative indicator modeling, constraints need to be applied. A conventional method is to constrain the weight of a formative indicator to be 1. The selection of which indicator to have the fixed weight, however, may influence statistical inferences of the structural path coefficients from the causal-formative construct to outcomes. Another conventional method is to use equal weights (e.g., 1) and assumes that all indicators equally contribute to the latent construct, which can be a strong assumption. To address the limitations of the conventional methods, we proposed an alternative constraint method, in which the sum of the weights is constrained to be a constant. We analytically studied the relations and interpretations of structural path coefficients from the constraint methods, and the results showed that the proposed method yields better interpretations of path coefficients. Simulation studies were conducted to compare the performance of the weight constraint methods in causal-formative indicator modeling with one or two outcomes. Results showed that higher biases in the path coefficient estimates were observed from the conventional methods compared to the proposed method. The proposed method had ignorable bias and satisfactory coverage rates in the studied conditions. This study emphasizes the importance of using an appropriate weight constraint method in causal-formative indicator modeling.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":null,"pages":null},"PeriodicalIF":4.6,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140179229","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Can you tell people's cognitive ability level from their response patterns in questionnaires?
Stefan Schneider, Raymond Hernandez, Doerte U Junghaenel, Haomiao Jin, Pey-Jiuan Lee, Hongxin Gao, Danny Maupin, Bart Orriens, Erik Meijer, Arthur A Stone
Behavior Research Methods, October 2024. DOI: 10.3758/s13428-024-02388-2. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11362444/pdf/

Abstract: Questionnaires are ever present in survey research. In this study, we examined whether an indirect indicator of general cognitive ability could be developed from response patterns in questionnaires. We drew on two established phenomena characterizing connections between cognitive ability and performance on basic cognitive tasks and examined whether they apply to questionnaire responses: (1) the worst performance rule (WPR), which states that people's worst performance on multiple sequential tasks is more indicative of their cognitive ability than their average or best performance, and (2) the task complexity hypothesis (TCH), which suggests that relationships between cognitive ability and performance increase with task complexity. We conceptualized the items of a questionnaire as a series of cognitively demanding tasks. A graded response model was used to estimate respondents' performance on each item from the difference between the observed and model-predicted response ("response error" scores). Analyzing data from 102 items (21 questionnaires) collected from a large-scale, nationally representative sample of people aged 50+ years, we found robust associations of cognitive ability with a person's largest, but not smallest, response error scores (supporting the WPR), and stronger associations of cognitive ability with response errors for more complex than for less complex questions (supporting the TCH). Results replicated across two independent samples and six assessment waves. A latent variable of response errors estimated for the most complex items correlated .50 with a latent cognitive ability factor, suggesting that response patterns can be used to extract a rough indicator of general cognitive ability in survey research.
How ready is speech-to-text for psychological language research? Evaluating the validity of AI-generated English transcripts for analyzing free-spoken responses in younger and older adults
Valeria A Pfeifer, Trish D Chilton, Matthew D Grilli, Matthias R Mehl
Behavior Research Methods, October 2024. DOI: 10.3758/s13428-024-02440-1. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11365748/pdf/

Abstract: For a long time, the gold standard for preparing spoken language corpora for text analysis in psychology has been human transcription. This standard, however, comes at considerable cost and creates barriers to quantitative spoken language analysis that recent advances in speech-to-text technology could address. The current study quantifies the accuracy of AI-generated transcripts relative to human-corrected transcripts in younger (n = 100) and older (n = 92) adults across two spoken language tasks. It further evaluates the validity of Linguistic Inquiry and Word Count (LIWC) features extracted from both kinds of transcripts, as well as from transcripts specifically prepared for LIWC analyses via tagging. We find that AI-generated transcripts are overall highly accurate, with a word error rate of 2.50% to 3.36%, albeit slightly less accurate for younger than for older adults. LIWC features extracted from either kind of transcript are highly correlated, although the tagging procedure significantly alters filler word categories. Based on these results, automatic speech-to-text appears ready for psychological language research using spoken language tasks in relatively quiet environments, unless filler words are of interest to researchers.
An information-theoretic approach to build hypergraphs in psychometrics
Daniele Marinazzo, Jan Van Roozendaal, Fernando E Rosas, Massimo Stella, Renzo Comolatti, Nigel Colenbier, Sebastiano Stramaglia, Yves Rosseel
Behavior Research Methods, October 2024. DOI: 10.3758/s13428-024-02471-8

Abstract: Psychological network approaches propose to view symptoms or questionnaire items as interconnected nodes, with links between them reflecting pairwise statistical dependencies evaluated on cross-sectional, time-series, or panel data. These networks constitute an established methodology for visualising and conceptualising the interactions and relative importance of nodes/indicators, providing an important complement to other approaches such as factor analysis. However, limiting the representation to pairwise relationships can neglect potentially critical information shared by groups of three or more variables (higher-order statistical interdependencies). To overcome this important limitation, we propose an information-theoretic framework for assessing these interdependencies and, consequently, for using hypergraphs as representations in psychometrics. Because edges in hypergraphs can encompass several nodes at once, this extension provides a richer account of the interactions that may exist among sets of psychological variables. Our results show how psychometric hypergraphs can highlight meaningful redundant and synergistic interactions in both simulated and re-analysed state-of-the-art psychometric datasets. Overall, our framework extends current network approaches while leading to ways of assessing the data that differ at their core from other methods, enriching the psychometrics toolbox and opening promising avenues for future investigation.
{"title":"A library for innovative category exemplars (ALICE) database: Streamlining research with printable 3D novel objects.","authors":"Alice Xu, Ji Y Son, Catherine M Sandhofer","doi":"10.3758/s13428-024-02458-5","DOIUrl":"10.3758/s13428-024-02458-5","url":null,"abstract":"<p><p>This paper introduces A Library for Innovative Category Exemplars (ALICE) database, a resource that enhances research efficiency in cognitive and developmental studies by providing printable 3D objects representing 30 novel categories. Our research consists of three experiments to validate the novelty and complexity of the objects in ALICE. Experiment 1 assessed the novelty of objects through adult participants' subjective familiarity ratings and agreement on object naming and descriptions. The results confirm the general novelty of the objects. Experiment 2 employed multidimensional scaling (MDS) to analyze perceived similarities between objects, revealing a three-dimensional structure based solely on shape, indicative of their complexity. Experiment 3 used two clustering techniques to categorize objects: k-means clustering for creating nonoverlapping global categories, and hierarchical clustering for allowing global categories that overlap and have a hierarchical structure. Through stability tests, we verified the robustness of each clustering method and observed a moderate to good consensus between them, affirming the strength of our dual approach in effectively and accurately delineating meaningful object categories. By offering easy access to customizable novel stimuli, ALICE provides a practical solution to the challenges of creating novel physical objects for experimental purposes.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":null,"pages":null},"PeriodicalIF":4.6,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11362262/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141874023","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
From lab to life: Evaluating the reliability and validity of psychophysiological data from wearable devices in laboratory and ambulatory settings
Xin Hu, Tanika R Sgherza, Jessie B Nothrup, David M Fresco, Kristin Naragon-Gainey, Lauren M Bylsma
Behavior Research Methods, October 2024. DOI: 10.3758/s13428-024-02387-3

Abstract: Despite the increasing popularity of ambulatory assessment, the reliability and validity of psychophysiological signals from wearable devices is unproven in daily life settings. We evaluated the reliability and validity of physiological signals (electrocardiogram, ECG; photoplethysmography, PPG; electrodermal activity, EDA) collected from two wearable devices (Movisens EcgMove4 and Empatica E4) in the lab (N = 67) and in daily life (N = 20) among adults aged 18-64, with Mindware as the laboratory gold standard. Both wearable devices' valid data rates were lower in daily life than in the laboratory (Movisens ECG 82.94 vs. 93.10%, Empatica PPG 8.79 vs. 26.14%, and Empatica EDA 41.16 vs. 42.67%, respectively). The poor valid data rates of Empatica PPG signals in the laboratory could be partially attributed to participants' hand movements (r = -.27, p = .03). In laboratory settings, heart rate (HR) derived from both wearable devices exhibited higher concurrent validity than heart rate variability (HRV) metrics (ICCs 0.98-1.00 vs. 0.75-0.97), and the number of skin conductance responses (SCRs) derived from Empatica showed higher concurrent validity than skin conductance level (SCL; ICCs 0.38 vs. 0.09). The Movisens EcgMove4 provided more reliable and valid HRV measurements than the Empatica E4 in both laboratory settings (split-half reliability: 0.95-0.99 vs. 0.85-0.98; concurrent validity: 0.95-1.00 vs. 0.75-0.98; valid data rate: 93.10 vs. 26.14%) and ambulatory settings (split-half reliability: 0.99-1.00 vs. 0.89-0.98; valid data rate: 82.94 vs. 8.79%). Although the reliability and validity of wearable devices are improving, these findings suggest that researchers should select devices that yield consistently robust and valid data for their measures of interest.
An exploratory Q-matrix estimation method based on sparse non-negative matrix factorization
Jianhua Xiong, Zhaosheng Luo, Guanzhong Luo, Xiaofeng Yu, Yujun Li
Behavior Research Methods, October 2024. DOI: 10.3758/s13428-024-02442-z

Abstract: Cognitive diagnostic assessment (CDA) is widely used because it provides refined diagnostic information. The Q-matrix is the basis of CDA and can be specified by domain experts or estimated from observed response data by data-driven methods. Data-driven Q-matrix estimation has become a research hotspot because of its objectivity, accuracy, and low calibration cost, but most existing methods require prior knowledge such as an initial Q-matrix, partial q-vectors, or the number of attributes. Under the G-DINA model, we propose to estimate the number of attributes and the Q-matrix elements simultaneously, without any prior knowledge, using the sparse non-negative matrix factorization (SNMF) method, which has the advantages of high scalability and universality. Simulation studies under a wide variety of conditions indicate that the SNMF estimates both the number of attributes and the Q-matrix elements accurately. In addition, a set of real data is analyzed to illustrate its application. Finally, we discuss the limitations of the current study and directions for future research.
Shadows of wisdom: Classifying meta-cognitive and morally grounded narrative content via large language models
Alexander Stavropoulos, Damien L Crone, Igor Grossmann
Behavior Research Methods, October 2024. DOI: 10.3758/s13428-024-02441-0

Abstract: We investigated the efficacy of large language models (LLMs) in classifying complex psychological constructs, such as intellectual humility, perspective-taking, open-mindedness, and search for a compromise, in narratives of 347 Canadian and American adults reflecting on a workplace conflict. Using state-of-the-art models such as GPT-4 in few-shot and zero-shot paradigms, as well as RoB-ELoC (RoBERTa fine-tuned on emotion, with a logistic regression classifier), we compared their performance with that of expert human coders. The LLMs classified robustly, with over 80% agreement, F1 scores above 0.85, and high human-model reliability (median Cohen's κ across top models = .80). RoB-ELoC and few-shot GPT-4 were standout classifiers, although somewhat less effective at categorizing intellectual humility. We offer example workflows for easy integration into research. These proof-of-concept findings indicate the viability of both open-source and commercial LLMs for automating the coding of complex constructs, potentially transforming social science research.