Educational and Psychological Measurement: Latest Articles

Using Biclustering to Detect Cheating in Real Time on Mixed-Format Tests.
IF 2.1, Zone 3, Psychology
Educational and Psychological Measurement Pub Date : 2025-05-24 DOI: 10.1177/00131644251333143
Hyeryung Lee, Walter P Vispoel
Abstract: We evaluated a real-time biclustering method for detecting cheating on mixed-format assessments that included dichotomous, polytomous, and multi-part items. Biclustering jointly groups examinees and items by identifying subgroups of test takers who exhibit similar response patterns on specific subsets of items. This method's flexibility and minimal assumptions about examinee behavior make it computationally efficient and highly adaptable. To further finetune accuracy and reduce false positives in real-time detection, enhanced statistical significance tests were incorporated into the illustrated algorithms. Two simulation studies were conducted to assess detection across varying testing conditions. In the first study, the method effectively detected cheating on tests composed entirely of either dichotomous or non-dichotomous items. In the second study, we examined tests with varying mixed item formats and again observed strong detection performance. In both studies, detection performance was examined at each timestamp in real time and evaluated under three varying conditions: proportion of cheaters, cheating group size, and proportion of compromised items. Across conditions, the method demonstrated strong computational efficiency, underscoring its suitability for real-time applications. Overall, these results highlight the adaptability, versatility, and effectiveness of biclustering in detecting cheating in real time while maintaining low false-positive rates.
PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12104213/pdf/
Citations: 0
Using Deep Reinforcement Learning to Decide Test Length.
IF 2.1, Zone 3, Psychology
Educational and Psychological Measurement Pub Date : 2025-05-03 DOI: 10.1177/00131644251332972
James Zoucha, Igor Himelfarb, Nai-En Tang
Abstract: This study explored the application of deep reinforcement learning (DRL) as an innovative approach to optimize test length. The primary focus was to evaluate whether the current length of the National Board of Chiropractic Examiners Part I Exam is justified. By modeling the problem as a combinatorial optimization task within a Markov Decision Process framework, an algorithm capable of constructing test forms from a finite set of items while adhering to critical structural constraints, such as content representation and item difficulty distribution, was used. The findings reveal that although the DRL algorithm was successful in identifying shorter test forms that maintained comparable ability estimation accuracy, the existing test length of 240 items remains advisable as we found shorter test forms did not maintain structural constraints. Furthermore, the study highlighted the inherent adaptability of DRL to continuously learn about a test-taker's latent abilities and dynamically adjust to their response patterns, making it well-suited for personalized testing environments. This dynamic capability supports real-time decision-making in item selection, improving both efficiency and precision in ability estimation. Future research is encouraged to focus on expanding the item bank and leveraging advanced computational resources to enhance the algorithm's search capacity for shorter, structurally compliant test forms.
PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12049363/pdf/
Citations: 0
Evaluating Change in Adjusted R-Square and R-Square Indices: A Latent Variable Method Application.
IF 2.1, Zone 3, Psychology
Educational and Psychological Measurement Pub Date : 2025-04-11 DOI: 10.1177/00131644251329178
Tenko Raykov, Christine DiStefano
Abstract: A procedure for interval estimation of the difference in the adjusted R-square index for nested linear models is discussed. The method yields as a byproduct confidence intervals for their standard R-square difference, as well as for the adjusted and standard R-squares associated with each model. The resulting interval estimate of the difference in adjusted R-square represents a useful and informative complement to the commonly used R-square change statistic and its significance test in model selection and contains substantially more information than that test. The outlined procedure is readily employed with popular software in empirical educational and psychological studies and is illustrated with numerical data.
PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11993540/pdf/
Citations: 0
Assessing the Performance of Strategies for Handling Rapid Guessing Responses in Item Response Theory Equating.
IF 2.1, Zone 3, Psychology
Educational and Psychological Measurement Pub Date : 2025-03-30 DOI: 10.1177/00131644251329524
Juyoung Jung, Won-Chan Lee
Abstract: This study assesses the performance of strategies for handling rapid guessing responses (RGs) within the context of item response theory observed-score equating. Four distinct approaches were evaluated: (1) ignoring RGs, (2) penalizing RGs as incorrect responses, (3) implementing list-wise deletion (LWD), and (4) treating RGs as missing data followed by imputation using logistic regression-based methodologies. These strategies were examined across a diverse array of testing scenarios. Results indicate that the performance of each strategy varied depending on the specific manipulated factors. Both ignoring and penalizing RGs were found to introduce substantial distortions in equating accuracy. LWD generally exhibited the lowest bias among the strategies evaluated but showed higher standard errors. Data imputation methods, particularly those employing lasso logistic regression and bootstrap techniques, demonstrated superior performance in minimizing equating errors compared to other approaches.
PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11955993/pdf/
Citations: 0
Assessing the Properties and Functioning of Model-Based Sum Scores in Multidimensional Measures With Local Item Dependencies: A Comprehensive Proposal.
IF 2.1, Zone 3, Psychology
Educational and Psychological Measurement Pub Date : 2025-03-13 DOI: 10.1177/00131644251319286
Pere J Ferrando, David Navarro-González, Fabia Morales-Vives
Abstract: A common problem in the assessment of noncognitive attributes is the presence of items with correlated residuals. Although most studies have focused on their effect at the structural level, they may also have an effect on the accuracy and effectiveness of the scores derived from extended factor analytic (FA) solutions which include correlated residuals. For this reason, several measures of reliability/factor saturation and information were developed in a previous study to assess this effect in sum scores derived from unidimensional measures based on both linear and nonlinear FA solutions. The current article extends these proposals to a second-order solution with a single general factor, and it also extends the added-value principle to the second-order scenario when local dependences are operating. Related to the added-value, a new coefficient is developed (an effect-size index and its confidence intervals). Overall, what is proposed allows first to assess the reliability and relative efficiency of the scores at both the subscale and total scale levels, and second, provides information on the appropriateness of using subscale scores to predict their own factor in comparison to the predictive capacity of the total score. All that is proposed is implemented in a freely available R program. Its usefulness is illustrated with an empirical example, which shows the distortions that correlated residuals may cause and how the various measures included in this proposal should be interpreted.
PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11907499/pdf/
Citations: 0
Shortening Psychological Scales: Semantic Similarity Matters.
IF 2.1, Zone 3, Psychology
Educational and Psychological Measurement Pub Date : 2025-02-24 DOI: 10.1177/00131644251319047
Sevilay Kilmen, Okan Bulut
Abstract: In this study, we proposed a novel scale abbreviation method based on sentence embeddings and compared it to two established automatic scale abbreviation techniques. Scale abbreviation methods typically rely on administering the full scale to a large representative sample, which is often impractical in certain settings. Our approach leverages the semantic similarity among the items to select abbreviated versions of scales without requiring response data, offering a practical alternative for scale development. We found that the sentence embedding method performs comparably to the data-driven scale abbreviation approaches in terms of model fit, measurement accuracy, and ability estimates. In addition, our results reveal a moderate negative correlation between item discrimination parameters and semantic similarity indices, suggesting that semantically unique items may result in a higher discrimination power. This supports the notion that semantic features can be predictive of psychometric properties. However, this relationship was not observed for reverse-scored items, which may require further investigation. Overall, our findings suggest that the sentence embedding approach offers a promising solution for scale abbreviation, particularly in situations where large sample sizes are unavailable, and may eventually serve as an alternative to traditional data-driven methods.
PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11851598/pdf/
Citations: 0
Overestimation of Internal Consistency by Coefficient Omega in Data Giving Rise to a Centroid-Like Factor Solution.
IF 2.1, Zone 3, Psychology
Educational and Psychological Measurement Pub Date : 2025-02-13 DOI: 10.1177/00131644241313447
Karl Schweizer, Tengfei Wang, Xuezhu Ren
Abstract: Coefficient Omega measuring internal consistency is investigated for its deviations from expected outcomes when applied to correlational patterns that produce variable-focused factor solutions in confirmatory factor analysis. In these solutions, the factor loadings on the factor of the one-factor measurement model closely correspond to the correlations of one manifest variable with the other manifest variables, as is in centroid solutions. It is demonstrated that in such a situation, a heterogeneous correlational pattern leads to an Omega estimate larger than those for similarly heterogeneous and uniform patterns. A simulation study reveals that these deviations are restricted to datasets including small numbers of manifest variables and that the degree of heterogeneity determines the degree of deviation. We propose a method for identifying variable-focused factor solutions and how to deal with deviations.
PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11826816/pdf/
Citations: 0
Obtaining a Bayesian Estimate of Coefficient Alpha Using a Posterior Normal Distribution.
IF 2.1, Zone 3, Psychology
Educational and Psychological Measurement Pub Date : 2025-01-31 DOI: 10.1177/00131644241311877
John Mart V DelosReyes, Miguel A Padilla
Abstract: A new alternative to obtain a Bayesian estimate of coefficient alpha through a posterior normal distribution is proposed and assessed through percentile, normal-theory-based, and highest probability density credible intervals in a simulation study. The results indicate that the proposed Bayesian method to estimate coefficient alpha has acceptable coverage probability performance across the majority of investigated simulation conditions.
PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11786261/pdf/
Citations: 0
Examining the Instructional Sensitivity of Constructed-Response Achievement Test Item Scores.
IF 2.1, Zone 3, Psychology
Educational and Psychological Measurement Pub Date : 2025-01-30 DOI: 10.1177/00131644241313212
Anne Traynor, Cheng-Hsien Li, Shuqi Zhou
Abstract: Inferences about student learning from large-scale achievement test scores are fundamental in education. For achievement test scores to provide useful information about student learning progress, differences in the content of instruction (i.e., the implemented curriculum) should affect test-takers' item responses. Existing research has begun to identify patterns in the content of instructionally sensitive multiple-choice achievement test items. To inform future test design decisions, this study identified instructionally (in)sensitive constructed-response achievement items, then characterized features of those items and their corresponding scoring rubrics. First, we used simulation to evaluate an item step difficulty difference index for constructed-response test items, derived from the generalized partial credit model. The statistical performance of the index was adequate, so we then applied it to data from 32 constructed-response eighth-grade science test items. We found that the instructional sensitivity (IS) index values varied appreciably across the category boundaries within an item as well as across items. Content analysis by master science teachers allowed us to identify general features of item score categories that show high, or negligible, IS.
PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11783420/pdf/
Citations: 0
The Impact of Attentiveness Interventions on Survey Data.
IF 2.1, Zone 3, Psychology
Educational and Psychological Measurement Pub Date : 2025-01-29 DOI: 10.1177/00131644241311851
Christie M Fuller, Marcia J Simmering, Brian Waterwall, Elizabeth Ragland, Douglas P Twitchell, Alison Wall
Abstract: Social and behavioral science researchers who use survey data are vigilant about data quality, with an increasing emphasis on avoiding common method variance (CMV) and insufficient effort responding (IER). Each of these errors can inflate and deflate substantive relationships, and there are both a priori and post hoc means to address them. Yet, little research has investigated how both IER and CMV are affected with the use of these different procedural or statistical techniques used to address them. More specifically, if interventions to reduce IER are used, does this affect CMV in data? In an experiment conducted both in and out of the laboratory, we investigate the impact of attentiveness interventions, such as a Factual Manipulation Check (FMC) on both IER and CMV in same-source survey data. In addition to typical IER measures, we also track whether respondents play the instructional video and their mouse movement. The results show that while interventions have some impact on the level of participant attentiveness, these interventions do not appear to lead to differing levels of CMV.
PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11775934/pdf/
Citations: 0