{"title":"Using Biclustering to Detect Cheating in Real Time on Mixed-Format Tests.","authors":"Hyeryung Lee, Walter P Vispoel","doi":"10.1177/00131644251333143","DOIUrl":"https://doi.org/10.1177/00131644251333143","url":null,"abstract":"<p><p>We evaluated a real-time biclustering method for detecting cheating on mixed-format assessments that included dichotomous, polytomous, and multi-part items. Biclustering jointly groups examinees and items by identifying subgroups of test takers who exhibit similar response patterns on specific subsets of items. This method's flexibility and minimal assumptions about examinee behavior make it computationally efficient and highly adaptable. To further finetune accuracy and reduce false positives in real-time detection, enhanced statistical significance tests were incorporated into the illustrated algorithms. Two simulation studies were conducted to assess detection across varying testing conditions. In the first study, the method effectively detected cheating on tests composed entirely of either dichotomous or non-dichotomous items. In the second study, we examined tests with varying mixed item formats and again observed strong detection performance. In both studies, detection performance was examined at each timestamp in real time and evaluated under three varying conditions: proportion of cheaters, cheating group size, and proportion of compromised items. Across conditions, the method demonstrated strong computational efficiency, underscoring its suitability for real-time applications. Overall, these results highlight the adaptability, versatility, and effectiveness of biclustering in detecting cheating in real time while maintaining low false-positive rates.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":" ","pages":"00131644251333143"},"PeriodicalIF":2.1,"publicationDate":"2025-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12104213/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144156794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using Deep Reinforcement Learning to Decide Test Length.","authors":"James Zoucha, Igor Himelfarb, Nai-En Tang","doi":"10.1177/00131644251332972","DOIUrl":"https://doi.org/10.1177/00131644251332972","url":null,"abstract":"<p><p>This study explored the application of deep reinforcement learning (DRL) as an innovative approach to optimize test length. The primary focus was to evaluate whether the current length of the National Board of Chiropractic Examiners Part I Exam is justified. By modeling the problem as a combinatorial optimization task within a Markov Decision Process framework, an algorithm capable of constructing test forms from a finite set of items while adhering to critical structural constraints, such as content representation and item difficulty distribution, was used. The findings reveal that although the DRL algorithm was successful in identifying shorter test forms that maintained comparable ability estimation accuracy, the existing test length of 240 items remains advisable as we found shorter test forms did not maintain structural constraints. Furthermore, the study highlighted the inherent adaptability of DRL to continuously learn about a test-taker's latent abilities and dynamically adjust to their response patterns, making it well-suited for personalized testing environments. This dynamic capability supports real-time decision-making in item selection, improving both efficiency and precision in ability estimation. Future research is encouraged to focus on expanding the item bank and leveraging advanced computational resources to enhance the algorithm's search capacity for shorter, structurally compliant test forms.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":" ","pages":"00131644251332972"},"PeriodicalIF":2.1,"publicationDate":"2025-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12049363/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143988676","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluating Change in Adjusted <i>R</i>-Square and <i>R</i>-Square Indices: A Latent Variable Method Application.","authors":"Tenko Raykov, Christine DiStefano","doi":"10.1177/00131644251329178","DOIUrl":"https://doi.org/10.1177/00131644251329178","url":null,"abstract":"<p><p>A procedure for interval estimation of the difference in the adjusted <i>R</i>-square index for nested linear models is discussed. The method yields as a byproduct confidence intervals for their standard <i>R</i>-square difference, as well as for the adjusted and standard <i>R</i>-squares associated with each model. The resulting interval estimate of the difference in adjusted <i>R</i>-square represents a useful and informative complement to the commonly used <i>R</i>-square change statistic and its significance test in model selection and contains substantially more information than that test. The outlined procedure is readily employed with popular software in empirical educational and psychological studies and is illustrated with numerical data.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":" ","pages":"00131644251329178"},"PeriodicalIF":2.1,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11993540/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143985479","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Assessing the Performance of Strategies for Handling Rapid Guessing Responses in Item Response Theory Equating.","authors":"Juyoung Jung, Won-Chan Lee","doi":"10.1177/00131644251329524","DOIUrl":"10.1177/00131644251329524","url":null,"abstract":"<p><p>This study assesses the performance of strategies for handling rapid guessing responses (RGs) within the context of item response theory observed-score equating. Four distinct approaches were evaluated: (1) ignoring RGs, (2) penalizing RGs as incorrect responses, (3) implementing list-wise deletion (LWD), and (4) treating RGs as missing data followed by imputation using logistic regression-based methodologies. These strategies were examined across a diverse array of testing scenarios. Results indicate that the performance of each strategy varied depending on the specific manipulated factors. Both ignoring and penalizing RGs were found to introduce substantial distortions in equating accuracy. LWD generally exhibited the lowest bias among the strategies evaluated but showed higher standard errors. Data imputation methods, particularly those employing lasso logistic regression and bootstrap techniques, demonstrated superior performance in minimizing equating errors compared to other approaches.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":" ","pages":"00131644251329524"},"PeriodicalIF":2.1,"publicationDate":"2025-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11955993/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143763405","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Assessing the Properties and Functioning of Model-Based Sum Scores in Multidimensional Measures With Local Item Dependencies: A Comprehensive Proposal.","authors":"Pere J Ferrando, David Navarro-González, Fabia Morales-Vives","doi":"10.1177/00131644251319286","DOIUrl":"https://doi.org/10.1177/00131644251319286","url":null,"abstract":"<p><p>A common problem in the assessment of noncognitive attributes is the presence of items with correlated residuals. Although most studies have focused on their effect at the structural level, they may also have an effect on the accuracy and effectiveness of the scores derived from extended factor analytic (FA) solutions which include correlated residuals. For this reason, several measures of reliability/factor saturation and information were developed in a previous study to assess this effect in sum scores derived from unidimensional measures based on both linear and nonlinear FA solutions. The current article extends these proposals to a second-order solution with a single general factor, and it also extends the added-value principle to the second-order scenario when local dependences are operating. Related to the added-value, a new coefficient is developed (an effect-size index and its confidence intervals). Overall, what is proposed allows first to assess the reliability and relative efficiency of the scores at both the subscale and total scale levels, and second, provides information on the appropriateness of using subscale scores to predict their own factor in comparison to the predictive capacity of the total score. All that is proposed is implemented in a freely available R program. Its usefulness is illustrated with an empirical example, which shows the distortions that correlated residuals may cause and how the various measures included in this proposal should be interpreted.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":" ","pages":"00131644251319286"},"PeriodicalIF":2.1,"publicationDate":"2025-03-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11907499/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143647648","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Shortening Psychological Scales: Semantic Similarity Matters.","authors":"Sevilay Kilmen, Okan Bulut","doi":"10.1177/00131644251319047","DOIUrl":"10.1177/00131644251319047","url":null,"abstract":"<p><p>In this study, we proposed a novel scale abbreviation method based on sentence embeddings and compared it to two established automatic scale abbreviation techniques. Scale abbreviation methods typically rely on administering the full scale to a large representative sample, which is often impractical in certain settings. Our approach leverages the semantic similarity among the items to select abbreviated versions of scales without requiring response data, offering a practical alternative for scale development. We found that the sentence embedding method performs comparably to the data-driven scale abbreviation approaches in terms of model fit, measurement accuracy, and ability estimates. In addition, our results reveal a moderate negative correlation between item discrimination parameters and semantic similarity indices, suggesting that semantically unique items may result in a higher discrimination power. This supports the notion that semantic features can be predictive of psychometric properties. However, this relationship was not observed for reverse-scored items, which may require further investigation. Overall, our findings suggest that the sentence embedding approach offers a promising solution for scale abbreviation, particularly in situations where large sample sizes are unavailable, and may eventually serve as an alternative to traditional data-driven methods.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":" ","pages":"00131644251319047"},"PeriodicalIF":2.1,"publicationDate":"2025-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11851598/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143515073","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Overestimation of Internal Consistency by Coefficient Omega in Data Giving Rise to a Centroid-Like Factor Solution.","authors":"Karl Schweizer, Tengfei Wang, Xuezhu Ren","doi":"10.1177/00131644241313447","DOIUrl":"10.1177/00131644241313447","url":null,"abstract":"<p><p>Coefficient Omega measuring internal consistency is investigated for its deviations from expected outcomes when applied to correlational patterns that produce variable-focused factor solutions in confirmatory factor analysis. In these solutions, the factor loadings on the factor of the one-factor measurement model closely correspond to the correlations of one manifest variable with the other manifest variables, as is in centroid solutions. It is demonstrated that in such a situation, a heterogeneous correlational pattern leads to an Omega estimate larger than those for similarly heterogeneous and uniform patterns. A simulation study reveals that these deviations are restricted to datasets including small numbers of manifest variables and that the degree of heterogeneity determines the degree of deviation. We propose a method for identifying variable-focused factor solutions and how to deal with deviations.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":" ","pages":"00131644241313447"},"PeriodicalIF":2.1,"publicationDate":"2025-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11826816/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143432505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Obtaining a Bayesian Estimate of Coefficient Alpha Using a Posterior Normal Distribution.","authors":"John Mart V DelosReyes, Miguel A Padilla","doi":"10.1177/00131644241311877","DOIUrl":"10.1177/00131644241311877","url":null,"abstract":"<p><p>A new alternative to obtain a Bayesian estimate of coefficient alpha through a posterior normal distribution is proposed and assessed through percentile, normal-theory-based, and highest probability density credible intervals in a simulation study. The results indicate that the proposed Bayesian method to estimate coefficient alpha has acceptable coverage probability performance across the majority of investigated simulation conditions.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":" ","pages":"00131644241311877"},"PeriodicalIF":2.1,"publicationDate":"2025-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11786261/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143079164","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Examining the Instructional Sensitivity of Constructed-Response Achievement Test Item Scores.","authors":"Anne Traynor, Cheng-Hsien Li, Shuqi Zhou","doi":"10.1177/00131644241313212","DOIUrl":"10.1177/00131644241313212","url":null,"abstract":"<p><p>Inferences about student learning from large-scale achievement test scores are fundamental in education. For achievement test scores to provide useful information about student learning progress, differences in the content of instruction (i.e., the implemented curriculum) should affect test-takers' item responses. Existing research has begun to identify patterns in the content of instructionally sensitive multiple-choice achievement test items. To inform future test design decisions, this study identified instructionally (in)sensitive constructed-response achievement items, then characterized features of those items and their corresponding scoring rubrics. First, we used simulation to evaluate an item step difficulty difference index for constructed-response test items, derived from the generalized partial credit model. The statistical performance of the index was adequate, so we then applied it to data from 32 constructed-response eighth-grade science test items. We found that the instructional sensitivity (IS) index values varied appreciably across the category boundaries within an item as well as across items. Content analysis by master science teachers allowed us to identify general features of item score categories that show high, or negligible, IS.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":" ","pages":"00131644241313212"},"PeriodicalIF":2.1,"publicationDate":"2025-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11783420/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143079163","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Impact of Attentiveness Interventions on Survey Data.","authors":"Christie M Fuller, Marcia J Simmering, Brian Waterwall, Elizabeth Ragland, Douglas P Twitchell, Alison Wall","doi":"10.1177/00131644241311851","DOIUrl":"10.1177/00131644241311851","url":null,"abstract":"<p><p>Social and behavioral science researchers who use survey data are vigilant about data quality, with an increasing emphasis on avoiding common method variance (CMV) and insufficient effort responding (IER). Each of these errors can inflate and deflate substantive relationships, and there are both a priori and post hoc means to address them. Yet, little research has investigated how both IER and CMV are affected with the use of these different procedural or statistical techniques used to address them. More specifically, if interventions to reduce IER are used, does this affect CMV in data? In an experiment conducted both in and out of the laboratory, we investigate the impact of attentiveness interventions, such as a Factual Manipulation Check (FMC) on both IER and CMV in same-source survey data. In addition to typical IER measures, we also track whether respondents play the instructional video and their mouse movement. The results show that while interventions have some impact on the level of participant attentiveness, these interventions do not appear to lead to differing levels of CMV.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":" ","pages":"00131644241311851"},"PeriodicalIF":2.1,"publicationDate":"2025-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11775934/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143064490","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}