Applied Psychological Measurement — Latest Articles

Effect of Differential Item Functioning on Computer Adaptive Testing Under Different Conditions
IF 1.0 · CAS Region 4 · Psychology
Applied Psychological Measurement. Pub Date: 2024-11-01 (Epub 2024-09-17). DOI: 10.1177/01466216241284295
Merve Sahin Kursad, Seher Yalcin
Abstract: This study provides an overview of the effect of differential item functioning (DIF) on measurement precision, the test information function (TIF), and test effectiveness in computer adaptive tests (CATs). Simulated data were generated and analyzed in RStudio. During data generation, item pool size, DIF type, DIF percentage, the CAT item selection method, and the test termination rule were varied conditions; sample size, ability parameter distribution, the Item Response Theory (IRT) model, DIF size, the ability estimation method, the test starting rule, and the item usage frequency method were held fixed. To examine the effect of DIF, measurement precision, TIF, and test effectiveness were calculated. Results show that DIF negatively affects measurement precision, TIF, and test effectiveness; in particular, the percentage of DIF items and the DIF type have statistically significant effects on measurement precision.
(Vol. 48, No. 7-8, pp. 303-322. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11501093/pdf/)
Citations: 0
The Improved EMS Algorithm for Latent Variable Selection in M3PL Model
IF 1.0 · CAS Region 4 · Psychology
Applied Psychological Measurement. Pub Date: 2024-10-21. DOI: 10.1177/01466216241291237
Laixu Shang, Ping-Feng Xu, Na Shan, Man-Lai Tang, Qian-Zhen Zheng
Abstract: One of the main concerns in multidimensional item response theory (MIRT) is detecting the relationship between items and latent traits, which can be treated as a latent variable selection problem. An attractive method for latent variable selection in the multidimensional 2-parameter logistic (M2PL) model is to minimize the observed Bayesian information criterion (BIC) via the expectation model selection (EMS) algorithm. The EMS algorithm extends the EM algorithm and allows the model (e.g., the loading structure in MIRT) to be updated during the iterations along with the parameters under the model. As an extension of the M2PL model, the multidimensional 3-parameter logistic (M3PL) model introduces an additional guessing parameter, which makes latent variable selection more challenging. In this paper, a well-designed EMS algorithm, named improved EMS (IEMS), is proposed to accurately and efficiently detect the underlying true loading structure in the M3PL model; it also works for the M2PL model. In simulation studies, the IEMS algorithm is compared with several state-of-the-art methods and proves competitive in terms of model recovery, estimation precision, and computational efficiency. The IEMS algorithm is also illustrated by application to two real data sets.
(Online first. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11559968/pdf/)
Citations: 0
Optimal Test Design for Estimation of Mean Ability Growth
IF 1.0 · CAS Region 4 · Psychology
Applied Psychological Measurement. Pub Date: 2024-10-15. DOI: 10.1177/01466216241291233
Jonas Bjermo
Abstract: The design of an achievement test is crucial for many reasons. This article focuses on a population's ability growth between school grades. We define design as the allocation of test items with respect to their difficulties. The objective is to present an optimal test design method for estimating mean and percentile ability growth with good precision. We use the asymptotic expression of the variance in terms of the test information and, with that optimization criterion, apply particle swarm optimization to find the optimal design. The results show that the allocation of item difficulties depends on item discrimination and on the magnitude of the ability growth. Because the optimization function depends on the examinees' abilities, and hence on the value of the unknown mean ability growth, we also use an optimum-in-average design and conclude that it is robust to uncertainty in the mean ability growth. In practice, a test is assembled from items stored in an item pool with calibrated item parameters; hence, we also perform a discrete optimization using simulated annealing and compare the results to particle swarm optimization.
(Online first. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11560061/pdf/)
Citations: 0
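The abstract above searches for optimal item-difficulty allocations with particle swarm optimization (PSO). A generic sketch of PSO itself, not of the paper's design criterion (the objective function, swarm size, and inertia/acceleration coefficients below are standard textbook choices, not taken from the article):

```python
import random

def pso_minimize(f, dim, n_particles=20, iters=100, lo=-3.0, hi=3.0, seed=1):
    """Minimal particle swarm optimization: each particle tracks a
    personal best; the swarm shares a global best that attracts all
    particles. Returns (best position, best objective value)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    w, c1, c2 = 0.7, 1.5, 1.5   # inertia and acceleration coefficients
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Toy objective: minimize a 2-D sum of squares.
best_pos, best_val = pso_minimize(lambda p: sum(x * x for x in p), dim=2)
```

In the paper's setting, the objective would instead be the asymptotic variance of the growth estimator as a function of the difficulty allocation.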
A Two-Step Q-Matrix Estimation Method
IF 1.0 · CAS Region 4 · Psychology
Applied Psychological Measurement. Pub Date: 2024-10-10. DOI: 10.1177/01466216241284418
Hans-Friedrich Köhn, Chia-Yi Chiu, Olasumbo Oluwalana, Hyunjoo Kim, Jiaxi Wang
Abstract: Cognitive diagnosis models in educational measurement are restricted latent class models that describe ability in a knowledge domain as a composite of latent skills an examinee may have mastered or failed. Different combinations of skills define distinct latent proficiency classes to which examinees are assigned based on test performance. Items of cognitively diagnostic assessments are characterized by skill profiles specifying which skills are required for a correct item response. The item-skill profiles of a test form its Q-matrix. The validity of cognitive diagnosis depends crucially on the correct specification of the Q-matrix. Typically, Q-matrices are determined by curricular experts, but expert judgment is fallible. Data-driven estimation methods have been developed with the promise of greater accuracy in identifying the Q-matrix of a test. Yet many of the extant methods encounter computational feasibility issues, either as excessive CPU time or as inadmissible estimates. In this article, a two-step algorithm for estimating the Q-matrix is proposed that can be used with any cognitive diagnosis model. Simulations showed that the new method outperformed extant estimation algorithms and was computationally more efficient. It was also applied to Tatsuoka's famous fraction-subtraction data. The paper concludes with a discussion of theoretical and practical implications of the findings.
(Online first. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11560062/pdf/)
Citations: 0
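In cognitive diagnosis models such as DINA, an item's Q-matrix row determines which skill patterns can answer it without guessing, which is why a misspecified Q-matrix distorts classification. A minimal sketch of the DINA item response probability (the guessing and slipping values are illustrative; the paper's two-step estimator itself is not reproduced here):

```python
import numpy as np

def dina_correct_prob(alpha, q_row, guess, slip):
    """P(correct) under the DINA model: an examinee with binary skill
    vector alpha answers an item whose Q-matrix row is q_row correctly
    with probability 1 - slip if every required skill is mastered,
    and with probability guess otherwise."""
    eta = np.all(np.asarray(alpha) >= np.asarray(q_row), axis=-1)
    return np.where(eta, 1.0 - slip, guess)

# Illustrative values (not estimates from the article): the item
# requires only skill 1, so mastery of skill 2 alone does not help.
p_master = dina_correct_prob([1, 1], [1, 0], guess=0.2, slip=0.1)
p_nonmaster = dina_correct_prob([0, 1], [1, 0], guess=0.2, slip=0.1)
```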
Item Response Modeling of Clinical Instruments With Filter Questions: Disentangling Symptom Presence and Severity
IF 1.0 · CAS Region 4 · Psychology
Applied Psychological Measurement. Pub Date: 2024-09-01 (Epub 2024-06-17). DOI: 10.1177/01466216241261709
Brooke E Magnus
Abstract: Clinical instruments that use a filter/follow-up response format often produce data with excess zeros, especially when administered to nonclinical samples. When the unidimensional graded response model (GRM) is then fit to these data, parameter estimates and scale scores tend to suggest that the instrument measures individual differences only among individuals with severe levels of the psychopathology. In such scenarios, alternative item response models that explicitly account for excess zeros may be more appropriate. The multivariate hurdle graded response model (MH-GRM), previously proposed for handling zero-inflated questionnaire data, includes two latent variables: susceptibility, which underlies responses to the filter question, and severity, which underlies responses to the follow-up question. Using both simulated and empirical data, the current research shows that, compared to unidimensional GRMs, the MH-GRM better captures individual differences across a wider range of psychopathology, and that when unidimensional GRMs are fit to data from questionnaires that include filter questions, individual differences at the lower end of the severity continuum largely go unmeasured. Practical implications are discussed.
(Vol. 48, No. 6, pp. 235-256. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11331747/pdf/)
Citations: 0
Using Auxiliary Item Information in the Item Parameter Estimation of a Graded Response Model for a Small to Medium Sample Size: Empirical Versus Hierarchical Bayes Estimation
CAS Region 4 · Psychology
Applied Psychological Measurement. Pub Date: 2023-11-03. DOI: 10.1177/01466216231209758
Matthew Naveiras, Sun-Joo Cho
Abstract: Marginal maximum likelihood estimation (MMLE) is commonly used for item response theory item parameter estimation. However, sufficiently large sample sizes are not always possible when studying rare populations. In this paper, empirical Bayes and hierarchical Bayes are presented as alternatives to MMLE for small sample sizes, using auxiliary item information to estimate the item parameters of a graded response model with higher accuracy. Empirical Bayes and hierarchical Bayes methods are compared with MMLE to determine under what conditions these Bayes methods can outperform MMLE, and whether hierarchical Bayes can serve as an acceptable alternative in conditions where MMLE is unable to converge. In addition, empirical Bayes and hierarchical Bayes are compared to show how hierarchical Bayes can estimate the posterior variance with greater accuracy than empirical Bayes by acknowledging the uncertainty of item parameter estimates. The proposed methods were evaluated via a simulation study. Results showed that hierarchical Bayes methods can be acceptable alternatives to MMLE under various testing conditions, and a guideline is provided indicating which methods are recommended in different research situations. R functions are provided to implement these proposed methods.
Citations: 1
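Both this entry and the previous one build on Samejima's graded response model. A minimal sketch of its category response probabilities, which are differences of adjacent cumulative logistic curves (parameter values are illustrative, not estimates from either paper):

```python
import numpy as np

def grm_category_probs(theta, a, thresholds):
    """Category probabilities under the graded response model:
    P(X >= k) = logistic(a * (theta - b_k)) for ordered thresholds b_k;
    each category probability is a difference of adjacent cumulatives."""
    b = np.asarray(thresholds, dtype=float)
    cum = 1.0 / (1.0 + np.exp(-a * (theta - b)))   # P(X >= 1), ..., P(X >= K-1)
    cum = np.concatenate(([1.0], cum, [0.0]))      # P(X >= 0) = 1, P(X >= K) = 0
    return cum[:-1] - cum[1:]

# Illustrative parameters: a 4-category item with ordered thresholds.
probs = grm_category_probs(theta=0.0, a=1.3, thresholds=[-1.0, 0.0, 1.2])
```

Ordered thresholds guarantee that every category probability is positive and that the probabilities sum to one.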
A Bayesian Random Weights Linear Logistic Test Model for Within-Test Practice Effects
CAS Region 4 · Psychology
Applied Psychological Measurement. Pub Date: 2023-11-01. DOI: 10.1177/01466216231209752
José H. Lozano, Javier Revuelta
Abstract: The present paper introduces a random weights linear logistic test model for the measurement of individual differences in operation-specific practice effects within a single administration of a test. The proposed model extends the linear logistic test model of learning developed by Spada (1977) by treating the practice effects as random effects varying across examinees. A Bayesian framework was used for model estimation and evaluation. A simulation study was conducted to examine the behavior of the model in combination with the Bayesian procedures, and the results demonstrated the good performance of the estimation and evaluation methods. Additionally, an empirical study was conducted to illustrate the applicability of the model to real data: applied to a sample of responses from a logical ability test, the model provided evidence of individual differences in operation-specific practice effects.
Citations: 0
Controlling the Minimum Item Exposure Rate in Computerized Adaptive Testing: A Two-Stage Sympson–Hetter Procedure
CAS Region 4 · Psychology
Applied Psychological Measurement. Pub Date: 2023-10-20. DOI: 10.1177/01466216231209756
Hsiu-Yi Chao, Jyun-Hong Chen
Abstract: Computerized adaptive testing (CAT) can improve test efficiency, but it also causes unbalanced item usage within a pool. Uneven item exposure rates not only create a test security problem through overexposed items but also raise economic concerns about item pool development through underexposed items. This study therefore proposes a two-stage Sympson–Hetter (TSH) method to enhance balanced item pool utilization by simultaneously controlling the minimum and maximum item exposure rates. The TSH method divides the CAT into two stages. In the first stage, item exposure rates are controlled above a prespecified level (e.g., r_min) to increase the exposure of underexposed items; in the second stage, they are controlled below another prespecified level (e.g., r_max) to prevent overexposure. To reduce the effect on trait estimation, TSH administers only a minimum sufficient number of underexposed items, which are generally less discriminating, in the first stage of the CAT. Simulation results indicate that the TSH method can effectively improve item pool usage without clearly compromising trait estimation precision in most conditions, while maintaining the required level of test security.
Citations: 0
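The Sympson–Hetter procedure underlying the abstract above caps an item's exposure by administering a selected item only with probability K_i (its exposure-control parameter), otherwise passing to the next-best candidate. A minimal sketch of that single administration step (item IDs and K values are hypothetical; calibrating the K_i values themselves requires iterative simulation, which is not shown):

```python
import random

def sympson_hetter_select(candidates, exposure_k, rng=random):
    """One Sympson-Hetter administration step: walk the candidate items
    in order of selection preference and administer item i with
    probability K_i, its exposure-control parameter (default 1.0 means
    no restriction); otherwise pass it over and try the next one."""
    for item in candidates:
        if rng.random() < exposure_k.get(item, 1.0):
            return item
    return candidates[-1]   # every candidate was passed over: fall back

# Hypothetical item IDs: item_07 is fully suppressed (K = 0.0), so the
# procedure falls through to item_12 (default K = 1.0).
choice = sympson_hetter_select(["item_07", "item_12"], {"item_07": 0.0})
```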
Two Statistics for Measuring the Score Comparability of Computerized Adaptive Tests
CAS Region 4 · Psychology
Applied Psychological Measurement. Pub Date: 2023-10-19. DOI: 10.1177/01466216231209749
Adam E. Wyse
Abstract: This study introduces two new statistics for measuring the score comparability of computerized adaptive tests (CATs), based on comparing conditional standard errors of measurement (CSEMs) for examinees who achieved the same scale scores. One statistic evaluates the score comparability of alternate CAT forms at individual scale scores; the other evaluates the overall score comparability of alternate CAT forms. The effectiveness of the new statistics is illustrated using data from grade 3 through 8 reading and math CATs. Results suggest that both CATs demonstrated reasonably high levels of score comparability, that score comparability was lower at very high or low scores where few students score, and that using random samples with fewer students per grade had little impact on score comparability. Results also suggested that overall score comparability was sometimes higher when calculated on the bottom 20% of scorers than on all students. Additional discussion of applying the statistics in different contexts is provided.
Citations: 0
Efficiency Analysis of Item Response Theory Kernel Equating for Mixed-Format Tests
CAS Region 4 · Psychology
Applied Psychological Measurement. Pub Date: 2023-10-19. DOI: 10.1177/01466216231209757
Joakim Wallmark, Maria Josefsson, Marie Wiberg
Abstract: This study evaluates the performance of item response theory (IRT) kernel equating in the context of mixed-format tests by comparing it to IRT observed-score equating and to kernel equating with log-linear presmoothing. Comparisons were made through both simulations and real data applications, under both the equivalent groups (EG) and the non-equivalent groups with anchor test (NEAT) sampling designs. To prevent bias toward IRT methods, data were simulated both with and without IRT models. The results suggest that the difference between IRT kernel equating and IRT observed-score equating is minimal, both in the equated scores and in their standard errors. Using IRT models for presmoothing yielded smaller standard errors of equating than the log-linear presmoothing approach. When test data were generated using IRT models, the IRT-based methods proved less biased than log-linear kernel equating; when data were simulated without IRT models, log-linear kernel equating showed less bias. Overall, IRT kernel equating shows great promise for equating mixed-format tests.
Citations: 0
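Kernel equating, as compared in the abstract above, continuizes each discrete score distribution with a Gaussian kernel and then equates through the continuized CDFs. A deliberately simplified sketch of that idea (it omits the variance-preserving adjustment and the presmoothing step used operationally; the score points, probabilities, and bandwidth are illustrative):

```python
import math

def kernel_cdf(x, scores, probs, h=0.6):
    """Gaussian-kernel continuization of a discrete score distribution:
    each score point contributes a normal CDF centered at that score.
    Simplified: no variance-preserving a_K adjustment."""
    return sum(p * 0.5 * (1.0 + math.erf((x - s) / (h * math.sqrt(2.0))))
               for s, p in zip(scores, probs))

def equate(x, scores_x, probs_x, scores_y, probs_y, h=0.6):
    """Equipercentile-style equating through the continuized CDFs:
    find the Y score whose CDF matches F_X(x), by bisection."""
    target = kernel_cdf(x, scores_x, probs_x, h)
    lo, hi = min(scores_y) - 5.0, max(scores_y) + 5.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if kernel_cdf(mid, scores_y, probs_y, h) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Sanity check: equating a distribution to itself is the identity.
scores = [0.0, 1.0, 2.0, 3.0]
probs = [0.25, 0.25, 0.25, 0.25]
y_equiv = equate(1.5, scores, probs, scores, probs)
```

In the IRT variants studied in the paper, the score probabilities fed into the continuization step come from a fitted IRT model rather than from log-linear presmoothing of the observed frequencies.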