International Journal of Testing最新文献_第10页

ITC Guidelines for Translating and Adapting Tests (Second Edition) ITC翻译和改编测试指南（第二版）

IF 1.7

International Journal of Testing Pub Date : 2018-04-03 DOI: 10.1080/15305058.2017.1398166

引用次数: 403

Detecting Curvilinear Relationships: A Comparison of Scoring Approaches Based on Different Item Response Models 检测曲线关系：基于不同项目反应模型的评分方法比较

IF 1.7

International Journal of Testing Pub Date : 2018-04-03 DOI: 10.1080/15305058.2017.1345913

Mengyang Cao, Q. Song, L. Tay

{"title":"Detecting Curvilinear Relationships: A Comparison of Scoring Approaches Based on Different Item Response Models","authors":"Mengyang Cao, Q. Song, L. Tay","doi":"10.1080/15305058.2017.1345913","DOIUrl":"https://doi.org/10.1080/15305058.2017.1345913","url":null,"abstract":"There is a growing use of noncognitive assessments around the world, and recent research has posited an ideal point response process underlying such measures. A critical issue is whether the typical use of dominance approaches (e.g., average scores, factor analysis, and the Samejima's graded response model) in scoring such measures is adequate. This study examined the performance of an ideal point scoring approach (e.g., the generalized graded unfolding model) as compared to the typical dominance scoring approaches in detecting curvilinear relationships between scored trait and external variable. Simulation results showed that when data followed the ideal point model, the ideal point approach generally exhibited more power and provided more accurate estimates of curvilinear effects than the dominance approaches. No substantial difference was found between ideal point and dominance scoring approaches in terms of Type I error rate and bias across different sample sizes and scale lengths, although skewness in the distribution of trait and external variable can potentially reduce statistical power. For dominance data, the ideal point scoring approach exhibited convergence problems in most conditions and failed to perform as well as the dominance scoring approaches. Practical implications for scoring responses to Likert-type surveys to examine curvilinear effects are discussed.","PeriodicalId":46615,"journal":{"name":"International Journal of Testing","volume":"18 1","pages":"178 - 205"},"PeriodicalIF":1.7,"publicationDate":"2018-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/15305058.2017.1345913","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49612645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 6

Response Time Based Nonparametric Kullback-Leibler Divergence Measure for Detecting Aberrant Test-Taking Behavior 基于响应时间的非参数Kullback-Leibler散度测度检测异常应试行为

IF 1.7

International Journal of Testing Pub Date : 2018-02-28 DOI: 10.1080/15305058.2018.1429446

K. Man, Jeffery R. Harring, Yunbo Ouyang, Sarah L. Thomas

引用次数: 19

FIPC Linking Across Multidimensional Test Forms: Effects of Confounding Difficulty within Dimensions 跨多维测试形式的FIPC连接:维度内混淆难度的影响

IF 1.7

International Journal of Testing Pub Date : 2018-02-27 DOI: 10.1080/15305058.2018.1428980

S. Kim, Ki Cole, M. Mwavita

{"title":"FIPC Linking Across Multidimensional Test Forms: Effects of Confounding Difficulty within Dimensions","authors":"S. Kim, Ki Cole, M. Mwavita","doi":"10.1080/15305058.2018.1428980","DOIUrl":"https://doi.org/10.1080/15305058.2018.1428980","url":null,"abstract":"This study investigated the effects of linking potentially multidimensional test forms using the fixed item parameter calibration. Forms had equal or unequal total test difficulty with and without confounding difficulty. The mean square errors and bias of estimated item and ability parameters were compared across the various confounding tests. The estimated discrimination parameters were influenced by the levels of correlation between dimensions. The mean square errors (MSEs) of the average of the true discrimination parameters with the estimated value were smallest when the correlation equaled 0; however, the MSEs of the multidimensional discrimination parameter were smallest when the correlation was larger than 0. The estimated difficulty parameters were highly affected by different amount of confounding difficulty within dimensions. Furthermore, the MSEs of the average of the true ability parameters on the first and second dimensions with the estimated ability were smaller than those from the ability parameter on each dimension for all conditions. The pattern varied according to the number of common items, and the measures of MSE and squared bias were relatively consistent across forms at the same level of correlation, except for the condition where the correlation was 0 and the number of common items was 8.","PeriodicalId":46615,"journal":{"name":"International Journal of Testing","volume":"18 1","pages":"323 - 345"},"PeriodicalIF":1.7,"publicationDate":"2018-02-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/15305058.2018.1428980","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"59952903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 2

Effects of Situational Judgment Test Format on Reliability and Validity 情境判断测试格式对信度和有效性的影响

IF 1.7

International Journal of Testing Pub Date : 2018-02-23 DOI: 10.1080/15305058.2018.1428981

Michelle P. Martín‐Raugh, Cristina Anguiano-Carrsaco, Teresa Jackson, Meghan W. Brenneman, Lauren M. Carney, Patrick V. Barnwell, Jonathan F. Kochert

引用次数: 1

Adding Value to Second-Language Listening and Reading Subscores: Using a Score Augmentation Approach 增加第二语言听力和阅读分数的价值:使用分数增加法

IF 1.7

International Journal of Testing Pub Date : 2018-01-04 DOI: 10.1080/15305058.2017.1407766

S. Papageorgiou, Ikkyu Choi

{"title":"Adding Value to Second-Language Listening and Reading Subscores: Using a Score Augmentation Approach","authors":"S. Papageorgiou, Ikkyu Choi","doi":"10.1080/15305058.2017.1407766","DOIUrl":"https://doi.org/10.1080/15305058.2017.1407766","url":null,"abstract":"This study examined whether reporting subscores for groups of items within a test section assessing a second-language modality (specifically reading or listening comprehension) added value from a measurement perspective to the information already provided by the section scores. We analyzed the responses of 116,489 test takers to reading and listening items from operational administrations of two large-scale international tests of English as a foreign language. To “strengthen” the reliability of the subscores, and thus improve their added value, we applied a score augmentation method (Haberman, 2008). In doing so, our aim was to examine whether reporting augmented subscores for specific groups of reading and listening items could improve the added value of these subscores and consequently justify providing more fine-grained information about test taker performance. Our analysis indicated that in general, there was lack of support for reporting subscores from a psychometric perspective, and that score augmentation marginally improved the added value of the subscores. We discuss several implications of our findings for test developers wishing to report more fine-grained information about test performance. We conclude by arguing that research on how to best report such refined feedback should remain the focus of future efforts related to second-language proficiency tests.","PeriodicalId":46615,"journal":{"name":"International Journal of Testing","volume":"18 1","pages":"207 - 230"},"PeriodicalIF":1.7,"publicationDate":"2018-01-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/15305058.2017.1407766","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43463605","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 4

Exploring a Source of Uneven Score Equity across the Test Score Range 探索考试成绩不均衡的根源

IF 1.7

International Journal of Testing Pub Date : 2018-01-02 DOI: 10.1080/15305058.2017.1396463

A. Huggins-Manley, Yuxi Qiu, Randall D. Penfield

引用次数: 1

Spurious Latent Class Problem in the Mixed Rasch Model: A Comparison of Three Maximum Likelihood Estimation Methods under Different Ability Distributions 混合Rasch模型中的伪潜在类问题：不同能力分布下三种最大似然估计方法的比较

IF 1.7

International Journal of Testing Pub Date : 2018-01-02 DOI: 10.1080/15305058.2017.1312408

S. Şen

引用次数: 6

The Influence of Rater Effects in Training Sets on the Psychometric Quality of Automated Scoring for Writing Assessments 训练集中评分者效应对写作评估自动评分心理测量质量的影响

IF 1.7

International Journal of Testing Pub Date : 2018-01-02 DOI: 10.1080/15305058.2017.1361426

Stefanie A. Wind, E. Wolfe, G. Engelhard, P. Foltz, Mark Rosenstein

引用次数: 11

Differential Distractor Functioning as a Method for Explaining DIF: The Case of a National Admissions Test in Saudi Arabia 差异分散因子作为解释DIF的一种方法——以沙特阿拉伯的一次全国招生考试为例

IF 1.7

International Journal of Testing Pub Date : 2018-01-02 DOI: 10.1080/15305058.2017.1345914

I. Tsaousis, G. Sideridis, Fahad Al-Saawi

{"title":"Differential Distractor Functioning as a Method for Explaining DIF: The Case of a National Admissions Test in Saudi Arabia","authors":"I. Tsaousis, G. Sideridis, Fahad Al-Saawi","doi":"10.1080/15305058.2017.1345914","DOIUrl":"https://doi.org/10.1080/15305058.2017.1345914","url":null,"abstract":"The aim of the present study was to examine Differential Distractor Functioning (DDF) as a means of improving the quality of a measure through understanding biased responses across groups. A DDF analysis could shed light on the potential sources of construct-irrelevant variance by examining whether the differential selection of incorrect choices (distractors), attracts various groups in different ways. To examine possible DDF effects, a method introduced by Penfield (2008, 2010a), based on odds ratio estimators, was utilized. Items from the Chemistry sub-scale of the Standard Achievement Admission Test (SAAT) in Saudi Arabia were used as an example. Statistical evidence for differential item functioning (DIF_ existed for five items, at either moderate or strong levels. Particularly three items (i.e., items 45, 54, and 61), reached category B levels (i.e., moderate DIF), and two items (items 51and 60) category C levels (strong DIF) based on Educational Testing Service guidelines. These items were then examined more closely for DDF in an attempt to potentially understand the causes of DIF and group biased responses. The manuscript concludes with a series of remedial actions, based on distractor-relevant information, with the goal of improving the psychometric properties of an instrument under study.","PeriodicalId":46615,"journal":{"name":"International Journal of Testing","volume":"18 1","pages":"1 - 26"},"PeriodicalIF":1.7,"publicationDate":"2018-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/15305058.2017.1345914","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47606507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 9