{"title":"Using Evidence-Centered Design to Support the Development of Culturally and Linguistically Sensitive Collaborative Problem-Solving Assessments","authors":"M. Oliveri, René Lawless, R. Mislevy","doi":"10.1080/15305058.2018.1543308","DOIUrl":"https://doi.org/10.1080/15305058.2018.1543308","url":null,"abstract":"Collaborative problem solving (CPS) ranks among the top five most critical skills necessary for college graduates to meet workforce demands (Hart Research Associates, 2015). It is also deemed a critical skill for educational success (Beaver, 2013). It thus deserves more prominence in the suite of courses and subjects assessed in K-16. Such inclusion, however, presents the need for improvements in the conceptualization, design, and analysis of CPS, which challenges us to think differently about assessing the skills than the current focus given to assessing individuals’ substantive knowledge. In this article, we discuss an Evidence-Centered Design approach to assess CPS in a culturally and linguistically diverse educational environment. We demonstrate ways to consider a sociocognitive perspective to conceptualize and model possible linguistic and/or cultural differences between populations along key stages of assessment development including assessment conceptualization and design to help reduce possible construct-irrelevant differences when assessing complex constructs with diverse populations.","PeriodicalId":46615,"journal":{"name":"International Journal of Testing","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2019-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/15305058.2018.1543308","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44350922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Assessment of University Students’ Critical Thinking: Next Generation Performance Assessment","authors":"R. Shavelson, O. Zlatkin‐Troitschanskaia, K. Beck, Susanne Schmidt, Julián P. Mariño","doi":"10.1080/15305058.2018.1543309","DOIUrl":"https://doi.org/10.1080/15305058.2018.1543309","url":null,"abstract":"Following employers’ criticisms and recent societal developments, policymakers and educators have called for students to develop a range of generic skills such as critical thinking (“twenty-first century skills”). So far, such skills have typically been assessed by student self-reports or with multiple-choice tests. An alternative approach is criterion-sampling measurement. This approach leads to developing performance assessments using “criterion” tasks, which are drawn from real-world situations in which students are being educated, both within and across academic or professional domains. One current project, iPAL (The international Performance Assessment of Learning), consolidates previous research and focuses on the next generation performance assessments. In this paper, we present iPAL’s assessment framework and show how it guides the development of such performance assessments, exemplify these assessments with a concrete task, and provide preliminary evidence of its reliability and validity, which allows us to draw initial implications for further test design and development.","PeriodicalId":46615,"journal":{"name":"International Journal of Testing","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2019-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/15305058.2018.1543309","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48194695","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Examination of Different Methods of Setting Cutoff Values in Person Fit Research","authors":"A. Mousavi, Ying Cui, Todd Rogers","doi":"10.1080/15305058.2018.1464010","DOIUrl":"https://doi.org/10.1080/15305058.2018.1464010","url":null,"abstract":"This simulation study evaluates four different methods of setting cutoff values for person fit assessment, including (a) using fixed cutoff values either from theoretical distributions of person fit statistics, or arbitrarily chosen by the researchers in the literature; (b) using the specific percentile rank of empirical sampling distribution of person fit statistics from simulated fitting responses; (c) using bootstrap method to estimate cutoff values of empirical sampling distribution of person fit statistics from simulated fitting responses; and (d) using the p-value methods to identify misfitting responses conditional on ability levels. The Snijders' (2001), as an index with known theoretical distribution, van der Flier's U3 (1982) and Sijtsma's HT coefficient (1986), as indices with unknown theoretical distribution, were chosen. According to the simulation results, different methods of setting cutoff values tend to produce different levels of Type I error and detection rates, indicating it is critical to select an appropriate method for setting cutoff values in person fit research.","PeriodicalId":46615,"journal":{"name":"International Journal of Testing","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2019-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/15305058.2018.1464010","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48532510","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Comparison of the Relative Performance of Four IRT Models on Equating Passage-Based Tests","authors":"Kyung Yong Kim, Euijin Lim, Won‐Chan Lee","doi":"10.1080/15305058.2018.1530239","DOIUrl":"https://doi.org/10.1080/15305058.2018.1530239","url":null,"abstract":"For passage-based tests, items that belong to a common passage often violate the local independence assumption of unidimensional item response theory (UIRT). In this case, ignoring local item dependence (LID) and estimating item parameters using a UIRT model could be problematic because doing so might result in inaccurate parameter estimates, which, in turn, could impact the results of equating. Under the random groups design, the main purpose of this article was to compare the relative performance of the three-parameter logistic (3PL), graded response (GR), bifactor, and testlet models on equating passage-based tests when various degrees of LID were present due to passage. Simulation results showed that the testlet model produced the most accurate equating results, followed by the bifactor model. The 3PL model worked as well as the bifactor and testlet models when the degree of LID was low but returned less accurate equating results than the two multidimensional models as the degree of LID increased. Among the four models, the polytomous GR model provided the least accurate equating results.","PeriodicalId":46615,"journal":{"name":"International Journal of Testing","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2018-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/15305058.2018.1530239","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46453114","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Test Instructions Do Not Moderate the Indirect Effect of Perceived Test Importance on Test Performance in Low-Stakes Testing Contexts","authors":"S. Finney, Aaron J. Myers, C. Mathers","doi":"10.1080/15305058.2017.1396466","DOIUrl":"https://doi.org/10.1080/15305058.2017.1396466","url":null,"abstract":"Assessment specialists expend a great deal of energy to promote valid inferences from test scores gathered in low-stakes testing contexts. Given the indirect effect of perceived test importance on test performance via examinee effort, assessment practitioners have manipulated test instructions with the goal of increasing perceived test importance. Importantly, no studies have investigated the impact of test instructions on this indirect effect. In the current study, students were randomly assigned to one of three test instruction conditions intended to increase test relevance while keeping the test low-stakes to examinees. Test instructions did not impact average perceived test importance, examinee effort, or test performance. Furthermore, the indirect relationship between importance and performance via effort was not moderated by instructions. Thus, the effect of perceived test importance on test scores via expended effort appears consistent across different messages regarding the personal relevance of the test to examinees. The main implication for testing practice is that the effect of instructions may be negligible when reflective of authentic low-stakes test score use. Future studies should focus on uncovering instructions that increase the value of performance to the examinee yet remain truthful regarding score use.","PeriodicalId":46615,"journal":{"name":"International Journal of Testing","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2018-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/15305058.2017.1396466","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49293024","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Investigating the Reliability of the Sentence Verification Technique","authors":"Amanda M Marcotte, Francis Rick, C. Wells","doi":"10.1080/15305058.2018.1497636","DOIUrl":"https://doi.org/10.1080/15305058.2018.1497636","url":null,"abstract":"Reading comprehension plays an important role in achievement for all academic domains. The purpose of this study is to describe the sentence verification technique (SVT) (Royer, Hastings, & Hook, 1979) as an alternative method of assessing reading comprehension, which can be used with a variety of texts and across diverse populations and educational contexts. Additionally, this study adds a unique contribution to the extant literature on the SVT through an investigation of the precision of the instrument across proficiency levels. Data were gathered from a sample of 464 fourth-grade students from the Northeast region of the United States. Reliability was estimated using one, two, three, and four passage test forms. Two or three passages provided sufficient reliability. The conditional reliability analyses revealed that the SVT test scores were reliable for readers with average to below average proficiency, but did not provide reliable information for students who were very poor or strong readers.","PeriodicalId":46615,"journal":{"name":"International Journal of Testing","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2018-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/15305058.2018.1497636","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45868181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Item Parameter Drift in Context Questionnaires from International Large-Scale Assessments","authors":"HyeSun Lee, K. Geisinger","doi":"10.1080/15305058.2018.1481852","DOIUrl":"https://doi.org/10.1080/15305058.2018.1481852","url":null,"abstract":"The purpose of the current study was to examine the impact of item parameter drift (IPD) occurring in context questionnaires from an international large-scale assessment and determine the most appropriate way to address IPD. Focusing on the context of psychometric and educational research where scores from context questionnaires composed of polytomous items were employed for the classification of examinees, the current research investigated the impacts of IPD on the estimation of questionnaire scores and classification accuracy with five manipulated factors: the length of a questionnaire, the proportion of items exhibiting IPD, the direction and magnitude of IPD, and three decisions about IPD. The results indicated that the impact of IPD occurring in a short context questionnaire on the accuracy of score estimation and classification of examinees was substantial. The accuracy in classification considerably decreased especially at the lowest and highest categories of a trait. Unlike the recommendation from literature in educational testing, the current study demonstrated that keeping items exhibiting IPD and removing them only for transformation were appropriate when IPD occurred in relatively short context questionnaires. Using 2011 TIMSS data from Iran, an applied example demonstrated the application of provided guidance in making appropriate decisions about IPD.","PeriodicalId":46615,"journal":{"name":"International Journal of Testing","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2018-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/15305058.2018.1481852","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42801965","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Investigating the Comparability of Examination Difficulty Using Comparative Judgement and Rasch Modelling","authors":"Stephen D. Holmes, M. Meadows, I. Stockford, Qingping He","doi":"10.1080/15305058.2018.1486316","DOIUrl":"https://doi.org/10.1080/15305058.2018.1486316","url":null,"abstract":"The relationship of expected and actual difficulty of items on six mathematics question papers designed for 16-year olds in England was investigated through paired comparison using experts and testing with students. A variant of the Rasch model was applied to the comparison data to establish a scale of expected difficulty. In testing, the papers were taken by 2933 students using an equivalent-groups design, allowing the actual difficulty of the items to be placed on the same measurement scale. It was found that the expected difficulty derived using the comparative judgement approach and the actual difficulty derived from the test data was reasonably strongly correlated. This suggests that comparative judgement may be an effective way to investigate the comparability of difficulty of examinations. The approach could potentially be used as a proxy for pretesting high-stakes tests in situations where pretesting is not feasible due to reasons of security or other risks.","PeriodicalId":46615,"journal":{"name":"International Journal of Testing","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2018-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/15305058.2018.1486316","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45405533","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analyzing Job Analysis Data Using Mixture Rasch Models","authors":"Adam E. Wyse","doi":"10.1080/15305058.2018.1481853","DOIUrl":"https://doi.org/10.1080/15305058.2018.1481853","url":null,"abstract":"An important piece of validity evidence to support the use of credentialing exams comes from performing a job analysis of the profession. One common job analysis method is the task inventory method, where people working in the field are surveyed using rating scales about the tasks thought necessary to safely and competently perform the job. This article describes how mixture Rasch models can be used to analyze these data, and how results from these analyses can help to identify whether different groups of people may be responding to job tasks differently. Three examples from different credentialing programs illustrate scenarios that can be found when applying mixture Rasch models to job analysis data. Discussion of what these results may imply for the development of credentialing exams and other analyses of job analysis data is provided.","PeriodicalId":46615,"journal":{"name":"International Journal of Testing","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2018-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/15305058.2018.1481853","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47874147","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Polytomous Model of Cognitive Diagnostic Assessment for Graded Data","authors":"Dongbo Tu, Chanjin Zheng, Yan Cai, Xuliang Gao, Daxun Wang","doi":"10.1080/15305058.2017.1396465","DOIUrl":"https://doi.org/10.1080/15305058.2017.1396465","url":null,"abstract":"Pursuing the line of the difference models in IRT (Thissen & Steinberg, 1986), this article proposed a new cognitive diagnostic model for graded/polytomous data based on the deterministic input, noisy, and gate (Haertel, 1989; Junker & Sijtsma, 2001), which is named the DINA model for graded data (DINA-GD). We investigated the performance of a full Bayesian estimation of the proposed model. In the simulation, the classification accuracy and item recovery for the DINA-GD model were investigated. The results indicated that the proposed model had acceptable examinees' correct attribute classification rate and item parameter recovery. In addition, a real-data example was used to illustrate the application of this new model with the graded data or polytomously scored items.","PeriodicalId":46615,"journal":{"name":"International Journal of Testing","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2018-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/15305058.2017.1396465","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49274990","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}