{"title":"Migration Background in PISA’s Measure of Social Belonging: Using a Diffractive Lens to Interpret Multi-Method DIF Studies","authors":"Nathan D. Roberson, B. Zumbo","doi":"10.1080/15305058.2019.1632316","DOIUrl":"https://doi.org/10.1080/15305058.2019.1632316","url":null,"abstract":"This paper investigates measurement invariance as it relates to migration background using the Program for International Student Assessment measure of social belonging. We explore how the use of two measurement invariance techniques provide insights into differential item functioning using the alignment method in conjunction with logistic regression in the case of multiple group comparisons. Social belonging is a central human need, and we argue that immigration background is important factor when considering how an individual interacts with a survey/items about belonging. Overall results from both the alignment method and ordinal logistic regression, interpreted through a diffractive lens, suggest that it is inappropriate to treat peoples of four different immigration backgrounds within the countries analyzed as exchangeable groups.","PeriodicalId":46615,"journal":{"name":"International Journal of Testing","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2019-07-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/15305058.2019.1632316","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44180342","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic Multistage Testing: A Highly Efficient and Regulated Adaptive Testing Method","authors":"Xiao Luo, Xinrui Wang","doi":"10.1080/15305058.2019.1621871","DOIUrl":"https://doi.org/10.1080/15305058.2019.1621871","url":null,"abstract":"This study introduced dynamic multistage testing (dy-MST) as an improvement to existing adaptive testing methods. dy-MST combines the advantages of computerized adaptive testing (CAT) and computerized adaptive multistage testing (ca-MST) to create a highly efficient and regulated adaptive testing method. In the test construction phase, multistage panels are assembled using similar design principles and assembly techniques with ca-MST. In the administration phase, items are adaptively administered from a dynamic interim pool. A large-scale simulation study was conducted to evaluate the merits of dy-MST, and it found that dy-MST significantly reduced test length while maintaining the identical classification accuracy with the full-length tests and meeting all content requirements effectively. Psychometrically, the testing efficiency in dy-MST was comparable to CAT. Operationally, dy-MST allows for holistic pre-administration management of test content directly at the test level. Thus, dy-MST is deemed appropriate for delivering adaptive tests with high efficiency and well-controlled content.","PeriodicalId":46615,"journal":{"name":"International Journal of Testing","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2019-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/15305058.2019.1621871","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48949313","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Comparison of Methods for Detecting Examinee Preknowledge of Items","authors":"Xi Wang, Yang Liu, F. Robin, Hongwen Guo","doi":"10.1080/15305058.2019.1610886","DOIUrl":"https://doi.org/10.1080/15305058.2019.1610886","url":null,"abstract":"In an on-demand testing program, some items are repeatedly used across test administrations. This poses a risk to test security. In this study, we considered a scenario wherein a test was divided into two subsets: one consisting of secure items and the other consisting of possibly compromised items. In a simulation study of multistage adaptive testing, we used three methods to detect item preknowledge: a predictive checking method (PCM), a likelihood ratio test (LRT), and an adapted Kullback–Leibler divergence (KLD-A) test. We manipulated four factors: the proportion of compromised items, the stage of adaptive testing at which preknowledge was present, item-parameter estimation error, and the information contained in secure items. The type I error results indicated that the LRT and PCM methods are favored over the KLD-A method because the KLD-A can experience large inflated type I error in many conditions. In regard to power, the LRT and PCM methods displayed a wide range of results, generally from 0.2 to 0.8, depending on the amount of preknowledge and the stage of adaptive testing at which the preknowledge was present.","PeriodicalId":46615,"journal":{"name":"International Journal of Testing","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2019-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/15305058.2019.1610886","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45510975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Diagnostic Classification Models: Recent Developments, Practical Issues, and Prospects","authors":"Hamdollah Ravand, Purya Baghaei","doi":"10.1080/15305058.2019.1588278","DOIUrl":"https://doi.org/10.1080/15305058.2019.1588278","url":null,"abstract":"More than three decades after their introduction, diagnostic classification models (DCM) do not seem to have been implemented in educational systems for the purposes they were devised. Most DCM research is either methodological for model development and refinement or retrofitting to existing nondiagnostic tests and, in the latter case, basically for model demonstration or constructs identification. DCMs have rarely been used to develop diagnostic assessment right from the start with the purpose of identifying individuals’ strengths and weaknesses (referred to as true applications in this study). In this article, we give an introduction to DCMs and their latest developments along with guidelines on how to proceed to employ DCMs to develop a diagnostic test or retrofit to a nondiagnostic assessment. Finally, we enumerate the reasons why we believe DCMs have not become fully operational in educational systems and suggest some advice to make their advent smooth and quick.","PeriodicalId":46615,"journal":{"name":"International Journal of Testing","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2019-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/15305058.2019.1588278","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44424368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Leveraging Evidence-Centered Design to Develop Assessments of Computational Thinking Practices","authors":"E. Snow, Daisy W. Rutstein, Satabdi Basu, M. Bienkowski, H. Everson","doi":"10.1080/15305058.2018.1543311","DOIUrl":"https://doi.org/10.1080/15305058.2018.1543311","url":null,"abstract":"Computational thinking is a core skill in computer science that has become a focus of instruction in primary and secondary education worldwide. Since 2010, researchers have leveraged Evidence-Centered Design (ECD) methods to develop measures of students’ Computational Thinking (CT) practices. This article describes how ECD was used to develop CT assessments for primary students in Hong Kong and secondary students in the United States. We demonstrate how leveraging ECD yields a principled design for developing assessments of hard-to-assess constructs and, as part of the process, creates reusable artifacts—design patterns and task templates—that inform the design of other, related assessments. Leveraging ECD, as described in this article, represents a principled approach to measuring students’ computational thinking practices, and situates the approach in emerging computational thinking curricula and programs to emphasize the links between curricula and assessment design.","PeriodicalId":46615,"journal":{"name":"International Journal of Testing","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2019-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/15305058.2018.1543311","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41919857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using Performance Tasks within Simulated Environments to Assess Teachers’ Ability to Engage in Coordinated, Accumulated, and Dynamic (CAD) Competencies","authors":"Jamie N. Mikeska, Heather Howell, C. Straub","doi":"10.1080/15305058.2018.1551223","DOIUrl":"https://doi.org/10.1080/15305058.2018.1551223","url":null,"abstract":"The demand for assessments of competencies that require complex human interaction is steadily growing as we move toward a focus on twenty-first century skills. As assessment designers aim to address this demand, we argue for the importance of a common language to understand and attend to the key challenges implicated in designing task situations to assess such competencies. We offer the descriptors coordinated, accumulated, and dynamic (CAD) as a way of understanding the nature of these competencies and the considerations involved in measuring them. We use an example performance task designed to measure teacher competency in leading an argumentation-focused discussion in elementary science to illustrate what we mean by the coordinated, accumulated, and dynamic nature of this construct and the challenges assessment designers face when developing performance tasks to measure this construct. Our work is unique in that we designed these performance tasks to be deployed within a digital simulated classroom environment that includes simulated students controlled by a human agent, known as the simulation specialist. We illustrate what we mean by these three descriptors and discuss how we addressed various considerations in our task design to assess elementary science teachers’ ability to facilitate argumentation-focused discussions.","PeriodicalId":46615,"journal":{"name":"International Journal of Testing","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2019-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/15305058.2018.1551223","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45227236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Introduction to “Challenges and Opportunities in the Design of ‘Next-Generation Assessments of 21st Century Skills’” Special Issue","authors":"M. Oliveri, R. Mislevy","doi":"10.1080/15305058.2019.1608551","DOIUrl":"https://doi.org/10.1080/15305058.2019.1608551","url":null,"abstract":"We are pleased to introduce this special issue of the International Journal of Testing (IJT), on the theme “Challenges and Opportunities in the Design of ‘Next-Generation Assessments of 21 Century Skills.’” Our call elicited manuscripts related to evidence-based models or tools that facilitate the scalability of the design, development, and implementation of new forms of assessment. The articles sought to address topics beyond familiar tools and processes, such as automated scoring, in order to consider issues focusing on assessment architecture and assessment engineering models, with simulated learning and performance contexts, new item types, and steps taken to ensure reliability and validity. The issue’s aims are to enrich our understanding of what has worked well, why, and lessons learned, in order to strengthen future conceptualization and design of next-generation assessments (NGAs). We received a number of submissions, which do just that. The five pieces that constitute this issue were selected not only for their individual contributions but also because collectively, they illustrate broader principles and complement each other in their emphases. The articles illustrate lessons learned in current applications and provide insights to guide implementation in future extensions. Next, we offer thoughts on the challenges and opportunities stated in the call and the role of principled frameworks for the design of NGAs. A good place to begin a discussion of assessment design is Messick’s (1994) three-sentence description of the backbone of the underlying assessment argument:","PeriodicalId":46615,"journal":{"name":"International Journal of Testing","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2019-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/15305058.2019.1608551","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46381175","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Use of Evidence-Centered Design to Develop Learning Maps-Based Assessments","authors":"A. Clark, Meagan Karvonen","doi":"10.1080/15305058.2018.1543310","DOIUrl":"https://doi.org/10.1080/15305058.2018.1543310","url":null,"abstract":"Evidence-based approaches to assessment design, development, and administration provide a strong foundation for an assessment’s validity argument but can be time consuming, resource intensive, and complex to implement. This article describes an evidence-based approach used for one assessment that addresses these challenges. Evidence-centered design principles were applied to create a task template to support test development for a new, instructionally embedded, large-scale alternate assessment system used for accountability purposes in 18 US states. Example evidence from the validity argument is presented to evaluate the effectiveness of the template as an evidence-based method for test development. Lessons learned, including strengths and challenges, are shared to inform test-development efforts for other programs.","PeriodicalId":46615,"journal":{"name":"International Journal of Testing","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2019-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/15305058.2018.1543310","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41489319","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Application of Ontologies for Assessing Collaborative Problem Solving Skills","authors":"Jessica Andrews-Todd, Deirdre Kerr","doi":"10.1080/15305058.2019.1573823","DOIUrl":"https://doi.org/10.1080/15305058.2019.1573823","url":null,"abstract":"Abstract Collaborative problem solving (CPS) has been deemed a critical twenty-first century competency for a variety of contexts. However, less attention has been given to work aimed at the assessment and acquisition of such capabilities. Recently large scale efforts have been devoted toward assessing CPS skills, but there are no agreed upon guiding principles for assessment of this complex construct, particularly for assessment in digital performance situations. There are notable challenges in conceptualizing the complex construct and extracting evidence of CPS skills from large streams of data in digital contexts such as games and simulations. In the current paper, we discuss how the in-task assessment framework (I-TAF), a framework informed by evidence-centered design, can provide guiding principles for the assessment of CPS in these contexts. We give specific attention to one aspect of I-TAF, ontologies, and describe how they can be used to instantiate the student model in evidence-centered design which lays out what we wish to measure in a principled way. We further discuss how ontologies can serve as an anchor representation for other components of assessment such as scoring rubrics, evidence identification, and task design.","PeriodicalId":46615,"journal":{"name":"International Journal of Testing","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2019-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/15305058.2019.1573823","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46252638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluating a Technology-Based Assessment (TBA) to Measure Teachers’ Action-Related and Reflective Skills","authors":"O. Zlatkin‐Troitschanskaia, Christiane Kuhn, S. Brückner, Jacqueline P. Leighton","doi":"10.1080/15305058.2019.1586377","DOIUrl":"https://doi.org/10.1080/15305058.2019.1586377","url":null,"abstract":"Teaching performance can be assessed validly only if the assessment involves an appropriate, authentic representation of real-life teaching practices. Different skills interact in coordinating teachers’ actions in different classroom situations. Based on the evidence-centered design model, we developed a technology-based assessment framework that enables differentiation between two essential teaching actions: action-related skills and reflective skills. Action-related skills are necessary to handle specific subject-related situations during instruction. Reflective skills are necessary to prepare and evaluate specific situations in pre- and postinstructional phases. In this article, we present the newly developed technology-based assessment to validly measure teaching performance, and we discuss validity evidence from cognitive interviews with teachers (novices and experts) using the think-aloud method, which indicates that the test takers’ respective mental processes when solving action-related skills tasks are consistent with the theoretically assumed knowledge and skill components and depend on the different levels of teaching expertise.","PeriodicalId":46615,"journal":{"name":"International Journal of Testing","volume":null,"pages":null},"PeriodicalIF":1.7,"publicationDate":"2019-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1080/15305058.2019.1586377","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49063537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}