{"title":"Validity: An Integrated Approach to Test Score Meaning and Use, by Gregory J. Cizek, New York, Routledge, 2020, 190 pp., $55.00 (Paperback)","authors":"Tony Albano","doi":"10.1080/08957347.2023.2274570","DOIUrl":"https://doi.org/10.1080/08957347.2023.2274570","url":null,"abstract":"Published in Applied Measurement in Education (Ahead of Print, 2023)","PeriodicalId":51609,"journal":{"name":"Applied Measurement in Education","volume":null,"pages":null},"PeriodicalIF":1.5,"publicationDate":"2023-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138495010","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recruitment and Retention of Racially and Ethnically Minoritized Graduate Students in Educational Measurement Programs","authors":"Jennifer Randall, Joseph Rios","doi":"10.1080/08957347.2023.2274565","DOIUrl":"https://doi.org/10.1080/08957347.2023.2274565","url":null,"abstract":"Building on the extant literature on recruitment and retention within the field of STEM and undergraduate education, we sought to explore the recruitment and retention experiences of racially and e...","PeriodicalId":51609,"journal":{"name":"Applied Measurement in Education","volume":null,"pages":null},"PeriodicalIF":1.5,"publicationDate":"2023-11-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138495009","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detecting Item Parameter Drift in Small Sample Rasch Equating","authors":"Daniel Jurich, Chunyan Liu","doi":"10.1080/08957347.2023.2274567","DOIUrl":"https://doi.org/10.1080/08957347.2023.2274567","url":null,"abstract":"ABSTRACT Screening items for parameter drift helps protect against serious validity threats and ensures score comparability when equating forms. Although many high-stakes credentialing examinations operate with small sample sizes, few studies have investigated methods to detect drift in small sample equating. This study demonstrates that several newly researched drift detection strategies can improve equating accuracy under certain conditions with small samples where some anchor items display item parameter drift. Results showed that the recently proposed mINFIT and mOUTFIT methods, as well as the more conventional Robust-z, helped mitigate the adverse effects of drifting anchor items in conditions with higher drift levels or with more than 75 examinees. In contrast, the Logit Difference approach excessively removed invariant anchor items. The discussion provides recommendations on how practitioners working with small samples can use the results to make more informed decisions regarding item parameter drift. Disclosure statement: No potential conflict of interest was reported by the author(s). Supplementary material: Supplemental data for this article can be accessed online at https://doi.org/10.1080/08957347.2023.2274567 Notes: 1. In certain testing designs, some items may be reused as non-anchor items on future forms. Although IPD can occur on those items, we use the traditional IPD definition as specific to differential functioning in the items reused to serve as the equating anchor set. 2. In IRT, the old form anchor item parameter estimates can also come from a pre-calibrated bank. However, we use the old and new form terminology because the simulation design involves directly equating to a previous form. 3. For example, assume an item drifted in the 1.0 magnitude condition from b = 0 to 1 between Forms 1 and 2; this item would be treated as having a true b of 1.0 if selected for Form 3.","PeriodicalId":51609,"journal":{"name":"Applied Measurement in Education","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135392677","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian Logistic Regression: A New Method to Calibrate Pretest Items in Multistage Adaptive Testing","authors":"TsungHan Ho","doi":"10.1080/08957347.2023.2274572","DOIUrl":"https://doi.org/10.1080/08957347.2023.2274572","url":null,"abstract":"ABSTRACT An operational multistage adaptive test (MST) requires the development of a large item bank and a continuous effort to replenish that bank due to long-term concerns about test security and validity. New items should be pretested and linked to the item bank before being used operationally. Fluctuations in linking item volume in MST, however, call into question the quality of the link to the reference scale. In this study, various calibration/linking methods, along with a newly proposed Bayesian logistic regression (BLR) method, were evaluated against the test characteristic curve method using simulated MST response data, in terms of item parameter recovery. Results generated by the BLR method were promising due to its estimation stability and robustness across the studied conditions. The findings of the present study should help inform practitioners of the utility of implementing the pretest item calibration method in MST. Disclosure statement: No potential conflict of interest was reported by the author(s).","PeriodicalId":51609,"journal":{"name":"Applied Measurement in Education","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135392506","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Change in Engagement During Test Events: An Argument for Weighted Scoring?","authors":"Steven L. Wise, G. Gage Kingsbury, Meredith L. Langi","doi":"10.1080/08957347.2023.2274568","DOIUrl":"https://doi.org/10.1080/08957347.2023.2274568","url":null,"abstract":"ABSTRACT Recent research has provided evidence that performance change during a student’s test event can indicate the presence of test-taking disengagement. Meaningful performance change implies that some portions of the test event reflect assumed maximum performance better than others and, because disengagement tends to diminish performance, lower-performing portions are less likely to reflect maximum performance than higher-performing portions. This empirical study explored the use of differential weighting of item responses during scoring, with weighting schemes representing either declining or increasing performance. Results indicated that weighted scoring could substantially decrease the score distortion due to disengagement factors and thereby improve test score validity. The study findings support the use of scoring procedures that manage disengagement by adapting to student test-taking behavior. Disclosure statement: The authors have no known conflicts of interest to disclose. Notes: 1. What constitutes “construct-irrelevant” depends on how the target construct is conceptualized. For example, Borgonovi and Biecek (Citation2016) argued that academic endurance should be considered part of what PISA is intended to measure, because academic endurance is positively associated with a student’s success later in life. It is unclear, however, how universally this conceptualization is adopted by those interpreting PISA results. 2. Such comparisons between first and second half test performance require the assumption that the two halves are reasonably equivalent in terms of content representation if IRT-based scoring is used. 3. Half-test MLE standard errors in Math and Reading were around 4.2 and 4.8, respectively. 4. These intervals are not intended to correspond to the critical regions used to assess statistical significance under the AMC method. For example, classifying PD < -10 points as a large decline represents a less conservative criterion than the critical region used by Wise and Kingsbury (Citation2022).","PeriodicalId":51609,"journal":{"name":"Applied Measurement in Education","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2023-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135634751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Promise of Assessments That Advance Social Justice: An Indigenous Example","authors":"Pōhai Kūkea Shultz, Kerry S. Englert","doi":"10.1080/08957347.2023.2222031","DOIUrl":"https://doi.org/10.1080/08957347.2023.2222031","url":null,"abstract":"ABSTRACT In the United States, systemic racism against people of color was brought to the forefront of discourse throughout 2020, highlighting the ongoing inequities faced by intentionally marginalized groups in policing, health, and education. No community of color is immune from these inequities, and the activism in 2020 and the consequences of the pandemic have made systemic inequities impossible to ignore. In the Hawaiʻi context, social and racial injustice has resulted in cultural and language loss (among other markers of colonization), but it is within this loss that we can see the potential for the most significant evolution of assessment practices that champion self-determination and social justice. We illustrate how injustices can be addressed through the development of assessments centered in advocacy of and accountability to our communities of color. It is time for us to reimagine what self-determination and social justice in all assessment systems can and should look like.","PeriodicalId":51609,"journal":{"name":"Applied Measurement in Education","volume":null,"pages":null},"PeriodicalIF":1.5,"publicationDate":"2023-06-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45164648","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Standards Will Never Be Enough: A Racial Justice Extension","authors":"Mya Poe, M. Oliveri, N. Elliot","doi":"10.1080/08957347.2023.2214656","DOIUrl":"https://doi.org/10.1080/08957347.2023.2214656","url":null,"abstract":"ABSTRACT Since 1952, the Standards for Educational and Psychological Testing has provided criteria for developing and evaluating educational and psychological tests and testing practice. Yet, we argue that the foundations, operations, and applications in the Standards are no longer sufficient to meet the current U.S. testing demands for fairness for all test takers. We propose racial justice extensions as principled ways to extend the Standards, through intentional actions focused on race and targeted at educational policies, processes, and outcomes in specific settings. To inform these extensions, we focus on four social-justice concepts: intersectionality derived from Black Feminist Theory; responsibility derived from moral philosophy; disparate impact derived from legal reasoning; and situatedness derived from social learning theories. We demonstrate these extensions and concepts in action by applying them to case studies of nursing licensure and placement testing.","PeriodicalId":51609,"journal":{"name":"Applied Measurement in Education","volume":null,"pages":null},"PeriodicalIF":1.5,"publicationDate":"2023-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44476682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Shifting Educational Measurement from an Agent of Systemic Racism to an Anti-Racist Endeavor","authors":"Michaeline Russell","doi":"10.1080/08957347.2023.2217555","DOIUrl":"https://doi.org/10.1080/08957347.2023.2217555","url":null,"abstract":"ABSTRACT In recent years, issues of race, racism, and social justice have garnered increased attention across the nation. Although some aspects of social justice, particularly cultural sensitivity and test bias, have received similar attention within the field of educational measurement, a sharp focus on racism has eluded the field. This manuscript focuses narrowly on racism. Drawing on an expansive body of work in the field of sociology, several key theories of race and racism advanced over the past century are presented. Elements of these theories are then integrated into a model of systemic racism. This model is used to identify some of the ways in which educational measurement supports systemic racism as it operates in the United States. I then explore ways in which an anti-racist frame could be applied to combat the system of racism and reorient our work to support racial liberation.","PeriodicalId":51609,"journal":{"name":"Applied Measurement in Education","volume":null,"pages":null},"PeriodicalIF":1.5,"publicationDate":"2023-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47875297","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enacting a Process for Developing Culturally Relevant Classroom Assessments","authors":"Eowyn P. O’Dwyer, Jesse R. Sparks, Leslie Nabors Oláh","doi":"10.1080/08957347.2023.2214652","DOIUrl":"https://doi.org/10.1080/08957347.2023.2214652","url":null,"abstract":"ABSTRACT A critical aspect of the development of culturally relevant classroom assessments is the design of tasks that affirm students’ racial and ethnic identities and community cultural practices. This paper describes the process we followed to build a shared understanding of what culturally relevant assessments are, to pursue ways of bringing more diverse voices and perspectives into the development process to generate new ideas and further our understanding, and finally to integrate those understandings and findings into the design of scenario-based tasks (ETS Testlets). This paper describes our engagement with research literature and employee-led affinity groups, students, and external consultants. In synthesizing their advice and feedback, we identified five design principles that scenario-based assessment developers can incorporate into their own work. These principles are then applied to the development of a scenario-based assessment task. Finally, we reflect on our process and challenges faced to inform future advancements in the field.","PeriodicalId":51609,"journal":{"name":"Applied Measurement in Education","volume":null,"pages":null},"PeriodicalIF":1.5,"publicationDate":"2023-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49204043","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Applying a Culturally Responsive Pedagogical Framework to Design and Evaluate Classroom Performance-Based Assessments in Hawai‘i","authors":"Carla M. Evans","doi":"10.1080/08957347.2023.2214655","DOIUrl":"https://doi.org/10.1080/08957347.2023.2214655","url":null,"abstract":"ABSTRACT Previous writings focus on why centering assessment design around students’ cultural, social, and/or linguistic diversity is important and how performance-based assessment can support such aims. This article extends previous work by describing how a culturally responsive classroom assessment framework was created from a culturally responsive education (CRE) pedagogical framework. The goal of the framework was to guide the design and evaluation of curriculum-embedded, classroom performance assessments. Components discussed include: modification of evidence-centered design processes, teacher and/or student adaptation of construct-irrelevant aspects of task prompts, addition of cultural meaningfulness questions to think-alouds, and revision of task quality review protocols to promote CRE design features. Future research is needed to explore the limitations of the framework applied, and the extent to which students perceive that the classroom summative assessments, as designed, do indeed allow them to better show all they know and can do in ways related to their cultural, social, and/or linguistic identities.","PeriodicalId":51609,"journal":{"name":"Applied Measurement in Education","volume":null,"pages":null},"PeriodicalIF":1.5,"publicationDate":"2023-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46027666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}