{"title":"Transforming Assessment: The Impacts and Implications of Large Language Models and Generative AI","authors":"Jiangang Hao, Alina A. von Davier, Victoria Yaneva, Susan Lottridge, Matthias von Davier, Deborah J. Harris","doi":"10.1111/emip.12602","DOIUrl":"10.1111/emip.12602","url":null,"abstract":"<p>The remarkable strides in artificial intelligence (AI), exemplified by ChatGPT, have unveiled a wealth of opportunities and challenges in assessment. Applying cutting-edge large language models (LLMs) and generative AI to assessment holds great promise in boosting efficiency, mitigating bias, and facilitating customized evaluations. Conversely, these innovations raise significant concerns regarding validity, reliability, transparency, fairness, equity, and test security, necessitating careful thinking when applying them in assessments. In this article, we discuss the impacts and implications of LLMs and generative AI on critical dimensions of assessment with example use cases and call for a community effort to equip assessment professionals with the needed AI literacy to harness the potential effectively.</p>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"43 2","pages":"16-29"},"PeriodicalIF":2.0,"publicationDate":"2024-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140589684","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Revisiting the Usage of Alpha in Scale Evaluation: Effects of Scale Length and Sample Size","authors":"Leifeng Xiao, Kit-Tai Hau, Melissa Dan Wang","doi":"10.1111/emip.12604","DOIUrl":"10.1111/emip.12604","url":null,"abstract":"<p>Short scales are time-efficient for participants and cost-effective in research. However, researchers often mistakenly expect short scales to have the same reliability as long ones without considering the effect of scale length. We argue that applying a universal benchmark for alpha is problematic as the impact of low-quality items is greater on shorter scales. In this study, we proposed simple guidelines for item reduction using the “alpha-if-item-deleted” procedure in scale construction. An item can be removed if alpha increases or decreases by less than .02, especially for short scales. Conversely, an item should be retained if alpha decreases by more than .04 upon its removal. For reliability benchmarks, .80 is relatively safe in most conditions, but higher benchmarks are recommended for longer scales and smaller sample sizes. Supplementary analyses, including item content, face validity, and content coverage, are critical to ensure scale quality.</p>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"43 2","pages":"74-81"},"PeriodicalIF":2.0,"publicationDate":"2024-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/emip.12604","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140173185","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"What Mathematics Content Do Teachers Teach? Optimizing Measurement of Opportunities to Learn in the Classroom","authors":"Jiahui Zhang, William H. Schmidt","doi":"10.1111/emip.12603","DOIUrl":"10.1111/emip.12603","url":null,"abstract":"<p>Measuring opportunities to learn (OTL) is crucial for evaluating education quality and equity, but obtaining accurate and comprehensive OTL data at a large scale remains challenging. We attempt to address this issue by investigating measurement concerns in data collection and sampling. With the primary goal of estimating group-level OTLs for large populations of classrooms and the secondary goal of estimating classroom-level OTLs, we propose forming a teacher panel and using an online log-type survey to collect content and time data on sampled days throughout the school year. We compared various sampling schemes in a simulation study with real daily log data from 66 fourth-grade math teachers. The findings from this study indicate that sampling 1 day per week or 1 day every other week provided accurate group-level estimates, while sampling 1 day per week yielded satisfactory classroom-level estimates. The proposed approach aids in effectively monitoring large-scale classroom OTL.</p>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"43 2","pages":"40-54"},"PeriodicalIF":2.0,"publicationDate":"2024-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/emip.12603","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140097883","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reframing Research and Assessment Practices: Advancing an Antiracist and Anti-Ableist Research Agenda","authors":"Angela Johnson, Elizabeth Barker, Marcos Viveros Cespedes","doi":"10.1111/emip.12601","DOIUrl":"10.1111/emip.12601","url":null,"abstract":"<p>Educators and researchers strive to build policies and practices on data and evidence, especially on academic achievement scores. When assessment scores are inaccurate for specific student populations or when scores are inappropriately used, even data-driven decisions will be misinformed. To maximize the impact of the research-practice-policy collaborative, every stage of the assessment and research process needs to be critically interrogated. In this paper, we highlight the need to reframe assessment and research for multilingual learners, students with disabilities, and multilingual students with disabilities. We outline a framework that integrates three critical perspectives (QuantCrit, DisCrit, and critical multiculturalism) and discuss how this framework can be applied to assessment creation and research.</p>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"43 3","pages":"95-105"},"PeriodicalIF":2.7,"publicationDate":"2024-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140036508","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ITEMS Corner Update: Two Years of Changes to ITEMS","authors":"Brian C. Leventhal","doi":"10.1111/emip.12596","DOIUrl":"https://doi.org/10.1111/emip.12596","url":null,"abstract":"<p>This issue marks the beginning of the final year of my tenure as editor of the <i>Instructional Topics of Educational Measurement Series (ITEMS)</i>. Although I will save a comprehensive reflection until the last issue of the year, I will use this issue to provide an update on the two changes to ITEMS that were made over the past two years in addition to introducing the newest entry to the ITEMS digital library.</p><p>In 2022, I took over a newly created format and process for <i>ITEMS</i> modules—digital interactive teaching modules as opposed to the traditional PDF format. Modules were hosted on a learning management system (LMS), shifting focus from a journal publication to a modernized teaching-focused platform. While the LMS provided ample opportunity for expansion and learning, its form was challenging to navigate given its unique look and feel as a standalone website, separate from the then recently revitalized modern NCME website. Thus, my first significant change to <i>ITEMS</i> was to migrate the <i>ITEMS</i> portal LMS to the NCME website. Learners now go straight to the NCME website and navigate directly to the ITEMS portal without requiring a unique username and password. This makes the <i>ITEMS</i> module library more accessible and makes the NCME website the go-to destination for professional development in educational measurement.</p><p>Prior to 2022, specialized software was necessary to produce digital modules. This software allowed for the inclusion of interactive and nonlinear learning. This process resulted in beautiful, intricate, and pleasing modules for learning, but was complex and costly for development. To continue the process in place would have required producing these modules as a full-time job with a staff—a true testament to the work of the previous editor André Rupp and other volunteers. Fortunately, the <i>ITEMS</i> portal being hosted on the NCME website offered an opportunity to modify the development process while keeping an interactive component with a modernized look and feel.</p><p>I developed a new comprehensive system for creating <i>ITEMS</i> modules. Authors now develop content for four to five sections using pre-made PowerPoint templates, a familiar software to many. Once completed, I insert animations to align with timing of the recorded audio and export them as videos. These videos are hosted on the NCME website and can be downloaded for offline viewing (an extra benefit). Learners interact directly with the videos and content on the website, where module sections may be viewed in any order. In addition to video content, interactive activities that exemplify syntax or case studies are developed to assist with learning. Finally, interactive selected response learning checks are made, allowing learners to check their understanding as they traverse through the module. Datasets, syntax, and other files are hosted on the module page for easy download.</p><p>The <i>ITEMS</i> development process is not l","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"43 1","pages":"96"},"PeriodicalIF":2.0,"publicationDate":"2024-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/emip.12596","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139987487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the Cover: High School Coursetaking Sequence Clusters and Postsecondary Enrollment","authors":"Yuan-Ling Liaw","doi":"10.1111/emip.12597","DOIUrl":"https://doi.org/10.1111/emip.12597","url":null,"abstract":"","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"43 1","pages":"4"},"PeriodicalIF":2.0,"publicationDate":"2024-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139987388","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Digital Module 35: Through-Year Assessment","authors":"Nathan Dadey, Brian Gong, Yun-Kyung Kim, Edynn Sato","doi":"10.1111/emip.12595","DOIUrl":"https://doi.org/10.1111/emip.12595","url":null,"abstract":"<div><section><h3>Module Abstract</h3><p><i>Through-year assessments</i> are assessments that are administered in multiple parts and at different times over the course of a school year that also produce summative scores that can be used with state accountability systems (Lorié et al., 2021; Dadey & Gong, 2023). These assessments are alternatively known as instructionally embedded, through-course, or periodic assessments. There are a number of possible through-year assessment models, and they have recently been the subject of much policy interest as they have the potential to inform subsequent instruction, be more closely aligned with and responsive to curricula and instruction, provide more proximal measures of learning, and be a more sensitive measure of student progress or growth than typical year-end summative assessments (Clark & Karvonen, 2021; Gong, 2021; NWEA, 2021; Wise, 2011). More research is needed, however, to substantiate these potential uses.</p></section></div>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"43 1","pages":"97-98"},"PeriodicalIF":2.0,"publicationDate":"2024-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139987433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Workflow for Minimizing Errors in Template-Based Automated Item-Generation Development","authors":"Yanyan Fu","doi":"10.1111/emip.12600","DOIUrl":"10.1111/emip.12600","url":null,"abstract":"<p>The template-based automated item-generation (TAIG) approach that involves template creation, item generation, item selection, field-testing, and evaluation has more steps than the traditional item development method. Consequentially, there is more margin for error in this process, and any template errors can be cascaded to the generated items. Therefore, it is essential to eliminate the source of errors and ensure the quality of the template so items can be problem-free. The article introduces a process to reduce template errors at the early stage of template development, minimize the impact of template errors on generated items, and increase the survival rates of generated items. The article also discusses a statistical method to establish confidence in the quality of the template by systematically examining the quality of the generated items. The proposed method can reduce the review process for some items generated from a template.</p>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"43 2","pages":"30-39"},"PeriodicalIF":2.0,"publicationDate":"2024-02-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139760582","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The University of California Was Wrong to Abolish the SAT: Admissions When Affirmative Action Was Banned","authors":"Donald Wittman","doi":"10.1111/emip.12598","DOIUrl":"10.1111/emip.12598","url":null,"abstract":"<p>I study student characteristics and academic performance at the University of California, where consideration of an applicant's ethnicity has been banned since 1996 and SAT scores were used in admitting students to the university until fall 2021. I show the following: (1) SAT scores were more important than high school grades in predicting first-year university GPA; (2) the use of SAT scores alone or with high school grades in determining admission is biased in favor of admitting underrepresented minorities and students who are socioeconomically disadvantaged; (3) SAT scores are more important and high school grades are less important in predicting GPA for underrepresented minorities and/or those students from low-income families than they are for those students who are white and/or from high-income families; and (4) the University of California found ways to admit a significant number of underrepresented minorities despite many of them having low SAT scores.</p>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"43 2","pages":"55-63"},"PeriodicalIF":2.0,"publicationDate":"2024-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/emip.12598","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139771065","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}