{"title":"Generalizability Theory Approach to Analyzing Automated-Item Generated Test Forms","authors":"Stella Y. Kim, Sungyeun Kim","doi":"10.1111/emip.12671","DOIUrl":"https://doi.org/10.1111/emip.12671","url":null,"abstract":"<p>This study presents several multivariate Generalizability theory designs for analyzing automatic item-generated (AIG) based test forms. The study used real data to illustrate the analysis procedure and discuss practical considerations. We collected the data from two groups of students, each group receiving a different form generated by AIG. A total of 74 students participated in this study and responded to AIG-based test forms. Then, we analyzed the data using four distinct designs based on the data collection design, and conceptualization of true scores and measurement conditions over hypothetical replications. This study also examined the theoretical relationships among the four data collection designs and highlighted the potential impact of confounding between item templates and item clones.</p>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"44 2","pages":"20-31"},"PeriodicalIF":2.7,"publicationDate":"2025-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144118008","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Applications and Modeling of Keystroke Logs in Writing Assessments","authors":"Mo Zhang, Paul Deane, Andrew Hoang, Hongwen Guo, Chen Li","doi":"10.1111/emip.12668","DOIUrl":"https://doi.org/10.1111/emip.12668","url":null,"abstract":"<p>In this paper, we describe two empirical studies that demonstrate the application and modeling of keystroke logs in writing assessments. We illustrate two different approaches of modeling differences in writing processes: analysis of mean differences in handcrafted theory-driven features and use of large language models to identify stable personal characteristics. In the first study, we examined the effects of test environment on writing characteristics: at-home versus in-center, using features extracted from keystroke logs. In a second study, we explored ways to measure stable personal characteristics and traits. As opposed to feature engineering that can be difficult to scale, raw keystroke logs were used as input in the second study, and large language models were developed to infer latent relations in the data. Implications, limitations, and future research directions are also discussed.</p>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"44 2","pages":"5-19"},"PeriodicalIF":2.7,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144117931","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Digital Module 37: Introduction to Item Response Tree (IRTree) Models","authors":"Nana Kim, Jiayi Deng, Yun Leng Wong","doi":"10.1111/emip.12665","DOIUrl":"https://doi.org/10.1111/emip.12665","url":null,"abstract":"<div>\u0000 \u0000 <section>\u0000 \u0000 <h3> Module Abstract</h3>\u0000 \u0000 <p>Item response tree (IRTree) models, an item response modeling approach that incorporates a tree structure, have become a popular method for many applications in measurement. IRTree models characterize the underlying response processes using a decision tree structure, where the internal decision outcome at each node is parameterized with an item response theory (IRT) model. Such models provide a flexible way of investigating and modeling underlying response processes, which can be useful for examining sources of individual differences in measurement and addressing measurement issues that traditional IRT models cannot deal with. In this module, we discuss the conceptual framework of IRTree models and demonstrate examples of their applications in the context of both cognitive and noncognitive assessments. We also introduce some possible extensions of the model and provide a demonstration of an example data analysis in R.</p>\u0000 </section>\u0000 </div>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"44 1","pages":"109-110"},"PeriodicalIF":2.7,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143423609","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the Cover: Unraveling Reading Recognition Trajectories: Classifying Student Development through Growth Mixture Modeling","authors":"Yuan-Ling Liaw","doi":"10.1111/emip.12667","DOIUrl":"https://doi.org/10.1111/emip.12667","url":null,"abstract":"<p>The cover of this issue features “<i>Unraveling Reading Recognition Trajectories: Classifying Student Development through Growth Mixture Modeling</i>” by Xingyao Xiao and Sophia Rabe-Hesketh from the University of California, Berkeley. Using advanced Bayesian growth mixture modeling, their research examines how reading recognition develops between ages 6 and 14, identifying three distinct patterns of growth. This study provides a detailed and nuanced understanding of how students’ reading abilities progress over time.</p><p>Xiao and Rabe-Hesketh illustrated their findings using a multiplot visualization. It combines model-implied class-specific mean trajectories, a shaded 50% mid-range, and box-plots of observed reading scores, effectively highlighting the variability in reading progress among different learner groups. By juxtaposing observed data with model predictions, the visualization clearly depicts diverse growth patterns. Additionally, it emphasizes the variance and covariance of random effects, offering valuable insights often overlooked in similar analyses.</p><p>The three-class model described by Xiao and Rabe-Hesketh effectively explains different patterns of student growth. The first group, termed the “Early Bloomers,” comprises about 14% of the population who start with strong reading abilities and steadily improve. By age six, they show high reading scores and greater variability in growth trajectories compared to other groups. Xiao and Rabe-Hesketh note, “These students exhibit greater variability in growth curves at age six, with an 88% likelihood for those deviating 2 standard deviations below or above the mean to stray from the average growth rate.” This highlights their potential for early reading success.</p><p>The “Rapid Catch-Up Learners” represent 35% of students, starting with lower scores but progressing rapidly to often surpass Early Bloomers by adolescence. Xiao and Rabe-Hesketh explain, “Though showing minimal heterogeneity in growth trajectories at age 6, these paths diverge due to a positive correlation between intercepts and slope. Those with trajectories 2 standard deviations above or below the mean at age 6 possess an 81% likelihood of deviating from the average growth rate.” This group highlights the potential of slower starters to excel with targeted support.</p><p>Lastly, the “Steady Progressors” start with the lowest average scores at age six but show steady, consistent growth over time. By age 14, their scores begin to overlap with those of other groups, despite maintaining an initial gap. “These students are projected to deviate 605% more from the mean at age 14 than at age 6, approximately seven times as much.” Representing a majority of students, this group highlights the importance of persistence and gradual progress.</p><p>Through their research, Xiao and Rabe-Hesketh define the diverse trajectories of reading development. 
Whether a student's growth is rapid, steady, or gradual, every trajectory deser","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"44 1","pages":"6"},"PeriodicalIF":2.7,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/emip.12667","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143423614","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
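As a rough illustration of the kind of multiplot described above, the sketch below overlays class-specific model-implied mean trajectories and shaded bands on boxplots of observed scores by age. All intercepts, slopes, band widths, and "observed" scores are synthetic placeholders; they are not Xiao and Rabe-Hesketh's estimates.

```python
# Sketch of a multiplot combining model-implied class mean trajectories,
# shaded bands (standing in for the 50% mid-range), and boxplots of
# observed scores by age. All numbers are synthetic placeholders.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
ages = np.arange(6, 15)

# Hypothetical class-specific (intercept, linear slope) pairs
classes = {"Class 1": (60, 4.0), "Class 2": (40, 5.5), "Class 3": (30, 3.0)}

# Synthetic "observed" scores pooled across classes, one column per age
observed = np.column_stack([
    rng.normal(loc=40 + 4 * (a - 6), scale=10, size=200) for a in ages
])

fig, ax = plt.subplots(figsize=(7, 4))
ax.boxplot(observed, positions=ages, widths=0.5, showfliers=False)

for label, (intercept, slope) in classes.items():
    mean_traj = intercept + slope * (ages - 6)
    ax.plot(ages, mean_traj, marker="o", label=label)
    ax.fill_between(ages, mean_traj - 5, mean_traj + 5, alpha=0.15)

ax.set_xlabel("Age")
ax.set_ylabel("Reading recognition score")
ax.legend()
fig.tight_layout()
plt.show()
```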
{"title":"ITEMS Corner: Next Chapter of ITEMS","authors":"Stella Y. Kim","doi":"10.1111/emip.12666","DOIUrl":"https://doi.org/10.1111/emip.12666","url":null,"abstract":"","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"44 1","pages":"108"},"PeriodicalIF":2.7,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143423608","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the Cover: The Increasing Impact of EM:IP","authors":"Yuan-Ling Liaw","doi":"10.1111/emip.12657","DOIUrl":"https://doi.org/10.1111/emip.12657","url":null,"abstract":"<p>The cover of this issue featured “The Increasing Impact of <i>EM:IP</i>” by Zhongmin Cui, the journal's editor. Cui elaborated on the significance of the impact factor for Educational Measurement: Issues and Practice (<i>EM:IP</i>), one of the most widely recognized metrics for evaluating a journal's influence and prestige. The impact factor, which measures how frequently a journal's articles are cited over a specific period, serves as a critical tool for researchers, institutions, and funding bodies in assessing the relevance and significance of published work.</p><p>Cui noted the challenges in measuring a journal's influence, stating, “As measurement professionals, we are well aware of the difficulties in quantifying almost anything, including the impact of a journal. However, even imperfect metrics, if carefully designed, can provide valuable insights for users making informed decisions.”</p><p>He cited <i>EM:IP</i>’s latest journal impact factor of 2.7 (Wiley, <span>2024</span>), which was calculated based on citations from the previous two years. Acknowledging that this figure might not seem substantial, Cui emphasized that it represents a significant milestone in the journal's history. “The visualization we created illustrates a steady, consistent upward trend in <i>EM:IP</i>’s impact factor over the past decade. This growth reflects our ongoing commitment to publishing high-quality, impactful research that resonates with both scholars and practitioners,” he added.</p><p>Cui also stressed the growing influence of <i>EM:IP</i> in the field of educational and psychological measurement. He credited this achievement to the dedication of the authors, the insights of the reviewers, and the ongoing support of the readers. “Everyone's contributions have been crucial to our success, and we are excited to continue our mission to advance knowledge and foster scholarly discourse in the years ahead,” he expressed with gratitude.</p><p>The visualization was created using Python, following guidelines established by Setzer and Cui (<span>2022</span>). “One special feature of the graph is the use of the journal's color scheme, which enhances visual harmony, particularly for the cover design,” Cui explained. The data used to calculate the impact factor was sourced from Clarivate (https://clarivate.com/). For those interested in learning more about this data visualization, Zhongmin Cui can be contacted at [email protected].</p><p>We also invite you to participate in the annual <i>EM:IP</i> Cover Graphic/Data Visualization Competition. Details for the 2025 competition can be found in this issue. Your entry could be featured on the cover of a future issue! We're eager to receive your feedback and submissions. 
Please share your thoughts or questions by emailing Yuan-Ling Liaw at [email protected].</p>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"43 4","pages":"7"},"PeriodicalIF":2.7,"publicationDate":"2025-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/emip.12657","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143248586","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Current Psychometric Models and Some Uses of Technology in Educational Testing","authors":"Robert L. Brennan","doi":"10.1111/emip.12644","DOIUrl":"https://doi.org/10.1111/emip.12644","url":null,"abstract":"<p>This paper addresses some issues concerning the use of current psychometric models for current (and possibly future) technology-based educational testing (as well as most licensure and certification testing). The intent here is to provide a relatively simple overview that addresses important issues, with little explicit intent to argue strenuously for or against the particular uses of technology discussed here.</p>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"43 4","pages":"88-92"},"PeriodicalIF":2.7,"publicationDate":"2024-12-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/emip.12644","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143253443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Instruction-Tuned Large-Language Models for Quality Control in Automatic Item Generation: A Feasibility Study","authors":"Guher Gorgun, Okan Bulut","doi":"10.1111/emip.12663","DOIUrl":"https://doi.org/10.1111/emip.12663","url":null,"abstract":"<p>Automatic item generation may supply many items instantly and efficiently to assessment and learning environments. Yet, the evaluation of item quality persists to be a bottleneck for deploying generated items in learning and assessment settings. In this study, we investigated the utility of using large-language models, specifically Llama 3-8B, for evaluating automatically generated cloze items. The trained large-language model was able to filter out majority of good and bad items accurately. Evaluating items automatically with instruction-tuned LLMs may aid educators and test developers in understanding the quality of items created in an efficient and scalable manner. The item evaluation process with LLMs may also act as an intermediate step between item creation and field testing to reduce the cost and time associated with multiple rounds of revision.</p>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"44 1","pages":"96-107"},"PeriodicalIF":2.7,"publicationDate":"2024-12-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/emip.12663","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143423870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}