{"title":"On the Cover: Sequential Progression and Item Review in Timed Tests: Patterns in Process Data","authors":"Yuan-Ling Liaw","doi":"10.1111/emip.12670","DOIUrl":"https://doi.org/10.1111/emip.12670","url":null,"abstract":"<p>We are excited to announce the winners of the 12th <i>EM:IP</i> Cover Graphic/Data Visualization Competition. Each year, we invite our readers to submit visualizations that are not only accurate and insightful but also visually compelling and easy to understand. This year's submissions explored key topics in educational measurement, including process data, item characteristics, test design, and score interpretation. We extend our sincere thanks to everyone who submitted their work, and we are especially grateful to the <i>EM:IP</i> editorial board for their thoughtful review and feedback in the selection process.</p><p>Winning entries may be featured on the cover of a future <i>EM:IP</i> issue. Previous winners who have not yet appeared on a cover remain eligible for upcoming issues.</p><p>This issue's cover features Sequential Progression and Item Review in Timed Tests: Patterns in Process Data, a compelling visualization created by Christian Meyer from the Association of American Medical Colleges and the University of Maryland, along with Ying Jin and Marc Kroopnick, both from the Association of American Medical Colleges.</p><p>The visualization, developed using R, presents smoothed density plots derived from process data collected during a high-stakes admissions test. It illustrates how examinees navigated one section of the test within a 95-minute time limit. The <i>x</i>-axis represents elapsed time in minutes. The <i>y</i>-axis segments item positions into five groups: 1 to 15, 16 to 25, 26 to 35, 36 to 45, and 46 to 59. Meyer and his colleagues explain that, for each item group, the height of the plot indicates density. The supports of the estimated densities extend beyond the start and end of the test to allow the plots to approach zero smoothly at the extremes.</p><p>Color is used effectively to distinguish between initial engagement and item review. Blue areas indicate when items were first viewed, while red areas show when examinees revisited those same items. The authors describe, “The figure illustrates a common test-taking strategy: examinees initially progress sequentially through the test, as shown by the early blue density peaks for each group. Toward the end of the session, they frequently revisit earlier items, as evidenced by the red peaks clustering near the time limit.” This pattern reflects deliberate time management, with examinees dividing their approach into two distinct phases.</p><p>They continue, “In the first phase, they assess each item, either attempting a response or skipping it for later review. In the second phase, they revisit skipped or uncertain items, providing more considered answers when time permits or resorting to random guessing if necessary.”</p><p>According to Meyer and his colleagues, the visualization offers valuable insight into examinees’ time management and engagement strategies during timed tests. 
They conclude, “It captures temporal strategies, such as sequential progression and end-of-sessi","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"44 2","pages":""},"PeriodicalIF":2.7,"publicationDate":"2025-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/emip.12670","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144117901","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
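A minimal R sketch of the plotting approach described above, using simulated first-view and revisit timestamps; the data, group sizes, and color values here are invented for illustration (the actual AAMC process data and plotting code are not published with the cover):

    # Simulated event times (in minutes) for one item group: first views early,
    # revisits clustering near the 95-minute limit
    set.seed(1)
    first_view <- pmin(pmax(rnorm(500, mean = 20, sd = 8), 0), 95)
    revisit    <- pmin(pmax(rnorm(200, mean = 85, sd = 6), 0), 95)

    library(ggplot2)
    events <- rbind(
      data.frame(minute = first_view, type = "First view"),
      data.frame(minute = revisit,    type = "Revisit")
    )
    # Smoothed densities; the kernel estimate extends slightly past the data range,
    # echoing the cover's choice to let the curves approach zero at the extremes
    ggplot(events, aes(x = minute, fill = type)) +
      geom_density(alpha = 0.5) +
      scale_fill_manual(values = c("First view" = "blue", "Revisit" = "red")) +
      labs(x = "Elapsed time (minutes)", y = "Density")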
{"title":"Digital Module 38: Differential Item Functioning by Multiple Variables Using Moderated Nonlinear Factor Analysis","authors":"Sanford R. Student, Ethan M. McCormick","doi":"10.1111/emip.12669","DOIUrl":"https://doi.org/10.1111/emip.12669","url":null,"abstract":"<div>\u0000 \u0000 <section>\u0000 \u0000 <h3> Module Abstract</h3>\u0000 \u0000 <p>When investigating potential bias in educational test items via differential item functioning (DIF) analysis, researchers have historically been limited to comparing two groups of students at a time. The recent introduction of Moderated Nonlinear Factor Analysis (MNLFA) generalizes Item Response Theory models to extend the assessment of DIF to an arbitrary number of background variables. This facilitates more complex analyses such as DIF across more than two groups (e.g. low/middle/high socioeconomic status), across more than one background variable (e.g. DIF by race/ethnicity and gender), across non-categorical background variables (e.g. DIF by parental income), and more. Framing MNLFA as a generalization of the two-parameter logistic IRT model, we introduce the model with an emphasis on the parameters representing DIF versus impact; describe the current state of the art for estimating MNLFA models; and illustrate the application of MNLFA in a scenario where one wants to test for DIF across two background variables at once.</p>\u0000 </section>\u0000 </div>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"44 2","pages":"39-41"},"PeriodicalIF":2.7,"publicationDate":"2025-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144117928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"2024 NCME Presidential Address: Challenging Traditional Views of Measurement","authors":"Michael E. Walker","doi":"10.1111/emip.12673","DOIUrl":"https://doi.org/10.1111/emip.12673","url":null,"abstract":"<p>This article is adapted from the 2024 NCME Presidential Address. It reflects a personal journey to challenge traditional views of measurement. Considering alternative viewpoints with an open mind led to several solutions to perplexing problems at the time. The article discusses the culture-boundedness of measurement and the need to take that into consideration when designing tests.</p>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"44 2","pages":"32-38"},"PeriodicalIF":2.7,"publicationDate":"2025-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144117879","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generalizability Theory Approach to Analyzing Automated-Item Generated Test Forms","authors":"Stella Y. Kim, Sungyeun Kim","doi":"10.1111/emip.12671","DOIUrl":"https://doi.org/10.1111/emip.12671","url":null,"abstract":"<p>This study presents several multivariate Generalizability theory designs for analyzing automatic item-generated (AIG) based test forms. The study used real data to illustrate the analysis procedure and discuss practical considerations. We collected the data from two groups of students, each group receiving a different form generated by AIG. A total of 74 students participated in this study and responded to AIG-based test forms. Then, we analyzed the data using four distinct designs based on the data collection design, and conceptualization of true scores and measurement conditions over hypothetical replications. This study also examined the theoretical relationships among the four data collection designs and highlighted the potential impact of confounding between item templates and item clones.</p>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"44 2","pages":"20-31"},"PeriodicalIF":2.7,"publicationDate":"2025-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144118008","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Applications and Modeling of Keystroke Logs in Writing Assessments","authors":"Mo Zhang, Paul Deane, Andrew Hoang, Hongwen Guo, Chen Li","doi":"10.1111/emip.12668","DOIUrl":"https://doi.org/10.1111/emip.12668","url":null,"abstract":"<p>In this paper, we describe two empirical studies that demonstrate the application and modeling of keystroke logs in writing assessments. We illustrate two different approaches of modeling differences in writing processes: analysis of mean differences in handcrafted theory-driven features and use of large language models to identify stable personal characteristics. In the first study, we examined the effects of test environment on writing characteristics: at-home versus in-center, using features extracted from keystroke logs. In a second study, we explored ways to measure stable personal characteristics and traits. As opposed to feature engineering that can be difficult to scale, raw keystroke logs were used as input in the second study, and large language models were developed to infer latent relations in the data. Implications, limitations, and future research directions are also discussed.</p>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"44 2","pages":"5-19"},"PeriodicalIF":2.7,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144117931","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Digital Module 37: Introduction to Item Response Tree (IRTree) Models","authors":"Nana Kim, Jiayi Deng, Yun Leng Wong","doi":"10.1111/emip.12665","DOIUrl":"https://doi.org/10.1111/emip.12665","url":null,"abstract":"<div>\u0000 \u0000 <section>\u0000 \u0000 <h3> Module Abstract</h3>\u0000 \u0000 <p>Item response tree (IRTree) models, an item response modeling approach that incorporates a tree structure, have become a popular method for many applications in measurement. IRTree models characterize the underlying response processes using a decision tree structure, where the internal decision outcome at each node is parameterized with an item response theory (IRT) model. Such models provide a flexible way of investigating and modeling underlying response processes, which can be useful for examining sources of individual differences in measurement and addressing measurement issues that traditional IRT models cannot deal with. In this module, we discuss the conceptual framework of IRTree models and demonstrate examples of their applications in the context of both cognitive and noncognitive assessments. We also introduce some possible extensions of the model and provide a demonstration of an example data analysis in R.</p>\u0000 </section>\u0000 </div>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"44 1","pages":"109-110"},"PeriodicalIF":2.7,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143423609","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the Cover: Unraveling Reading Recognition Trajectories: Classifying Student Development through Growth Mixture Modeling","authors":"Yuan-Ling Liaw","doi":"10.1111/emip.12667","DOIUrl":"https://doi.org/10.1111/emip.12667","url":null,"abstract":"<p>The cover of this issue features “<i>Unraveling Reading Recognition Trajectories: Classifying Student Development through Growth Mixture Modeling</i>” by Xingyao Xiao and Sophia Rabe-Hesketh from the University of California, Berkeley. Using advanced Bayesian growth mixture modeling, their research examines how reading recognition develops between ages 6 and 14, identifying three distinct patterns of growth. This study provides a detailed and nuanced understanding of how students’ reading abilities progress over time.</p><p>Xiao and Rabe-Hesketh illustrated their findings using a multiplot visualization. It combines model-implied class-specific mean trajectories, a shaded 50% mid-range, and box-plots of observed reading scores, effectively highlighting the variability in reading progress among different learner groups. By juxtaposing observed data with model predictions, the visualization clearly depicts diverse growth patterns. Additionally, it emphasizes the variance and covariance of random effects, offering valuable insights often overlooked in similar analyses.</p><p>The three-class model described by Xiao and Rabe-Hesketh effectively explains different patterns of student growth. The first group, termed the “Early Bloomers,” comprises about 14% of the population who start with strong reading abilities and steadily improve. By age six, they show high reading scores and greater variability in growth trajectories compared to other groups. Xiao and Rabe-Hesketh note, “These students exhibit greater variability in growth curves at age six, with an 88% likelihood for those deviating 2 standard deviations below or above the mean to stray from the average growth rate.” This highlights their potential for early reading success.</p><p>The “Rapid Catch-Up Learners” represent 35% of students, starting with lower scores but progressing rapidly to often surpass Early Bloomers by adolescence. Xiao and Rabe-Hesketh explain, “Though showing minimal heterogeneity in growth trajectories at age 6, these paths diverge due to a positive correlation between intercepts and slope. Those with trajectories 2 standard deviations above or below the mean at age 6 possess an 81% likelihood of deviating from the average growth rate.” This group highlights the potential of slower starters to excel with targeted support.</p><p>Lastly, the “Steady Progressors” start with the lowest average scores at age six but show steady, consistent growth over time. By age 14, their scores begin to overlap with those of other groups, despite maintaining an initial gap. “These students are projected to deviate 605% more from the mean at age 14 than at age 6, approximately seven times as much.” Representing a majority of students, this group highlights the importance of persistence and gradual progress.</p><p>Through their research, Xiao and Rabe-Hesketh define the diverse trajectories of reading development. 
Whether a student's growth is rapid, steady, or gradual, every trajectory deser","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"44 1","pages":"6"},"PeriodicalIF":2.7,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/emip.12667","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143423614","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
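The widening spread the authors quantify follows from the random-effects structure of a linear growth model, where the trajectory variance at time t equals psi00 + 2*t*psi01 + t^2*psi11. A short R sketch with invented variance components (not Xiao and Rabe-Hesketh's estimates) shows how an age-14 to age-6 SD ratio of about 7 can arise:

    # Variance of model-implied trajectories, with t measured as years since age 6
    traj_var <- function(t, psi00, psi01, psi11) psi00 + 2 * t * psi01 + t^2 * psi11

    # Invented components: intercept variance, intercept-slope covariance, slope variance
    psi00 <- 1.0; psi01 <- 0.6; psi11 <- 0.6
    sd6  <- sqrt(traj_var(0, psi00, psi01, psi11))  # age 6
    sd14 <- sqrt(traj_var(8, psi00, psi01, psi11))  # age 14
    sd14 / sd6  # = 7, i.e., 600% more spread, in the vicinity of the 605% reported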
{"title":"ITEMS Corner: Next Chapter of ITEMS","authors":"Stella Y. Kim","doi":"10.1111/emip.12666","DOIUrl":"https://doi.org/10.1111/emip.12666","url":null,"abstract":"","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"44 1","pages":"108"},"PeriodicalIF":2.7,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143423608","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}