{"title":"Reached or Not Reached: A Tale of Two Data Sources","authors":"Yuan-Ling Liaw","doi":"10.1111/emip.12574","DOIUrl":"10.1111/emip.12574","url":null,"abstract":"","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"42 3","pages":"4"},"PeriodicalIF":2.0,"publicationDate":"2023-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47254373","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ITEMS Corner Update: Recording Audio and Adding an Editorial Polish to an ITEMS Module","authors":"Brian C. Leventhal","doi":"10.1111/emip.12573","DOIUrl":"10.1111/emip.12573","url":null,"abstract":"<p>In the first issue of <i>Educational Measurement: Issues and Practice</i> (EM:IP) in 2023, I outlined the 10 steps to the <i>Instructional Topics in Educational Measurement Series (ITEMS)</i> module development process. I then detailed the first three steps in the second issue, and in this issue, I discuss Steps 4–7, focusing on the audio recording process, editorial polish, interactive activities, and learning check development. I devote space discussing each in detail to provide readers and potential authors with a better understanding of the behind-the-scenes efforts throughout the ITEMS module development process. Following this discussion, I reiterate a call for module topics and conclude by introducing the latest entry to the ITEMS module library.</p><p>Throughout content development (Step 3), authors are encouraged to draft notes or a script for each slide to assist in audio recording. After drafted content is approved by the editorial team, the author begins Step 4: audio recording. There are no special skills or software needed to record the audio, and hardware (i.e., a microphone) is provided when necessary. Audio recording is done within PowerPoint and on each slide independently. In this sense, a 20-minute module section's audio is recorded in 1–3 minutes bits so that should re-recording be required, the author does not need to fully re-record an entire section. This also facilitates smoother transitions throughout each section, leading to a more natural speaking style. Although authors are encouraged to use a script (this is helpful should re-recording be necessary), it is emphasized that the audio should not sound like reading. Rather audio should be in a similar style to that of an instructor providing a professional workshop.</p><p>Once the audio recording is complete, the work shifts to the editorial team. During Step 5, the editorial team polishes the module content and audio. On each slide, they clean up the audio by reducing background noise, editing sections of silence, and increasing or decreasing the volume. After audio editing is complete, the editorial team adds slide transitions, object animations, and other stylistic tools to assist learning. For example, transition animations and timing assist smooth continuation of thought and content from slide to slide. Animations are synced with the audio to have bullet points appear when discussed, figures fade in when mentioned, and other content displayed systematically to not overwhelm the learner. Additional stylistic tools and techniques are employed to take advantage of the digital platform. For example, graph elements (e.g., axis labels) are animated in stages, fading into view as they are described throughout the audio to help focus the learner. Shapes, such as circles or arrows, may also be added to figures to highlight specific elements when emphasized in the audio. To assist with flow and organization, the editorial team may use additional slides or flow charts. 
For ","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"42 3","pages":"80-81"},"PeriodicalIF":2.0,"publicationDate":"2023-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/emip.12573","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43923249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Item Selection Algorithm Based on Collaborative Filtering for Item Exposure Control","authors":"Yiqin Pan, Oren Livne, James A. Wollack, Sandip Sinharay","doi":"10.1111/emip.12578","DOIUrl":"10.1111/emip.12578","url":null,"abstract":"<p>In computerized adaptive testing, overexposure of items in the bank is a serious problem and might result in item compromise. We develop an item selection algorithm that utilizes the entire bank well and reduces the overexposure of items. The algorithm is based on collaborative filtering and selects an item in two stages. In the first stage, a set of candidate items whose expected performance matches the examinee's current performance is selected. In the second stage, an item that is approximately matched to the examinee's observed performance is selected from the candidate set. The expected performance of an examinee on an item is predicted by autoencoders. Experiment results show that the proposed algorithm outperforms existing item selection algorithms in terms of item exposure while incurring only a small loss in measurement precision.</p>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"42 4","pages":"6-18"},"PeriodicalIF":2.0,"publicationDate":"2023-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42948381","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Measurement Efficiency for Technology-Enhanced and Multiple-Choice Items in a K–12 Mathematics Accountability Assessment","authors":"Ozge Ersan, Yufeng Berry","doi":"10.1111/emip.12580","DOIUrl":"10.1111/emip.12580","url":null,"abstract":"<p>The increasing use of computerization in the testing industry and the need for items potentially measuring higher-order skills have led educational measurement communities to develop technology-enhanced (TE) items and conduct validity studies on the use of TE items. Parallel to this goal, the purpose of this study was to collect validity evidence comparing item information functions, expected information values, and measurement efficiencies (item information per time unit) between multiple-choice (MC) and technology-enhanced (TE) items. The data came from K–12 mathematics large-scale accountability assessments. The study results were mainly interpreted descriptively, and the presence of specific patterns between MC and TE items was examined across grades and depth of knowledge levels. Although many earlier researchers pointed out that TE items were not as efficient as MC items, the results from the study point to ways that TE items might provide more information and were more than or equally efficient as MC items overall.</p>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"42 4","pages":"19-32"},"PeriodicalIF":2.0,"publicationDate":"2023-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41782558","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Weighing the Value of Complex Growth Estimation Methods to Evaluate Individual Student Response to Instruction","authors":"Ethan R. Van Norman","doi":"10.1111/emip.12579","DOIUrl":"10.1111/emip.12579","url":null,"abstract":"<p>Sophisticated analytic strategies have been proposed as viable methods to improve the quantification of student improvement and to assist educators in making treatment decisions. The performance of three categories of latent growth modeling techniques (linear, quadratic, and dual change) to capture growth in oral reading fluency in response to a 12-week structured supplemental reading intervention among 280 grade three students at-risk for learning disabilities were compared. Although the most complex approach (dual-change) yielded the best model fit indices, there were few practical differences between predicted values from simpler linear models. A discussion to carefully consider the relative benefits and appropriateness of increasingly complex growth modeling strategies to evaluate individual student responses to intervention is offered.</p>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"42 4","pages":"33-41"},"PeriodicalIF":2.0,"publicationDate":"2023-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/emip.12579","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47957423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Does It Matter How the Rigor of High School Coursework Is Measured? Gaps in Coursework Among Students and Across Grades","authors":"Burhan Ogut, Darrick Yee, Ruhan Circi, Nevin Dizdari","doi":"10.1111/emip.12577","DOIUrl":"10.1111/emip.12577","url":null,"abstract":"<p>Research\u0000shows that the intensity of high school course-taking is related to postsecondary outcomes. However, there are various approaches to measuring the intensity of students’ course-taking. This study presents new measures of coursework intensity that rely on differing levels of quantity and quality of coursework. We used these new indices to provide a current description of variations in high school course-taking across grades and student subgroups using a nationally representative dataset, the High School Longitudinal Study of 2009. Results showed that for measures emphasizing the quality of coursework the gaps in coursework among underserved students were larger and there was less upward movement in rigor across grades.</p>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"42 4","pages":"42-52"},"PeriodicalIF":2.0,"publicationDate":"2023-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43299184","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploration of Latent Structure in Test Revision and Review Log Data","authors":"Susu Zhang, Anqi Li, Shiyu Wang","doi":"10.1111/emip.12576","DOIUrl":"10.1111/emip.12576","url":null,"abstract":"<p>In computer-based tests allowing revision and reviews, examinees' sequence of visits and answer changes to questions can be recorded. The variable-length revision log data introduce new complexities to the collected data but, at the same time, provide additional information on examinees' test-taking behavior, which can inform test development and instructions. In the current study, we used recently proposed statistical learning methods for sequence data to provide an exploratory analysis of item-level revision and review log data. Based on the revision log data collected from computer-based classroom assessments, common prototypes of revisit and review behavior were identified. The relationship between revision behavior and various item, test, and individual covariates was further explored under a Bayesian multivariate generalized linear mixed model.</p>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"42 4","pages":"53-65"},"PeriodicalIF":2.0,"publicationDate":"2023-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/emip.12576","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45210991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Applying a Mixture Rasch Model-Based Approach to Standard Setting","authors":"Michael R. Peabody, Timothy J. Muckle, Yu Meng","doi":"10.1111/emip.12571","DOIUrl":"10.1111/emip.12571","url":null,"abstract":"<p>The subjective aspect of standard-setting is often criticized, yet data-driven standard-setting methods are rarely applied. Therefore, we applied a mixture Rasch model approach to setting performance standards across several testing programs of various sizes and compared the results to existing passing standards derived from traditional standard-setting methods. We found that heterogeneity of the sample is clearly necessary for the mixture Rasch model approach to standard setting to be useful. While possibly not sufficient to determine passing standards on their own, there may be value in these data-driven models for providing additional validity evidence to support decision-making bodies entrusted with establishing cut scores. They may also provide a useful tool for evaluating existing cut scores and determining if they continue to be supported or if a new study is warranted.</p>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"42 3","pages":"5-12"},"PeriodicalIF":2.0,"publicationDate":"2023-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43146823","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Do Subject Matter Experts’ Judgments of Multiple-Choice Format Suitability Predict Item Quality?","authors":"Rebecca F. Berenbon, Bridget C. McHugh","doi":"10.1111/emip.12570","DOIUrl":"10.1111/emip.12570","url":null,"abstract":"<p>To assemble a high-quality test, psychometricians rely on subject matter experts (SMEs) to write high-quality items. However, SMEs are not typically given the opportunity to provide input on which content standards are most suitable for multiple-choice questions (MCQs). In the present study, we explored the relationship between perceived MCQ suitability for a given content standard and the associated item characteristics. Prior to item writing, we surveyed SMEs on MCQ suitability for each content standard. Following field testing, we then used SMEs’ average ratings for each content standard to predict item characteristics for the tests. We analyzed multilevel models predicting item difficulty (<i>p</i> value), discrimination, and nonfunctioning distractor presence. Items were nested within courses and content standards. There was a curvilinear relationship between SMEs’ ratings and item difficulty such that very low MCQ suitability ratings were predictive of easier items. After controlling for item difficulty, items with higher MCQ suitability ratings had higher discrimination and were less likely to have one or more nonfunctioning distractors. This research has practical implications for optimizing test blueprints. Additionally, psychometricians may use these ratings to better prepare for coaching SMEs during item writing.</p>","PeriodicalId":47345,"journal":{"name":"Educational Measurement-Issues and Practice","volume":"42 3","pages":"13-21"},"PeriodicalIF":2.0,"publicationDate":"2023-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/emip.12570","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46840085","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"教育学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}