Jiawei Xiong, Shiyu Wang, Cheng Tang, Qidi Liu, Rufei Sheng, Bowen Wang, Huan Kuang, Allan S. Cohen, Xinhui Xiong
{"title":"Sequential Reservoir Computing for Log File‐Based Behavior Process Data Analyses","authors":"Jiawei Xiong, Shiyu Wang, Cheng Tang, Qidi Liu, Rufei Sheng, Bowen Wang, Huan Kuang, Allan S. Cohen, Xinhui Xiong","doi":"10.1111/jedm.12413","DOIUrl":"https://doi.org/10.1111/jedm.12413","url":null,"abstract":"The use of process data in assessment has gained attention in recent years as more assessments are administered by computers. Process data, recorded in computer log files, capture the sequence of examinees' response activities, for example, timestamped keystrokes, during the assessment. Traditional measurement methods are often inadequate for handling this type of data. In this paper, we proposed a sequential reservoir method (SRM) based on a reservoir computing model using the echo state network, with the particle swarm optimization and singular value decomposition as optimization. Designed to regularize features from process data through a computational self‐learning algorithm, this method has been evaluated using both simulated and empirical data. Simulation results suggested that, on one hand, the model effectively transforms action sequences into standardized and meaningful features, and on the other hand, these features are instrumental in categorizing latent behavioral groups and predicting latent information. Empirical results further indicate that SRM can predict assessment efficiency. The features extracted by SRM have been verified as related to action sequence lengths through the correlation analysis. 
This proposed method enhances the extraction and accessibility of meaningful information from process data, presenting an alternative to existing process data technologies.","PeriodicalId":47871,"journal":{"name":"Journal of Educational Measurement","volume":"16 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142253879","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring Latent Constructs through Multimodal Data Analysis","authors":"Shiyu Wang, Shushan Wu, Yinghan Chen, Luyang Fang, Liang Xiao, Feiming Li","doi":"10.1111/jedm.12412","DOIUrl":"https://doi.org/10.1111/jedm.12412","url":null,"abstract":"This study presents a comprehensive analysis of three types of multimodal data (response accuracy, response times, and eye‐tracking data) derived from a computer‐based spatial rotation test. To tackle the complexity of high‐dimensional data analysis challenges, we have developed a methodological framework incorporating various statistical and machine learning methods. The results of our study reveal that hidden state transition probabilities, based on eye‐tracking features, may be contingent on skill mastery estimated from the fluency CDM model. The hidden state trajectory offers additional diagnostic insights into spatial rotation problem‐solving, surpassing the information provided by the fluency CDM alone. Furthermore, the distribution of participants across different hidden states reflects the intricate nature of visualizing objects in each item, adding a nuanced dimension to the characterization of item features. This complements the information obtained from item parameters in the fluency CDM model, which relies on response accuracy and response time. Our findings have the potential to pave the way for the development of new psychometric and statistical models capable of seamlessly integrating various types of multimodal data. 
This integrated approach promises more meaningful and interpretable results, with implications for advancing the understanding of cognitive processes involved in spatial rotation tests.","PeriodicalId":47871,"journal":{"name":"Journal of Educational Measurement","volume":"69 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142211178","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hyo Jeong Shin, Christoph König, Frederic Robin, Andreas Frey, Kentaro Yamamoto
{"title":"Robustness of Item Response Theory Models under the PISA Multistage Adaptive Testing Designs","authors":"Hyo Jeong Shin, Christoph König, Frederic Robin, Andreas Frey, Kentaro Yamamoto","doi":"10.1111/jedm.12409","DOIUrl":"https://doi.org/10.1111/jedm.12409","url":null,"abstract":"Many international large‐scale assessments (ILSAs) have switched to multistage adaptive testing (MST) designs to improve measurement efficiency in measuring the skills of the heterogeneous populations around the world. In this context, previous literature has reported the acceptable level of model parameter recovery under the MST designs when the current item response theory (IRT)‐based scaling models are used. However, previous studies have not considered the influence of realistic phenomena commonly observed in ILSA data, such as item‐by‐country interactions, repeated use of MST designs in subsequent cycles, and nonresponse, including omitted and not‐reached items. The purpose of this study is to examine the robustness of current IRT‐based scaling models to these three factors under MST designs, using the Programme for International Student Assessment (PISA) designs as an example. A series of simulation studies show that the IRT scaling models used in the PISA are robust to repeated use of the MST design in a subsequent cycle with fewer items and smaller sample sizes, while item‐by‐country interactions and items not‐reached have negligible to modest effects on model parameter estimation, and omitted responses have the largest effect. 
The discussion section provides recommendations and implications for future MST designs and scaling models for ILSAs.","PeriodicalId":47871,"journal":{"name":"Journal of Educational Measurement","volume":"75 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141882915","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sun‐Joo Cho, Amanda Goodwin, Matthew Naveiras, Paul De Boeck
{"title":"Modeling Nonlinear Effects of Person‐by‐Item Covariates in Explanatory Item Response Models: Exploratory Plots and Modeling Using Smooth Functions","authors":"Sun‐Joo Cho, Amanda Goodwin, Matthew Naveiras, Paul De Boeck","doi":"10.1111/jedm.12410","DOIUrl":"https://doi.org/10.1111/jedm.12410","url":null,"abstract":"Explanatory item response models (EIRMs) have been applied to investigate the effects of person covariates, item covariates, and their interactions in the fields of reading education and psycholinguistics. In practice, it is often assumed that the relationships between the covariates and the logit transformation of item response probability are linear. However, this linearity assumption obscures the differential effects of covariates over their range in the presence of nonlinearity. Therefore, this paper presents exploratory plots that describe the potential nonlinear effects of person and item covariates on binary outcome variables. This paper also illustrates the use of EIRMs with smooth functions to model these nonlinear effects. The smooth functions examined in this study include univariate smooths of continuous person or item covariates, tensor product smooths of continuous person and item covariates, and by‐variable smooths between a continuous person covariate and a binary item covariate. Parameter estimation was performed using the mgcv R package through the maximum penalized likelihood estimation method. In the empirical study, we identified a nonlinear effect of the person‐by‐item covariate interaction and discussed its practical implications. 
Furthermore, the parameter recovery and the model comparison method and hypothesis testing procedures presented were evaluated via simulation studies under the same conditions observed in the empirical study.","PeriodicalId":47871,"journal":{"name":"Journal of Educational Measurement","volume":"16 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141776807","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the Choice of Parameters for the Lognormal Model for Response Times: Commentary on Becker et al. (2013)","authors":"Wim J. van der Linden","doi":"10.1111/jedm.12411","DOIUrl":"https://doi.org/10.1111/jedm.12411","url":null,"abstract":"In a recently published article in this journal, Becker et al. claim that, because of a missing slope parameter, the lognormal model for response times on test items almost never holds in practice. However, the authors' critique rests on a misrepresentation of the model, which already does have the equivalent of a slope parameter. More importantly, their extra parameter spoils the interpretation of the parameters for the test‐takers' speed and labor intensity of the items necessary for a response‐time model to be empirically meaningful while their proposed interpretation of the extra parameter seems unwarranted. An analysis of the authors' earlier empirical comparison between the original and their alternative version of the model does not seem to support much of a conclusion about the relative fit of the two models. Also, their simulation study conducted to demonstrate the necessity of the extra slope parameter appears to be based on data simulated in favor of their parameter.","PeriodicalId":47871,"journal":{"name":"Journal of Educational Measurement","volume":"66 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141776808","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reckase, M. The Psychometrics of Standard Setting: Connecting Policy and Test Scores: First edition published 2023 by CRC Press, 6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487‐2742","authors":"Daniel Lewis, Sandip Sinharay","doi":"10.1111/jedm.12407","DOIUrl":"https://doi.org/10.1111/jedm.12407","url":null,"abstract":"","PeriodicalId":47871,"journal":{"name":"Journal of Educational Measurement","volume":"54 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141776809","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using Automated Procedures to Score Educational Essays Written in Three Languages","authors":"Tahereh Firoozi, Hamid Mohammadi, Mark J. Gierl","doi":"10.1111/jedm.12406","DOIUrl":"https://doi.org/10.1111/jedm.12406","url":null,"abstract":"The purpose of this study is to describe and evaluate a multilingual automated essay scoring (AES) system for grading essays in three languages. Two different sentence embedding models were evaluated within the AES system, multilingual BERT (mBERT) and language‐agnostic BERT sentence embedding (LaBSE). German, Italian, and Czech essays were holistically scored using the Common European Framework of Reference of Languages. The AES system with mBERT produced results that were consistent with human raters overall across all three language groups. The system also produced accurate predictions for some but not all of the score levels within each language. The AES system with LaBSE produced results that were even more consistent with the human raters overall across all three language groups compared to mBERT. In addition, the system produced accurate predictions for the majority of the score levels within each language. The performance differences between mBERT and LaBSE can be explained by considering how each language embedding model is implemented. Implications of this study for educational testing are also discussed.","PeriodicalId":47871,"journal":{"name":"Journal of Educational Measurement","volume":"59 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141776810","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Harold Doran, Testsuhiro Yamada, Ted Diaz, Emre Gonulates, Vanessa Culver
{"title":"A Generalized Objective Function for Computer Adaptive Item Selection","authors":"Harold Doran, Testsuhiro Yamada, Ted Diaz, Emre Gonulates, Vanessa Culver","doi":"10.1111/jedm.12405","DOIUrl":"https://doi.org/10.1111/jedm.12405","url":null,"abstract":"Computer adaptive testing (CAT) is an increasingly common mode of test administration offering improved test security, better measurement precision, and the potential for shorter testing experiences. This article presents a new item selection algorithm based on a generalized objective function to support multiple types of testing conditions and principled assessment design. The generalized nature of the algorithm permits a wide array of test requirements allowing experts to define what to measure and how to measure it and the algorithm is simply a means to an end to support better construct representation. This work also emphasizes the computational algorithm and its ability to scale to support faster computing and better cost‐containment in real‐world applications than other CAT algorithms. We make a significant effort to consolidate all information needed to build and scale the algorithm so that expert psychometricians and software developers can use this document as a self‐contained resource and specification document to build and deploy an operational CAT platform.","PeriodicalId":47871,"journal":{"name":"Journal of Educational Measurement","volume":"144 1","pages":""},"PeriodicalIF":1.3,"publicationDate":"2024-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141528216","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cornelis Potgieter, Xin Qiao, Akihito Kamata, Yusuf Kara
{"title":"Likelihood-Based Estimation of Model-Derived Oral Reading Fluency","authors":"Cornelis Potgieter, Xin Qiao, Akihito Kamata, Yusuf Kara","doi":"10.1111/jedm.12404","DOIUrl":"10.1111/jedm.12404","url":null,"abstract":"As part of the effort to develop an improved oral reading fluency (ORF) assessment system, Kara et al. estimated the ORF scores based on a latent variable psychometric model of accuracy and speed for ORF data via a fully Bayesian approach. This study further investigates likelihood-based estimators for the model-derived ORF scores, including maximum likelihood estimator (MLE), maximum a posteriori (MAP), and expected a posteriori (EAP), as well as their standard errors. The proposed estimators were demonstrated with a real ORF assessment dataset. Also, the estimation of model-derived ORF scores and their standard errors by the proposed estimators were evaluated through a simulation study. The fully Bayesian approach was included as a comparison in the real data analysis and the simulation study. Results demonstrated that the three likelihood-based approaches for the model-derived ORF scores and their standard error estimation performed satisfactorily.","PeriodicalId":47871,"journal":{"name":"Journal of Educational Measurement","volume":"61 3","pages":"542-559"},"PeriodicalIF":1.4,"publicationDate":"2024-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141505203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Curvilinearity in the Reference Composite and Practical Implications for Measurement","authors":"Xiangyi Liao, Daniel M. Bolt, Jee-Seon Kim","doi":"10.1111/jedm.12402","DOIUrl":"10.1111/jedm.12402","url":null,"abstract":"Item difficulty and dimensionality often correlate, implying that unidimensional IRT approximations to multidimensional data (i.e., reference composites) can take a curvilinear form in the multidimensional space. Although this issue has been previously discussed in the context of vertical scaling applications, we illustrate how such a phenomenon can also easily occur within individual tests. Measures of reading proficiency, for example, often use different task types within a single assessment, a feature that may not only lead to multidimensionality, but also an association between item difficulty and dimensionality. Using a latent regression strategy, we demonstrate through simulations and empirical analysis how associations between dimensionality and difficulty yield a nonlinear reference composite where the weights of the underlying dimensions change across the scale continuum according to the difficulties of the items associated with the dimensions. We further show how this form of curvilinearity produces systematic forms of misspecification in traditional unidimensional IRT models (e.g., 2PL) and can be better accommodated by models such as monotone-polynomial or asymmetric IRT models. Simulations and a real-data example from the Early Childhood Longitudinal Study—Kindergarten are provided for demonstration. 
Some implications for measurement modeling and for understanding the effects of 2PL misspecification on measurement metrics are discussed.","PeriodicalId":47871,"journal":{"name":"Journal of Educational Measurement","volume":"61 3","pages":"511-541"},"PeriodicalIF":1.4,"publicationDate":"2024-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/jedm.12402","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141386190","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}