Predictive insights into U.S. students’ mathematics performance on PISA 2022 using ensemble tree-based machine learning models

IF 2.6; Zone 3 (Education); JCR Q1, EDUCATION & EDUCATIONAL RESEARCH
Li Zhu, Hyesun You, Minju Hong, Zhenhan Fang
International Journal of Educational Research, Volume 130, Article 102537. Journal Article. Published 2025-01-01.
DOI: 10.1016/j.ijer.2025.102537
URL: https://www.sciencedirect.com/science/article/pii/S0883035525000047
Citations: 0

Abstract

Background

In the Program for International Student Assessment (PISA) 2022 results, the latest available, U.S. students earned their lowest math scores in two decades. Educators and stakeholders have sought to identify key malleable factors that could raise scores. Although more researchers are gradually incorporating machine learning (ML) techniques, most still rely on manual literature reviews to identify important predictors. Here we show how ML models can be used to identify the predictors most strongly associated with students’ math performance.

Methods and Results

The dataset comprises 4,552 U.S. students in 154 schools from PISA 2022. We used three ensemble tree-based ML models (Random Forest, XGBoost, and LightGBM) to select the most influential predictors from 143 derived variables in the student and school questionnaires. All three models showed high accuracy in predicting students’ math performance, with XGBoost performing best (RMSE = 69.82, training time = 4.14 s) and identifying 10 significant predictors. According to the accumulated local effects (ALE) plots, three of these predictors have generally positive effects, five have roughly negative effects, and two have mixed effects on students’ math performance. Compared with predictors identified by literature review, the ML-identified predictors significantly improved the accuracy of predictor selection (p < 0.05) but offered lower interpretability.
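The two quantities named in the paragraph above, root mean squared error (RMSE) and accumulated local effects (ALE), can be illustrated with a short Python sketch. This is not the authors' code: the helper names `rmse` and `ale_1d`, the synthetic rows, and the hand-written `toy_model` (a stand-in for a fitted Random Forest, XGBoost, or LightGBM model) are all assumptions made for illustration, and the coefficients are arbitrary rather than taken from the paper.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def ale_1d(predict, rows, feature, n_bins=5):
    """First-order ALE of one feature for a black-box predict(row) -> float.

    Returns (edges, ale): the bin edges and the centered accumulated effect
    at each edge.
    """
    values = sorted(r[feature] for r in rows)
    n = len(values)
    # Quantile-based edges so each bin holds roughly the same number of rows.
    edges = sorted({values[round(k * (n - 1) / n_bins)] for k in range(n_bins + 1)})
    if len(edges) < 2:
        raise ValueError("feature is constant; ALE is undefined")
    sums = [0.0] * (len(edges) - 1)
    counts = [0] * (len(edges) - 1)
    for r in rows:
        # Locate the bin (edges[b], edges[b + 1]]; the first bin also
        # includes its lower edge.
        b = 0
        while b < len(sums) - 1 and r[feature] > edges[b + 1]:
            b += 1
        lo = list(r)
        hi = list(r)
        lo[feature] = edges[b]
        hi[feature] = edges[b + 1]
        # Local effect: change in prediction across the bin, other features fixed.
        sums[b] += predict(hi) - predict(lo)
        counts[b] += 1
    # Accumulate the mean local effect of each bin.
    ale = [0.0]
    for s, c in zip(sums, counts):
        ale.append(ale[-1] + (s / c if c else 0.0))
    # Center so the count-weighted average effect over the data is zero.
    center = sum(0.5 * (ale[b] + ale[b + 1]) * counts[b]
                 for b in range(len(counts))) / sum(counts)
    return edges, [a - center for a in ale]


def toy_model(row):
    # Hypothetical "fitted model": linear in both features, arbitrary coefficients.
    return 500.0 + 30.0 * row[0] - 20.0 * row[1]


random.seed(0)
rows = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(500)]

edges, ale = ale_1d(toy_model, rows, feature=0, n_bins=5)
for e, a in zip(edges, ale):
    print(f"x0 = {e:+.2f}  ALE = {a:+.2f}")

# RMSE of the toy model against noisy synthetic targets.
y_true = [toy_model(r) + random.gauss(0, 10) for r in rows]
print("RMSE:", rmse(y_true, [toy_model(r) for r in rows]))
```

Because `toy_model` is linear in feature 0, its ALE curve rises at a constant slope; a curve that rises and then falls would correspond to what the abstract calls a mixed effect. Any object exposing a row-wise prediction function could be passed in place of `toy_model`.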

Conclusions

We conclude that ML predictor selection is an effective alternative to literature review (LR) for identifying influential factors affecting student learning outcomes. Among the factors identified, math self-efficacy, ESCS (the PISA index of economic, social and cultural status), and math anxiety are strongly correlated with students’ math performance. The results can inform shifts in instructional practices, targeted interventions, curriculum development, and policy decisions, ultimately helping to improve the overall quality of U.S. math education.
Source journal
International Journal of Educational Research (EDUCATION & EDUCATIONAL RESEARCH)
CiteScore: 6.20
Self-citation rate: 3.10%
Articles published: 141
Review time: 21 days
Journal description: The International Journal of Educational Research publishes regular papers and special issues on specific topics of interest to international audiences of educational researchers. Examples of recent Special Issues published in the journal illustrate the breadth of topics that have been included in the journal: Students Perspectives on Learning Environments, Social, Motivational and Emotional Aspects of Learning Disabilities, Epistemological Beliefs and Domain, Analyzing Mathematics Classroom Cultures and Practices, and Music Education: A site for collaborative creativity.