Title: Data Mining on the Fundamental Factors Influencing Mathematics Achievement: Traditional and Modern Perspectives
Authors: Burcu Koca Guler, Fulya Gokalp Yavuz
Journal: European Journal of Education, Volume 60, Issue 4 (Impact Factor 3.6; JCR Q1, Education & Educational Research)
Publication date: 2025-10-13
Publication type: Journal Article
DOI: 10.1111/ejed.70281
URL: https://onlinelibrary.wiley.com/doi/10.1111/ejed.70281
Citation count: 0
Abstract
Assessing achievement is complex because it depends on multiple factors and because educational data are hierarchically structured, yet surveys such as TIMSS offer valuable insights into its determinants, such as students' mathematics anxiety. However, disregarding the nested structure of the data and ignoring model assumptions leads to poor performance, such as inaccurate predictions and biased estimates. Our research utilises linear mixed models (LMMs) and machine learning (ML) techniques (e.g., REEM-tree and GP boosting) specifically chosen for their ability to model nested data and capture non-linear relationships. This study is a pioneer in the literature, as these ML algorithms are applied to TIMSS data for the first time. Across all methods, mathematical tendency and emotional factors are the two primary predictors of mathematics achievement, although reliance on self-report responses may introduce bias. However, the effect size of students' origins varies among the methods. This indicates that different algorithms yield distinct results according to their inner processes and priorities, such as revealing the statistical significance of predictors or contributing to predictive performance. Moreover, gender has a negligible impact across all models in our analysis, which is attributed to cultural differences in the sample. Overall, while LMMs are widely accepted, ML methods remain competitive alternatives in terms of prediction and flexibility. All three methods yield similar benchmarks, yet the ML methods offer slightly better performance in RMSE, MAE, and MAPE while exhibiting high predictive power and capturing non-linearity and interactions. Although they take more computation time, parallel processing mitigates this for larger datasets. Consequently, ML methods and LMMs together provide broader and more precise insights in terms of predictive and inferential gains.
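For readers unfamiliar with the three error metrics named in the abstract, the following is a minimal sketch of how RMSE, MAE, and MAPE are conventionally computed. It is not the authors' code: the function names and the score values are hypothetical and chosen only for illustration, and the abstract does not state how the study computed these metrics.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error: penalises large errors more heavily."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )

def mae(actual, predicted):
    """Mean absolute error: average size of the errors, in score units."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error; assumes no actual value is zero."""
    return 100.0 * sum(
        abs((a - p) / a) for a, p in zip(actual, predicted)
    ) / len(actual)

# Hypothetical achievement scores and model predictions (not from the study).
actual = [500.0, 450.0, 520.0, 480.0]
predicted = [490.0, 460.0, 530.0, 470.0]

print(rmse(actual, predicted))
print(mae(actual, predicted))
print(mape(actual, predicted))
```

Lower values indicate better predictive accuracy on all three metrics. RMSE and MAE are expressed in the units of the outcome, whereas MAPE is scale-free, which is one reason the three are often reported together.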
About the journal:
The prime aims of the European Journal of Education are:
- To examine, compare and assess education policies, trends, reforms and programmes of European countries in an international perspective
- To disseminate policy debates and research results to a wide audience of academics, researchers, practitioners and students of education sciences
- To contribute to the policy debate at the national and European level by providing European administrators and policy-makers in international organisations, national and local governments with comparative and up-to-date material centred on specific themes of common interest.