Improving Students’ Performance by Interpretable Explanations using Ensemble Tree-Based Approaches

2021 IEEE 15th International Symposium on Applied Computational Intelligence and Informatics (SACI). Pub Date: 2021-05-19. DOI: 10.1109/SACI51354.2021.9465558

Alexandra Vultureanu-Albisi, C. Bǎdicǎ


Citations: 6

Abstract

Careful analysis and evaluation of students’ results is an important part of educational activity, with a potentially strong impact on students’ future development. This research used seven classification algorithms: Decision Tree, Bagging, Random Forest, AdaBoost, Gradient Boosting, XGBoost, and LightGBM. Our experiments used two datasets: the first for classifying and predicting performance in Portuguese language, and the second for students’ level in courses. We aim to identify the most appropriate classification technique for predicting students’ performance and to interpret its predictions using the LIME algorithm. The results on both datasets show that the model built using Decision Tree outperforms the other constructed models. Our methodology consists of four major steps: i) analyzing and preprocessing the dataset; ii) optimizing the models using cross-validation and hyperparameter tuning; iii) comparing the performance of different ensemble tree-based models; and iv) interpreting the model by providing explanations. The development of explainable models can lead to important advantages: the model can be trusted, the transparency of the model helps in understanding the underlying mechanisms that make it work, and opaque models can be interpreted without sacrificing their predictive performance.
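Steps ii) and iii) of the methodology can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation and does not use their datasets: the synthetic data, the two features (absences, study hours), and every function name below are assumptions made for illustration. A depth-1 decision tree (a "stump") stands in for Decision Tree, and majority-voted bootstrap copies of it stand in for Bagging. Both are scored with k-fold cross-validation, and one hyperparameter (the number of bagged estimators) is tuned over a small grid.

```python
import random
from collections import Counter


def make_data(n=200, seed=0):
    """Synthetic stand-in for a student dataset (hypothetical features)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        absences = rng.randint(0, 30)
        study_hours = rng.randint(0, 10)
        # Noisy pass/fail label: more study helps, more absences hurt.
        label = 1 if 3 * study_hours - absences + rng.gauss(0, 3) > 0 else 0
        rows.append(((absences, study_hours), label))
    return rows


def fit_stump(rows):
    """Depth-1 decision tree: the single split with the fewest training errors."""
    best = None
    for f in range(len(rows[0][0])):
        for t in sorted({x[f] for x, _ in rows}):
            left = [y for x, y in rows if x[f] <= t]
            right = [y for x, y in rows if x[f] > t]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    _, f, t, l_lab, r_lab = best
    return lambda x: l_lab if x[f] <= t else r_lab


def fit_bagging(rows, n_estimators=15, seed=0):
    """Bagging: majority vote over stumps trained on bootstrap resamples."""
    rng = random.Random(seed)
    stumps = [fit_stump([rng.choice(rows) for _ in rows]) for _ in range(n_estimators)]
    return lambda x: Counter(s(x) for s in stumps).most_common(1)[0][0]


def cross_val_accuracy(fit, rows, k=5):
    """Mean accuracy over k folds; each fold is held out exactly once."""
    folds = [rows[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        model = fit(train)
        scores.append(sum(model(x) == y for x, y in folds[i]) / len(folds[i]))
    return sum(scores) / k


data = make_data()

# Step ii: tune one hyperparameter (ensemble size) by cross-validation.
grid = {
    n: cross_val_accuracy(lambda tr, n=n: fit_bagging(tr, n_estimators=n), data)
    for n in (5, 15, 25)
}
best_n = max(grid, key=grid.get)

# Step iii: compare the models on the same folds.
results = {
    "decision stump": cross_val_accuracy(fit_stump, data),
    f"bagging (n={best_n})": grid[best_n],
}
for name, acc in results.items():
    print(f"{name}: {acc:.3f}")
```

Because the ensemble size is chosen on the same folds used to report its score, the bagging figure here is slightly optimistic; a nested cross-validation or a separate held-out set would give a fairer comparison. Which model wins depends on the data, so this sketch says nothing about the ranking reported in the abstract.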

