Optimizing Random Forest Algorithm to Classify Player's Memorisation via In-game Data

Knowledge Engineering and Data Science, Pub Date: 2023-10-02, DOI: 10.17977/um018v6i12023p103-113

Akmal Vrisna Alzuhdi, Harits Ar Rosyid, M. Chuttur, Shah Nazir



Abstract

Assessing a player's knowledge in educational games is a well-established practice. However, traditional evaluation during or around a gaming session can disrupt the player's immersion. This research uses an optimized Random Forest (RF) to predict, non-invasively, a player's Memorization in an educational game from in-game data. First, we obtained the dataset from a 3-month survey that recorded the in-game data of 50 players, each of whom played 4-15 stages of Chem Fight (the test-case game). Next, we generated three dataset variants through preprocessing: resampling (SMOTE), normalization (min-max), and a combination of resampling and normalization. We then trained and optimized three RF classifiers to predict the player's Memorization. We chose RF because it generalizes well on high-dimensional data, and we optimized it through a single hyperparameter, n_estimators (the number of trees). We used Grid Search Cross Validation (GSCV) to identify the best value of n_estimators. We then used the statistics of the GSCV results to reduce n_estimators, and with it the size of the model, by observing the region of interest in the classifiers' performance graphs. Overall, the classifiers fitted with the BEST n_estimators from GSCV (89, 31, 89, and 196 trees) performed well, with around 80% accuracy. Moreover, we identified smaller values of n_estimators (OPTIMAL), each at most half of the corresponding BEST value. All classifiers were retrained with the OPTIMAL n_estimators (37, 12, 37, and 41 trees), and their performance remained relatively steady at ~80%. This means that we successfully optimized the Random Forest for predicting a player's Memorization while playing Chem Fight. The automated technique presented in this paper can monitor student interactions and evaluate their abilities based on in-game data. As such, it can offer objective data about the skills used.
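The search described in the abstract can be illustrated with a minimal sketch, assuming scikit-learn and a synthetic stand-in dataset (the Chem Fight data is not reproduced here). It runs a grid search over n_estimators on a min-max-normalized pipeline, then picks the smallest forest whose mean cross-validated accuracy stays within a tolerance of the best one. The helper `smallest_good_n_estimators`, the tolerance value, and the candidate grid are hypothetical; the paper identifies its OPTIMAL values by inspecting performance graphs, not necessarily by this rule. SMOTE resampling is left out so the sketch depends only on scikit-learn.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler


def smallest_good_n_estimators(grid, tolerance=0.01):
    """Smallest n_estimators whose mean CV accuracy is within `tolerance`
    of the best mean CV accuracy (a hypothetical selection rule)."""
    results = grid.cv_results_
    sizes = np.array([p["rf__n_estimators"] for p in results["params"]])
    scores = np.asarray(results["mean_test_score"])
    good = sizes[scores >= scores.max() - tolerance]
    return int(good.min())


# Synthetic, mildly imbalanced stand-in for the in-game dataset (assumption).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=8,
    weights=[0.7, 0.3], random_state=0,
)

# Min-max normalization followed by a Random Forest classifier.
pipeline = Pipeline([
    ("scale", MinMaxScaler()),
    ("rf", RandomForestClassifier(random_state=0)),
])

# Grid search with cross-validation over the number of trees only.
grid = GridSearchCV(
    pipeline,
    param_grid={"rf__n_estimators": [5, 10, 20, 40, 80]},
    scoring="accuracy",
    cv=5,
)
grid.fit(X, y)

best_n = grid.best_params_["rf__n_estimators"]      # analogous to "BEST"
optimal_n = smallest_good_n_estimators(grid, 0.01)  # analogous to "OPTIMAL"
print(f"BEST n_estimators: {best_n}, mean CV accuracy: {grid.best_score_:.3f}")
print(f"Smallest n_estimators within tolerance: {optimal_n}")
```

Because the best-scoring candidate always satisfies the tolerance test, the selected value can never exceed the BEST one. A wider tolerance trades a little accuracy for a smaller, cheaper forest.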

