Machine Learning based Data Analysis for Wordle Game

Frontiers in Computing and Intelligent Systems, Pub Date: 2023-07-20, DOI: 10.54097/fcis.v4i3.11237

Zidong Wu



Abstract

As Wordle grows more popular worldwide, its intrinsic patterns and solving techniques have become a popular research topic. In this paper, we obtained daily Wordle result files covering a period of time, which include the word of the day, the number of players who reported their scores that day, and the number of players who played in hard mode, and we analyzed these data further. To predict the reported results for a future day, we first remove abnormal records from the dataset and then build an LSTM model and an ARIMA model. On the test set, the LSTM model achieves a MAPE (mean absolute percentage error) of 5.643 and performs significantly better than the ARIMA model. Second, to predict the distribution of results for a word on a given future date, we first apply PCA dimensionality reduction to the 12 features of the network training data; a Shapiro-Wilk normality test on the related percentages then indicated that the percentages are consistent over time. Based on this observation, we built a parallel BP-LSTM model that extracts both word attributes and the features of the percentages over time. The model has a MAPE of 6.09, outperforming a BP neural network that can extract only word attributes.
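Both models in the abstract are compared by MAPE. As a minimal sketch of how that metric is usually computed, the snippet below assumes the reported values (5.643 and 6.09) are percentages. The data and the names `mape`, `forecast_a`, and `forecast_b` are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, expressed in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    # Average the relative error |a - p| / |a| over all days, then scale to percent.
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical daily report counts and two hypothetical forecasts (illustrative only).
actual = [100.0, 200.0, 400.0]
forecast_a = [110.0, 190.0, 400.0]  # relative errors: 10%, 5%, 0%
forecast_b = [120.0, 160.0, 440.0]  # relative errors: 20%, 20%, 10%

mape_a = mape(actual, forecast_a)
mape_b = mape(actual, forecast_b)

# The forecast with the lower MAPE is the better one under this metric.
better = "forecast_a" if mape_a < mape_b else "forecast_b"
```

Because MAPE divides by the actual value, it is scale-free, which makes it convenient for comparing forecasts of player counts and of percentage distributions alike. It is undefined when an actual value is zero, so the sketch rejects such input instead of guessing.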


