Machine Learning based Data Analysis for Wordle Game
Zidong Wu
Frontiers in Computing and Intelligent Systems, published 2023-07-20
DOI: 10.54097/fcis.v4i3.11237 (https://doi.org/10.54097/fcis.v4i3.11237)
Citation count: 0
Abstract
As Wordle grows in popularity worldwide, its intrinsic patterns and solving strategies have become a popular research topic. In this paper, we obtained daily Wordle result files covering a period of time, which include the word of the day, the number of people who reported their scores that day, and the number of players who played in hard mode, and we analyzed these data. To predict the results reported on a future day, we first remove abnormal data from the dataset and then build an LSTM model and an ARIMA model. On the test set, the LSTM model achieves a MAPE (mean absolute percentage error) of 5.643 and performs significantly better than the ARIMA model. Second, to predict the distribution of results for a word on a given future date, we first apply PCA dimensionality reduction to the 12 features of the network training data; a Shapiro-Wilk normality test on the correlated percentages then showed that the percentages were consistent with time. Based on this observation, we built a BP-LSTM parallel model that extracts word attributes and the temporal features of the percentages. The model has a MAPE of 6.09, which outperforms a BP neural network that can only extract word attributes.
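Both models are compared by MAPE, so it helps to see how that metric is usually computed. The following is a minimal sketch under stated assumptions, not the paper's own code: the function name `mape`, the percent scaling, and the sample numbers are hypothetical, and the abstract does not say whether its reported values (5.643 and 6.09) are on a percent scale.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent.

    Assumes the common definition: 100 * mean(|actual - predicted| / |actual|).
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one observation is required")

    total = 0.0
    for a, p in zip(actual, predicted):
        if a == 0:
            # The relative error is undefined when the true value is zero.
            raise ValueError("actual values must be non-zero")
        total += abs((a - p) / a)
    return 100.0 * total / len(actual)


# Hypothetical daily report counts and forecasts (illustrative only,
# not taken from the paper's dataset).
observed = [100.0, 200.0]
forecast = [110.0, 180.0]
error_percent = mape(observed, forecast)
```

Because each error is divided by the true value, MAPE is scale-free, which makes it convenient for comparing a count forecast (number of reported results) with a percentage-distribution forecast. It is undefined when a true value is zero, which the sketch guards against.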