Investigating Computational Identity and Empowerment of The Students Studying Programming: A Text Mining Study

Necmettin Erbakan Universitesi Eregli Egitim Fakultesi, Necmettin Erbakan University. Pub Date: 2023-06-30. DOI: 10.51119/ereegf.2023.29

Nilüfer Atman Uslu, Aytuğ Onan



Abstract

This study aims to predict, with text mining algorithms, the data obtained from the answers given to open-ended questions by students receiving programming education. Text-based data on computational identity and programming empowerment were analyzed, and the performance of different algorithms was compared. The participants were 646 students aged 12-20 who were receiving programming education. An electronic form was prepared to collect their opinions; it contained six open-ended questions, three on computational identity and three on programming empowerment. The data set was analyzed following the text mining process, and the analyses were carried out in Python 3.8. Two word representation methods, Word2vec (W2v) and Term Frequency-Inverse Document Frequency (TF-IDF), were compared in combination with five machine learning algorithms: (a) logistic regression, (b) decision tree, (c) support vector machine, (d) random forest, and (e) neural network. For computational identity, the highest prediction accuracy (93%) was obtained with the artificial neural network (TF-IDF) and logistic regression (TF-IDF). For programming empowerment, logistic regression (TF-IDF) reached the highest prediction accuracy (96%), followed by random forest (TF-IDF), support vector machine (TF-IDF), and artificial neural network (TF-IDF), each at 94%. That these values are above 90% indicates that the prediction performance is sufficient.
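The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' code: the toy answers, the binary labels, the function name `compare_classifiers`, the hyperparameters, and the 3-fold cross-validation are all assumptions, and the Word2vec representation is omitted for brevity. It pairs a TF-IDF representation with each of the five classifier families named in the abstract and reports mean accuracy.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy answers and labels (1 = positive, 0 = negative).
# These are NOT the study's data; they only make the sketch runnable.
texts = [
    "I enjoy writing programs and solving problems",
    "coding makes me feel confident and capable",
    "I see myself as a programmer in the future",
    "programming helps me build useful things",
    "I like learning new programming languages",
    "I feel proud when my code works",
    "programming is boring and too difficult for me",
    "I do not think I can ever learn coding",
    "I feel lost when I try to write code",
    "coding is not for people like me",
    "I dislike debugging and give up quickly",
    "I do not see myself working with computers",
]
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]


def compare_classifiers(texts, labels, cv=3):
    """Return mean cross-validated accuracy per classifier on TF-IDF features."""
    models = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "decision_tree": DecisionTreeClassifier(random_state=0),
        "svm": LinearSVC(),
        "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
        "neural_network": MLPClassifier(
            hidden_layer_sizes=(16,), max_iter=1000, random_state=0
        ),
    }
    results = {}
    for name, clf in models.items():
        # The vectorizer sits inside the pipeline, so it is refit on each
        # training fold and no vocabulary leaks from the held-out fold.
        pipeline = make_pipeline(TfidfVectorizer(), clf)
        scores = cross_val_score(pipeline, texts, labels, cv=cv, scoring="accuracy")
        results[name] = float(scores.mean())
    return results


results = compare_classifiers(texts, labels)
for name, accuracy in sorted(results.items(), key=lambda item: -item[1]):
    print(f"{name}: {accuracy:.2f}")
```

With only twelve invented sentences the printed accuracies say nothing about the study's findings; the point is the structure, in which one representation is held fixed and the classifier is varied under the same evaluation protocol.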
