Title: Indonesian Text Dataset for Determining Sentiment Classification Using Mechine Learning Approach
Authors: I. Syahputra, Tulus Tulus, S. Efendi
DOI: 10.31289/jite.v3i2.3153 (https://doi.org/10.31289/jite.v3i2.3153)
Published: 2020-01-20
Publication type: Journal Article
Abstract
Advances in information technology and the rapid growth of online media have produced a practically unlimited volume of textual information, creating a need to present that information without reducing its value. A dataset is a basic concept common to almost every discipline: it provides the empirical foundation for research activities. Sentiment analysis examines opinions or feelings about an issue, or identifies and classifies information trends related to it. In this work, dataset analysis for sentiment classification refers to a sentiment classification model tied to the dataset and built with supervised machine learning, which learns from labeled input data to predict outputs. The experiments and tests show that supervised machine learning techniques can classify the sentiment of tweet text well, although accuracy can still be improved. The reported accuracy figures are: baseline, 100 (days) and 83 (weeks); Naive Bayes, 100 (days) and 82 (weeks); maximum entropy (MaxEnt), 100 (days) and 83 (weeks); and SVM, 100 (days) and 83 (weeks).
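To make the supervised setup concrete, the following is a minimal sketch of one of the techniques named in the abstract, a multinomial Naive Bayes classifier with add-one smoothing. It is not the authors' implementation: the whitespace tokenizer, the function names (`tokenize`, `train_naive_bayes`, `predict`), and the four Indonesian training snippets are all assumptions invented for illustration, and the toy data is not drawn from the paper's tweet dataset.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def train_naive_bayes(examples):
    """Count documents per label and word occurrences per label."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        label_counts[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocabulary.add(token)
    return label_counts, word_counts, vocabulary


def predict(model, text):
    """Return the label with the highest log posterior (add-one smoothing)."""
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, doc_count in label_counts.items():
        score = math.log(doc_count / total_docs)  # log prior
        denom = sum(word_counts[label].values()) + len(vocabulary)
        for token in tokenize(text):
            if token in vocabulary:  # skip words never seen in training
                score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: invented snippets, not taken from the paper.
TRAINING_DATA = [
    ("pelayanan bagus dan cepat", "positive"),
    ("saya suka produk ini", "positive"),
    ("pelayanan buruk dan lambat", "negative"),
    ("saya kecewa dengan produk ini", "negative"),
]

model = train_naive_bayes(TRAINING_DATA)
print(predict(model, "pelayanan bagus"))
print(predict(model, "saya kecewa"))
```

In a real experiment the labeled tweets would be split into training and test sets, and accuracy would be measured on the held-out portion; MaxEnt and SVM classifiers could be compared on the same split.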