Analisis Sentimen Data Twitter Tentang Pasangan Capres-Cawapres Pemilu 2019 Dengan Metode Lexicon Based Dan Support Vector Machine

Jurnal Ilmiah FIFO Pub Date : 2019-11-01 DOI:10.22441/FIFO.2019.V11I2.004

D. Seno, Arief Wibowo


Citations: 2

Abstract

The growth of written content on social media keeps producing new words and abbreviations on Twitter, which makes it increasingly difficult for sentiment analysis of Twitter text to reach high accuracy. This study analyzes sentiment toward the pairs of candidates for President and Vice President of Indonesia in the 2019 election. To obtain higher accuracy and to accommodate the evolving vocabulary of Twitter text, the authors combine an unsupervised method, Lexicon Based, with a supervised method, Support Vector Machine. The data consist of 800 data items collected from Twitter in October 2018 by searching for the names of each candidate pair. On these 800 items, the best accuracy, 92.5%, was obtained with 80% of the data used for training and 20% for testing; per-class Precision ranged from 85.7% to 97.2% and per-class Recall from 78.2% to 93.5%. Because the Lexicon Based method labels the dataset, the training data for the Support Vector Machine no longer has to be labeled manually, and the lexicon dictionary can be extended as the content on Twitter evolves.
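The labeling and splitting steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy lexicon, the three class names, the whitespace tokenizer, and the helper names `lexicon_label` and `split_80_20` are all assumptions made for illustration, since the abstract gives neither the dictionary nor the preprocessing. The Support Vector Machine training step is not shown; the sketch only produces the automatically labeled data and the 80%/20% partition that such a classifier would consume.

```python
import random

# Hypothetical toy lexicon (word -> polarity score). The dictionary used in
# the paper is not given in the abstract.
TOY_LEXICON = {
    "bagus": 1, "hebat": 1, "dukung": 1,     # good, great, support
    "buruk": -1, "gagal": -1, "bohong": -1,  # bad, fail, lie
}

def lexicon_label(text, lexicon=TOY_LEXICON):
    """Sum the polarity of known words and map the total to a class label.

    The three-class scheme is an assumption; the abstract only says
    "each class".
    """
    score = sum(lexicon.get(token, 0) for token in text.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def split_80_20(items, seed=0):
    """Shuffle a copy of items and split it into 80% training, 20% testing."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 8 // 10
    return shuffled[:cut], shuffled[cut:]

# Made-up example texts, labeled automatically instead of by hand.
texts = ["Program ini bagus dan hebat", "Janji itu bohong", "Debat malam ini"]
labeled = [(t, lexicon_label(t)) for t in texts]

# With 800 items, the 80/20 composition gives 640 training and 160 testing.
train, test = split_80_20(range(800))
```

Extending the dictionary, as the abstract suggests, amounts in this sketch to adding entries to `TOY_LEXICON`; the labeling function itself does not change.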

