On The Feature Extraction For Sentiment Analysis of Movie Reviews Based on SVM

Fitria Cahyanti, Adiwijaya, S. A. Faraby
Published in: 2020 8th International Conference on Information and Communication Technology (ICoICT)
Publication date: 2020-06-01
DOI: 10.1109/ICoICT49345.2020.9166397
URL: https://doi.org/10.1109/ICoICT49345.2020.9166397
Citations: 6

Abstract

Watching movies is one way to relieve boredom, and viewers often look for information about a movie, typically in the form of movie reviews, to decide whether it is worth watching. Searching through reviews is difficult, however, because reviewers produce so many of them. Sentiment analysis addresses this by classifying movie reviews into positive and negative sentiment. Among machine learning methods, the Support Vector Machine (SVM) is chosen as the classifier because it can produce the best performance, which is why this research applies SVM classification to movie review data. Term Frequency-Inverse Document Frequency (TF-IDF) is used as a word-weighting feature extraction method and is combined with features extracted by Latent Dirichlet Allocation (LDA), a topic modeling method that can overcome the shortcomings of SVM. The best performance, 82.16%, was obtained with the combination of TF-IDF and LDA using 240 topics, which yields 29792 features.
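
The feature construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that TF-IDF uses raw term counts with idf = ln(N / df), and that "combining" TF-IDF with LDA means appending each review's topic proportions to its TF-IDF vector; the abstract confirms neither detail. The function names `tfidf_vectors` and `combine`, the toy reviews, and the topic proportions are all hypothetical, and the LDA and SVM steps themselves are not shown.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) where each weight is tf * idf.

    Assumption: tf is the raw term count and idf = ln(N / df).
    Libraries often add smoothing or normalization on top of this.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    idf = {tok: math.log(n_docs / df[tok]) for tok in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[tok] * idf[tok] for tok in vocab])
    return vocab, vectors


def combine(tfidf_vec, topic_vec):
    """Append topic proportions to a TF-IDF vector (assumed combination)."""
    return list(tfidf_vec) + list(topic_vec)


# Toy reviews, invented for illustration only.
reviews = [
    "great movie great acting",
    "boring movie bad acting",
    "great plot",
]
vocab, tfidf = tfidf_vectors(reviews)

# Hypothetical LDA output: one topic-proportion vector per review (2 topics).
# In practice these would come from a fitted topic model.
topic_props = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]]

# Each combined vector has len(vocab) + number_of_topics entries and
# would then be passed to an SVM classifier.
features = [combine(t, p) for t, p in zip(tfidf, topic_props)]
print(len(vocab), len(features[0]))
```

Under the same concatenation assumption, the reported 29792 features with 240 topics would correspond to 29552 TF-IDF terms plus 240 topic proportions.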