Implementation of Support Vector Machine Algorithm with Correlation-Based Feature Selection and Term Frequency Inverse Document Frequency for Sentiment Analysis Review Hotel
Authors: Novia Puji Ririanti, A. Purwinarko
Journal: Scientific Journal of Informatics
Published: 2021-11-30
DOI: https://doi.org/10.15294/sji.v8i2.29992
Purpose: The study aims to reduce the number of irrelevant features in sentiment analysis on data with a large number of features. Methods/Study design/approach: The Support Vector Machine (SVM) algorithm is used to classify the sentiment of hotel reviews because it has advantages in processing large datasets. Term Frequency-Inverse Document Frequency (TF-IDF) is used to assign weights to the features in the dataset. Result/Findings: SVM with TF-IDF achieves an accuracy of 93.14%; adding CFS to SVM with TF-IDF raises the accuracy on hotel review classification by 1.18 percentage points, from 93.14% to 94.32%. Novelty/Originality/Value: Correlation-Based Feature Selection (CFS) is used for the feature selection process; it reduces the number of irrelevant features by ranking feature subsets according to the strength of the correlation values of their features.
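The weighting and selection steps named in the abstract can be illustrated with a small, standard-library Python sketch. It is not the authors' implementation: the abstract does not state the TF-IDF variant, the correlation measure, or the subset search strategy. The sketch assumes a plain `tf * log(N / df)` weighting, absolute Pearson correlation as a stand-in correlation measure, the commonly cited CFS merit `k * r_cf / sqrt(k + k * (k - 1) * r_ff)`, and a greedy forward search. The function names (`tfidf`, `pearson`, `cfs_merit`, `greedy_cfs`) and the four toy reviews are hypothetical.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return (vocab, rows): one TF-IDF weight per vocabulary term per document.

    Assumes every document is a non-empty list of tokens.
    """
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    idf = {tok: math.log(n_docs / doc_freq[tok]) for tok in vocab}
    rows = []
    for doc in docs:
        counts = Counter(doc)
        rows.append([counts[tok] / len(doc) * idf[tok] for tok in vocab])
    return vocab, rows


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def cfs_merit(columns, labels, subset):
    """CFS merit of a feature subset: high class correlation, low redundancy."""
    k = len(subset)
    # Mean absolute feature-class correlation.
    r_cf = sum(abs(pearson(columns[j], labels)) for j in subset) / k
    # Mean absolute feature-feature correlation over all pairs in the subset.
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    if pairs:
        r_ff = sum(abs(pearson(columns[a], columns[b])) for a, b in pairs) / len(pairs)
    else:
        r_ff = 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def greedy_cfs(columns, labels, tol=1e-12):
    """Forward selection: keep adding the feature that most improves the merit."""
    selected, best = [], 0.0
    remaining = set(range(len(columns)))
    while remaining:
        cand, cand_merit = max(
            ((j, cfs_merit(columns, labels, selected + [j])) for j in sorted(remaining)),
            key=lambda pair: pair[1],
        )
        if cand_merit <= best + tol:  # no real improvement: stop
            break
        selected.append(cand)
        best = cand_merit
        remaining.discard(cand)
    return selected, best


# Hypothetical toy reviews (1 = positive, 0 = negative), already tokenized.
docs = [
    "clean room friendly staff".split(),
    "friendly staff great breakfast".split(),
    "dirty room rude staff".split(),
    "rude staff noisy room".split(),
]
labels = [1, 1, 0, 0]

vocab, rows = tfidf(docs)
columns = [list(col) for col in zip(*rows)]  # one column per vocabulary term
selected, merit = greedy_cfs(columns, labels)
print([vocab[j] for j in selected], round(merit, 3))
```

In a full pipeline, the columns kept by `greedy_cfs` would be passed to an SVM classifier; that step is omitted here to keep the sketch free of third-party dependencies. A term such as "staff", which occurs in every toy review, receives a TF-IDF weight of zero and therefore cannot contribute to the selected subset.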