Sentiment classification of Chinese online reviews: analysing and improving supervised machine learning
Authors: P. Yin, Hongwei Wang, Lijuan Zheng
Journal: Int. J. Web Eng. Technol.
Published: 2012-12-01
DOI: 10.1504/IJWET.2012.050968 (https://doi.org/10.1504/IJWET.2012.050968)
As online reviews proliferate, large volumes of consumer opinion on products and services are generated and spread over the internet, and sentiment classification techniques have emerged to retrieve valuable information from them. This paper aims to improve sentiment classification of Chinese online reviews by analysing and improving each step of supervised machine learning. First, adjectives, adverbs, and verbs are selected as the initial text features. Then, three statistical methods, document frequency (DF), information gain (IG), and the chi-square statistic (CHI), are used to extract features. Finally, a Boolean method is applied to weight the features, and a support vector machine (SVM) is employed as the classifier. Comparative experiments are conducted on reviews from two domains: mobile phone (product) reviews and hotel (service) reviews. The experimental results indicate that part of speech (POS), the number of features, the evaluation domain, the feature extraction algorithm, and the SVM kernel function strongly influence sentiment classification, whereas the size of the training corpus has only a small impact. In addition, improvements to DF, IG, and CHI are proposed, which demonstrate the theoretical significance and practical value of this research.
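The feature steps named in the abstract (DF and CHI scoring, then Boolean weighting) can be illustrated with a minimal Python sketch. It uses the textbook definitions of document frequency and the chi-square statistic, not the improved variants the paper proposes, and it omits IG and the SVM step. The toy data, the English tokens standing in for segmented and POS-filtered Chinese words, and all function names are assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical toy corpus: (tokens, label), label 1 = positive, 0 = negative.
# Real Chinese reviews would first need word segmentation and POS tagging
# so that only adjectives, adverbs, and verbs are kept.
docs = [
    (["good", "fast", "like"], 1),
    (["good", "clear", "recommend"], 1),
    (["bad", "slow", "dislike"], 0),
    (["bad", "slow", "return"], 0),
]

def document_frequency(docs):
    """DF: number of documents in which each term occurs at least once."""
    df = Counter()
    for tokens, _ in docs:
        df.update(set(tokens))
    return df

def chi_square(docs, term, label=1):
    """Standard chi-square statistic from the 2x2 term/class contingency table."""
    a = sum(1 for t, y in docs if term in t and y == label)      # term, class
    b = sum(1 for t, y in docs if term in t and y != label)      # term, other class
    c = sum(1 for t, y in docs if term not in t and y == label)  # no term, class
    d = sum(1 for t, y in docs if term not in t and y != label)  # no term, other class
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - c * b) ** 2 / denom if denom else 0.0

def select_features(docs, k):
    """Keep the k terms with the highest CHI score (ties broken alphabetically)."""
    vocab = sorted(document_frequency(docs))
    ranked = sorted(vocab, key=lambda t: (-chi_square(docs, t), t))
    return ranked[:k]

def boolean_vector(tokens, features):
    """Boolean weighting: 1 if the feature occurs in the review, else 0."""
    present = set(tokens)
    return [1 if f in present else 0 for f in features]

features = select_features(docs, k=3)
vectors = [boolean_vector(tokens, features) for tokens, _ in docs]
print(features)
print(vectors)
```

The resulting Boolean vectors are the kind of input that could then be passed to an SVM classifier. The choice of kernel, which the abstract reports as influential, is left open here.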