{"title":"中文评论无监督情感分类的实证研究","authors":"Zhai Zhongwu (翟忠武), Xu Hua (徐 华), Jia Peifa (贾培发)","doi":"10.1016/S1007-0214(10)70118-8","DOIUrl":null,"url":null,"abstract":"<div><p>This paper is an empirical study of unsupervised sentiment classification of Chinese reviews. The focus is on exploring the ways to improve the performance of the unsupervised sentiment classification based on limited existing sentiment resources in Chinese. On the one hand, all available Chinese sentiment lexicons — individual and combined — are evaluated under our proposed framework. On the other hand, the domain dependent sentiment noise words are identified and removed using unlabeled data, to improve the classification performance. To the best of our knowledge, this is the first such attempt. Experiments have been conducted on three open datasets in two domains, and the results show that the proposed algorithm for sentiment noise words removal can improve the classification performance significantly.</p></div>","PeriodicalId":60306,"journal":{"name":"Tsinghua Science and Technology","volume":null,"pages":null},"PeriodicalIF":5.2000,"publicationDate":"2010-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/S1007-0214(10)70118-8","citationCount":"12","resultStr":"{\"title\":\"An Empirical Study of Unsupervised Sentiment Classification of Chinese Reviews\",\"authors\":\"Zhai Zhongwu (翟忠武), Xu Hua (徐 华), Jia Peifa (贾培发)\",\"doi\":\"10.1016/S1007-0214(10)70118-8\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>This paper is an empirical study of unsupervised sentiment classification of Chinese reviews. The focus is on exploring the ways to improve the performance of the unsupervised sentiment classification based on limited existing sentiment resources in Chinese. On the one hand, all available Chinese sentiment lexicons — individual and combined — are evaluated under our proposed framework. On the other hand, the domain dependent sentiment noise words are identified and removed using unlabeled data, to improve the classification performance. To the best of our knowledge, this is the first such attempt. Experiments have been conducted on three open datasets in two domains, and the results show that the proposed algorithm for sentiment noise words removal can improve the classification performance significantly.</p></div>\",\"PeriodicalId\":60306,\"journal\":{\"name\":\"Tsinghua Science and Technology\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":5.2000,\"publicationDate\":\"2010-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://sci-hub-pdf.com/10.1016/S1007-0214(10)70118-8\",\"citationCount\":\"12\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Tsinghua Science and Technology\",\"FirstCategoryId\":\"1093\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1007021410701188\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Tsinghua Science and Technology","FirstCategoryId":"1093","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1007021410701188","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
An Empirical Study of Unsupervised Sentiment Classification of Chinese Reviews
This paper is an empirical study of unsupervised sentiment classification of Chinese reviews. The focus is on exploring the ways to improve the performance of the unsupervised sentiment classification based on limited existing sentiment resources in Chinese. On the one hand, all available Chinese sentiment lexicons — individual and combined — are evaluated under our proposed framework. On the other hand, the domain dependent sentiment noise words are identified and removed using unlabeled data, to improve the classification performance. To the best of our knowledge, this is the first such attempt. Experiments have been conducted on three open datasets in two domains, and the results show that the proposed algorithm for sentiment noise words removal can improve the classification performance significantly.