Unsupervised feature learning for sentiment classification of short documents

J. Lang. Technol. Comput. Linguistics. Pub Date: 2014-07-01. DOI: 10.21248/jlcl.29.2014.180

S. Albertini, Alessandro Zamberletti, I. Gallo


Citations: 6

Abstract



The rapid growth of Web information has led to an increasing amount of user-generated content, such as customer reviews of products, forum posts and blogs. In this paper we address the task of assigning a sentiment polarity to user-generated short documents, to determine whether each of them communicates a positive or negative judgment about a subject. The method we propose exploits a Growing Hierarchical Self-Organizing Map to obtain a sparse encoding of user-generated content. The encoded documents are subsequently given as input to a Support Vector Machine classifier that assigns them a polarity label. Unlike other works on opinion mining, our model does not use a priori hypotheses involving special words, phrases or language constructs typical of certain domains. Using a dataset composed of customer reviews of products, the experimental results we obtain are close to those achieved by other recent works.
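The two-stage idea in the abstract (unsupervised sparse encoding, then a supervised polarity classifier) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a flat one-dimensional self-organizing map in place of the Growing Hierarchical Self-Organizing Map, a one-hot best-matching-unit code as the sparse encoding, and a Pegasos-style linear SVM without a bias term. The four toy reviews, all function names, and all parameter values are hypothetical.

```python
import random

def bag_of_words(docs):
    """Map each document to a term-count vector over the corpus vocabulary."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for d in docs:
        v = [0.0] * len(vocab)
        for w in d.lower().split():
            v[index[w]] += 1.0
        vectors.append(v)
    return vectors

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_som(data, n_units=4, epochs=50, lr=0.5, seed=0):
    """Unsupervised step: fit a flat 1-D SOM (assumption: no growth, no hierarchy)."""
    rng = random.Random(seed)
    units = [list(rng.choice(data)) for _ in range(n_units)]
    for epoch in range(epochs):
        rate = lr * (1.0 - epoch / epochs)  # linearly decaying learning rate
        for x in rng.sample(data, len(data)):
            bmu = min(range(n_units), key=lambda k: sq_dist(units[k], x))
            for k in range(n_units):
                # Neighbourhood: the winner moves fully, adjacent units by half.
                h = 1.0 if k == bmu else 0.5 if abs(k - bmu) == 1 else 0.0
                units[k] = [u + rate * h * (xi - u) for u, xi in zip(units[k], x)]
    return units

def encode(x, units):
    """Sparse code: one-hot vector marking the best-matching unit."""
    bmu = min(range(len(units)), key=lambda k: sq_dist(units[k], x))
    return [1.0 if k == bmu else 0.0 for k in range(len(units))]

def train_linear_svm(X, y, epochs=200, lam=0.01, seed=0):
    """Supervised step: Pegasos-style sub-gradient descent on the hinge loss."""
    rng = random.Random(seed)
    w, t = [0.0] * len(X[0]), 0
    for _ in range(epochs):
        for i in rng.sample(range(len(X)), len(X)):
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [wj * (1.0 - eta * lam) for wj in w]  # regularization shrinkage
            if margin < 1.0:  # example violates the margin: take a hinge step
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def predict(x, w):
    """Return +1 (positive) or -1 (negative) from the sign of the linear score."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1

# Hypothetical toy corpus; labels: +1 positive, -1 negative.
docs = ["great phone love it", "love the great screen",
        "terrible battery hate it", "hate the terrible screen"]
labels = [1, 1, -1, -1]

vectors = bag_of_words(docs)
units = train_som(vectors)                      # labels are not used here
codes = [encode(v, units) for v in vectors]     # sparse document encodings
weights = train_linear_svm(codes, labels)       # labels are used only here
preds = [predict(c, weights) for c in codes]
print(codes, preds)
```

Only the classifier sees the polarity labels; the map is fitted on raw term counts alone, which mirrors the abstract's point that no domain-specific words or phrases are assumed in advance.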
