OpinionIt: a text mining system for cross-lingual opinion analysis

Proceedings of the 19th ACM International Conference on Information and Knowledge Management. Pub Date: 2010-10-26. DOI: 10.1145/1871437.1871589

Honglei Guo, Huijia Zhu, Zhili Guo, Xiaoxun Zhang, Zhong Su


Citations: 28

Abstract

Opinion mining focuses on extracting customers' opinions from reviews and predicting their sentiment orientation. Reviewers usually praise a product in some aspects and criticize it in others. With business globalization, it is important for enterprises to extract opinions toward different aspects and to identify cross-lingual and cross-cultural differences in opinions. Cross-lingual opinion mining is a challenging task because large amounts of opinions are written in different languages and are not well structured. Since people usually use different words to describe the same aspect in reviews, product-feature (PF) categorization becomes critical in cross-lingual opinion mining. Manual cross-lingual PF categorization is time-consuming and practically infeasible for the massive amount of data written in different languages. To find cross-lingual differences in opinions effectively, we present an aspect-oriented opinion mining method with Cross-lingual Latent Semantic Association (CLaSA). We first construct a CLaSA model to learn the cross-lingual latent semantic association among all the PFs from multi-dimensional semantic clues in the review corpus. We then employ the CLaSA model to categorize all the multilingual PFs into semantic aspects and to summarize cross-lingual differences in opinions toward different aspects. Experimental results show that our method achieves better performance than existing approaches. With the CLaSA model, our text mining system OpinionIt can effectively discover cross-lingual differences in opinions.
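
The abstract does not specify how the CLaSA model is built, so the following is only a toy sketch of the overall pipeline it describes: group multilingual product features (PFs) into aspects, then compare opinion polarity per aspect across languages. It is not the authors' method. In place of a latent semantic model it substitutes plain cosine similarity over co-occurring context clues with greedy clustering. The data, the function names (`context_vectors`, `categorize`, `summarize`), the similarity threshold, and the assumption that context clues are already mapped to a shared vocabulary are all hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy data: (language, product feature, context clue, polarity).
# Assumption: context clues are already mapped to a shared vocabulary
# (for example via a bilingual dictionary); that step is not shown here.
REVIEWS = [
    ("en", "battery", "power", +1),
    ("en", "battery life", "power", +1),
    ("en", "battery life", "hours", +1),
    ("zh", "电池", "power", -1),
    ("zh", "电池", "hours", -1),
    ("en", "screen", "display", +1),
    ("zh", "屏幕", "display", +1),
    ("zh", "屏幕", "bright", +1),
    ("en", "screen", "bright", -1),
]


def context_vectors(reviews):
    """Count, for each product feature, the context clues it co-occurs with."""
    vectors = defaultdict(Counter)
    for _lang, feature, clue, _polarity in reviews:
        vectors[feature][clue] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def categorize(vectors, threshold=0.5):
    """Greedy clustering: a feature joins the first aspect whose seed
    (first member) is similar enough, otherwise it starts a new aspect."""
    aspects = []
    for feature in sorted(vectors):
        for aspect in aspects:
            if cosine(vectors[feature], vectors[aspect[0]]) >= threshold:
                aspect.append(feature)
                break
        else:
            aspects.append([feature])
    return aspects


def summarize(reviews, aspects):
    """Sum polarity per (aspect index, language) pair."""
    aspect_of = {f: i for i, members in enumerate(aspects) for f in members}
    totals = defaultdict(int)
    for lang, feature, _clue, polarity in reviews:
        totals[(aspect_of[feature], lang)] += polarity
    return dict(totals)


if __name__ == "__main__":
    aspects = categorize(context_vectors(REVIEWS))
    for (idx, lang), score in sorted(summarize(REVIEWS, aspects).items()):
        print(aspects[idx], lang, score)
```

On this toy data the English and Chinese battery terms fall into one aspect and the screen terms into another, and the per-language totals expose where the two reviewer groups disagree. A real system would need a learned cross-lingual representation rather than a hand-aligned clue vocabulary.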



