Bootstrapping both Product Properties and Opinion Words from Chinese Reviews with Cross-Training
Bo Wang, Houfeng Wang
IEEE/WIC/ACM International Conference on Web Intelligence (WI'07), published 2007-11-02
DOI: 10.1109/WI.2007.32 (https://doi.org/10.1109/WI.2007.32)
Citations: 27
Abstract
We investigate the problem of identifying both product properties and opinion words in sentences in a unified process when only a very small labeled corpus is available. A Naive Bayesian method is used in this process. Specifically, given that product properties and opinion words usually co-occur with high frequency in product review articles, a cross-training method is proposed to bootstrap both of them, in which the two sub-tasks are boosted by each other iteratively. Experimental results show that, with a very small labeled corpus, cross-training can produce both product properties and opinion words that are very close to what Naive Bayesian classifiers can produce with a large labeled corpus.
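The iterative cross-training idea in the abstract can be sketched as follows. This is a simplified illustration, not the paper's actual method: it uses raw co-occurrence counts with a fixed threshold in place of Naive Bayesian scoring, and invented toy English data in place of segmented Chinese review sentences. The two sub-tasks boost each other: words co-occurring with known opinion words become property candidates, and words co-occurring with known properties become opinion candidates.

```python
from collections import Counter  # not strictly needed here; useful for real frequency counts

# Toy review sentences as word lists; a real system would use segmented Chinese text.
sentences = [
    ["screen", "clear"],
    ["screen", "bright"],
    ["battery", "durable"],
    ["screen", "durable"],
    ["price", "cheap"],
]

properties = {"screen"}  # seed product-property words
opinions = {"clear"}     # seed opinion words

def cooccur(word, targets, sents):
    """Count sentences where `word` appears together with any target word."""
    return sum(1 for s in sents if word in s and targets & set(s))

for _ in range(3):  # a few bootstrap iterations
    vocab = {w for s in sentences for w in s}
    unlabeled = vocab - properties - opinions
    # Opinion words that co-occur with known properties are promoted,
    # and vice versa; a threshold of 1 stands in for a classifier's
    # confidence cutoff in this toy setting.
    new_opinions = {w for w in unlabeled if cooccur(w, properties, sentences) >= 1}
    new_properties = {w for w in unlabeled if cooccur(w, opinions, sentences) >= 1}
    opinions |= new_opinions
    properties |= new_properties

print(sorted(properties))  # properties grow from the opinion side, and vice versa
print(sorted(opinions))
```

Note how "battery" is only labeled in the second iteration: it never co-occurs with the seed opinion "clear", but once "durable" has been promoted to an opinion word via "screen", the sentence ["battery", "durable"] pulls "battery" in as a property. Disconnected words such as "price" and "cheap" remain unlabeled, which is where a real system's larger corpus and probabilistic scoring would matter.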