Title: Extracting emotion topics from blog sentences: use of voting from multi-engine supervised classifiers
Authors: Dipankar Das, Sivaji Bandyopadhyay
Venue: SMUC '10
Published: 2010-10-30
DOI: 10.1145/1871985.1872004
URL: https://doi.org/10.1145/1871985.1872004
Citations: 17
Abstract
This paper presents a supervised multi-engine classifier approach, followed by voting, to identify emotion topics in English blog sentences. Manual annotation of the training-set sentences showed satisfactory inter-annotator agreement for emotion topic spans, with a kappa (κ) of 0.85 and a MASI (Measure of Agreement on Set-valued Items) score of 0.82. The baseline system, based on object-related dependency relations, uses the topic-oriented thematic roles present in the verb-based syntactic frames of the sentences. In contrast, the supervised approach combines three classifiers: a Conditional Random Field (CRF), a Support Vector Machine (SVM), and a Fuzzy Classifier (FC). Features are selected through an ablation study of all features and Information Gain Based Pruning (IGBP) on the development set. One or more emotion topics associated with the focused target span are identified by majority voting among the classifiers. On 500 gold-standard test sentences, the supervised multi-engine system achieved average F-scores of 70.51% for emotion topic identification and 90.44% for target span identification, outperforming the baseline system.
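The majority-voting step can be illustrated with a minimal sketch. The abstract does not specify the label scheme or how ties are resolved, so everything concrete here is an assumption: the BIO-style tags (`B-TOPIC`, `I-TOPIC`, `O`), the function name `majority_vote`, and the `fallback` tie-break rule are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]], fallback: int = 0) -> List[str]:
    """Combine per-token label sequences from several classifiers.

    predictions[i][j] is the label that classifier i assigns to token j.
    If no label is chosen by more than half of the classifiers, the label
    from the classifier at index `fallback` is used. This tie-break rule
    is an assumption made for illustration only.
    """
    if not predictions:
        return []
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all classifiers must label the same number of tokens")

    combined = []
    for j in range(length):
        votes = Counter(p[j] for p in predictions)
        label, count = votes.most_common(1)[0]
        if count * 2 > len(predictions):
            combined.append(label)  # strict majority
        else:
            combined.append(predictions[fallback][j])  # no majority: fall back
    return combined


# Hypothetical outputs of the three engines for a four-token sentence.
crf = ["O", "B-TOPIC", "I-TOPIC", "O"]
svm = ["O", "B-TOPIC", "O", "O"]
fc = ["O", "O", "I-TOPIC", "B-TOPIC"]

combined = majority_vote([crf, svm, fc])
print(combined)
```

With three engines, a strict majority means at least two of them agree on a token's label. The fallback only matters when all three disagree.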