Tackling the Problem of Class Imbalance in Multi-class Sentiment Classification: An Experimental Study

Impact factor: 1.8 | JCR quartile: Q3 | Category: Computer Science, Artificial Intelligence
Mateusz Lango
Foundations of Computing and Decision Sciences, vol. 44, no. 1, pp. 151-178. Published 2019-06-01. DOI: 10.2478/fcds-2019-0009 (https://doi.org/10.2478/fcds-2019-0009)
Citations: 18

Tackling the Problem of Class Imbalance in Multi-class Sentiment Classification: An Experimental Study
Abstract

Sentiment classification is an important task that has gained extensive attention in both academia and industry. Many issues related to this task, such as the handling of negation or sarcastic utterances, have been analyzed and addressed in previous work. However, the issue of class imbalance, which often compromises the predictive capabilities of learning algorithms, has scarcely been studied. In this work, we aim to bridge the gap between imbalanced learning and sentiment analysis. An experimental study covering twelve imbalanced learning preprocessing methods, four feature representations, and a dozen datasets is carried out to analyze the usefulness of imbalanced learning methods for sentiment classification. Moreover, the data difficulty factors commonly studied in imbalanced learning are investigated on sentiment corpora to evaluate the impact of class imbalance.
Source journal
Foundations of Computing and Decision Sciences (Computer Science, Artificial Intelligence)
CiteScore: 2.20
Self-citation rate: 9.10%
Publication volume: 16
Review time: 29 weeks