情感分析的内容与语境:对微博的比较分析

HT ... : the proceedings of the ... ACM Conference on Hypertext and Social Media. ACM Conference on Hypertext and Social Media Pub Date : 2012-06-25 DOI:10.1145/2309996.2310028

F. Aisopos, G. Papadakis, K. Tserpes, T. Varvarigou

{"title":"情感分析的内容与语境:对微博的比较分析","authors":"F. Aisopos, G. Papadakis, K. Tserpes, T. Varvarigou","doi":"10.1145/2309996.2310028","DOIUrl":null,"url":null,"abstract":"Microblog content poses serious challenges to the applicability of traditional sentiment analysis and classification methods, due to its inherent characteristics. To tackle them, we introduce a method that relies on two orthogonal, but complementary sources of evidence: content-based features captured by n-gram graphs and context-based ones captured by polarity ratio. Both are language-neutral and noise-tolerant, guaranteeing high effectiveness and robustness in the settings we are considering. To ensure our approach can be integrated into practical applications with large volumes of data, we also aim at enhancing its time efficiency: we propose alternative sets of features with low extraction cost, explore dimensionality reduction and discretization techniques and experiment with multiple classification algorithms. We then evaluate our methods over a large, real-world data set extracted from Twitter, with the outcomes indicating significant improvements over the traditional techniques.","PeriodicalId":91270,"journal":{"name":"HT ... : the proceedings of the ... ACM Conference on Hypertext and Social Media. ACM Conference on Hypertext and Social Media","volume":"103 1","pages":"187-196"},"PeriodicalIF":0.0000,"publicationDate":"2012-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"67","resultStr":"{\"title\":\"Content vs. context for sentiment analysis: a comparative analysis over microblogs\",\"authors\":\"F. Aisopos, G. Papadakis, K. Tserpes, T. Varvarigou\",\"doi\":\"10.1145/2309996.2310028\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Microblog content poses serious challenges to the applicability of traditional sentiment analysis and classification methods, due to its inherent characteristics. To tackle them, we introduce a method that relies on two orthogonal, but complementary sources of evidence: content-based features captured by n-gram graphs and context-based ones captured by polarity ratio. Both are language-neutral and noise-tolerant, guaranteeing high effectiveness and robustness in the settings we are considering. To ensure our approach can be integrated into practical applications with large volumes of data, we also aim at enhancing its time efficiency: we propose alternative sets of features with low extraction cost, explore dimensionality reduction and discretization techniques and experiment with multiple classification algorithms. We then evaluate our methods over a large, real-world data set extracted from Twitter, with the outcomes indicating significant improvements over the traditional techniques.\",\"PeriodicalId\":91270,\"journal\":{\"name\":\"HT ... : the proceedings of the ... ACM Conference on Hypertext and Social Media. ACM Conference on Hypertext and Social Media\",\"volume\":\"103 1\",\"pages\":\"187-196\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2012-06-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"67\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"HT ... : the proceedings of the ... ACM Conference on Hypertext and Social Media. ACM Conference on Hypertext and Social Media\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2309996.2310028\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"HT ... : the proceedings of the ... ACM Conference on Hypertext and Social Media. ACM Conference on Hypertext and Social Media","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2309996.2310028","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 67

摘要

微博内容由于其固有的特点，对传统情感分析和分类方法的适用性提出了严峻的挑战。为了解决这些问题，我们引入了一种方法，该方法依赖于两个正交但互补的证据来源:由n图捕获的基于内容的特征和由极性比捕获的基于上下文的特征。两者都是语言中立和耐噪声的，保证了我们正在考虑的设置的高效率和鲁棒性。为了确保我们的方法可以集成到大量数据的实际应用中，我们还致力于提高其时间效率:我们提出了低提取成本的替代特征集，探索降维和离散化技术，并试验了多种分类算法。然后，我们通过从Twitter提取的大型真实数据集评估我们的方法，结果表明比传统技术有了显着改进。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Content vs. context for sentiment analysis: a comparative analysis over microblogs

Microblog content poses serious challenges to the applicability of traditional sentiment analysis and classification methods, due to its inherent characteristics. To tackle them, we introduce a method that relies on two orthogonal, but complementary sources of evidence: content-based features captured by n-gram graphs and context-based ones captured by polarity ratio. Both are language-neutral and noise-tolerant, guaranteeing high effectiveness and robustness in the settings we are considering. To ensure our approach can be integrated into practical applications with large volumes of data, we also aim at enhancing its time efficiency: we propose alternative sets of features with low extraction cost, explore dimensionality reduction and discretization techniques and experiment with multiple classification algorithms. We then evaluate our methods over a large, real-world data set extracted from Twitter, with the outcomes indicating significant improvements over the traditional techniques.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

HT ... : the proceedings of the ... ACM Conference on Hypertext and Social Media. ACM Conference on Hypertext and Social Media

自引率

0.00%

发文量