Three hybrid classifiers for the detection of emotions in suicide notes.

Biomedical Informatics Insights, vol. 5 (Suppl. 1), pp. 175-84. Pub Date: 2012-01-01. Epub Date: 2012-01-30. DOI: 10.4137/BII.S8967
Maria Liakata, Jee-Hyub Kim, Shyamasree Saha, Janna Hastings, Dietrich Rebholz-Schuhmann
Citations: 25

Abstract

We describe our approach to building a system that detects emotions in suicide notes. Motivated by the sparse and imbalanced data as well as the complex annotation scheme, we considered three hybrid approaches for distinguishing between the different categories. Each approach combines machine learning with manually derived rules, where the rules target the very sparse emotion categories. The first approach treats the task as single-label multi-class classification: an SVM and a CRF classifier are each trained to recognise fifteen different categories, and their results are combined. The second approach trains individual binary classifiers (SVM and CRF) for each of the fifteen sentence categories and returns the union of the classifiers' outputs as the final result. The third approach combines binary and multi-class classifiers (SVM and CRF) trained on different subsets of the training data. We considered a number of different feature configurations. All three systems were tested on 300 unseen messages. The second system performed best of the three, yielding an F1 score of 45.6% and a precision of 60.1%, whereas our best recall (43.6%) was obtained with the third system.
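The "union of the classifiers" step in the second approach, and the relation between the reported precision, recall and F1 figures, can be illustrated with a short sketch. This is not the authors' implementation: the label names, the toy predictions, the function names `union_labels`, `micro_prf` and `recall_from_f1`, and the choice of micro-averaging are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import List, Set, Tuple


def union_labels(svm: List[Set[str]], crf: List[Set[str]]) -> List[Set[str]]:
    """Per sentence, keep every label that either classifier assigned."""
    return [s | c for s, c in zip(svm, crf)]


def micro_prf(gold: List[Set[str]], pred: List[Set[str]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F1 over (sentence, label) pairs.

    Micro-averaging is an assumption here; the abstract does not say
    which averaging scheme was used.
    """
    tp = sum(len(g & p) for g, p in zip(gold, pred))
    n_pred = sum(len(p) for p in pred)
    n_gold = sum(len(g) for g in gold)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def recall_from_f1(f1: float, precision: float) -> float:
    """Solve F1 = 2PR / (P + R) for R, given F1 and P."""
    return f1 * precision / (2 * precision - f1)


# Hypothetical labels and predictions for three sentences.
gold = [{"hopelessness"}, {"love", "instructions"}, set()]
svm = [{"hopelessness"}, {"love"}, set()]
crf = [{"hopelessness", "guilt"}, {"instructions"}, set()]

combined = union_labels(svm, crf)
precision, recall, f1 = micro_prf(gold, combined)

# Recall implied by the second system's reported F1 and precision.
implied_recall = recall_from_f1(0.456, 0.601)
```

In this toy case the union recovers a label that each classifier alone missed, at the cost of one spurious label, which is the usual trade-off of taking a union. Applying `recall_from_f1` to the reported F1 of 45.6% and precision of 60.1% gives a recall of roughly 36.7% for the second system, which is consistent with the third system holding the best recall at 43.6%.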

