Deep Learning Approaches for Multi-Label Incidents Classification from Twitter Textual Information

Sherly Rosa Anggraeni, Narandha Arya Ranggianto, I. Ghozali, C. Fatichah, D. Purwitasari
Journal of Information Systems Engineering and Business Intelligence, published 2022-04-26
DOI: 10.20473/jisebi.8.1.31-41 (https://doi.org/10.20473/jisebi.8.1.31-41)
Citations: 5

Abstract

Deep Learning Approaches for Multi-Label Incidents Classification from Twitter Textual Information
Background: Twitter is one of the most widely used social media platforms, with 310 million monthly active users and 500 million tweets per day. It is used not only to discuss trending topics but also to share information about accidents, fires, traffic jams, and similar incidents. People often find these updates useful for minimizing the impact of such incidents.
Objective: This study compares the effectiveness of three deep learning methods (CNN, RCNN, CLSTM) combined with NeuroNER in classifying multi-label incidents.
Methods: NeuroNER is paired with each of the deep learning classification methods (CNN, RCNN, CLSTM).
Results: CNN paired with NeuroNER yields the best results for multi-label classification, ahead of CLSTM and RCNN.
Conclusion: CNN proved the most effective, with an average precision of 88.54% for multi-label incident classification. This is because the classification input came from NER and took the form of entity labels, and CNN immediately picks out the important information, namely the NER labels. CLSTM produced the worst result because it is better suited to sequential data. Future research would benefit from varying the classification parameters and from test scenarios with a different number of labels and more diverse data.
Keywords: CLSTM, CNN, Incident Classification, Multi-label Classification, RCNN
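To make the evaluation setting concrete, here is a minimal sketch of how multi-label predictions can be scored. It assumes that "average precision" means per-label precision averaged over all labels (macro-averaging); the paper may define the metric differently. The label names, the 0.5 cutoff, the helper functions, and the toy data are all hypothetical and chosen only for illustration; none of them come from the study.

```python
from typing import List, Sequence

# Hypothetical label set; the abstract mentions accidents, fires and traffic jams as examples.
LABELS = ["accident", "fire", "traffic_jam"]


def to_multi_hot(tags: Sequence[str], labels: Sequence[str] = LABELS) -> List[int]:
    """Encode a set of incident tags as a 0/1 vector with one slot per label."""
    return [1 if label in tags else 0 for label in labels]


def threshold(scores: Sequence[float], cutoff: float = 0.5) -> List[int]:
    """Turn per-label scores (e.g. sigmoid outputs) into independent 0/1 decisions."""
    return [1 if s >= cutoff else 0 for s in scores]


def per_label_precision(y_true: List[List[int]], y_pred: List[List[int]]) -> List[float]:
    """Precision per label column: TP / (TP + FP), or 0.0 if the label was never predicted."""
    result = []
    for j in range(len(y_true[0])):
        tp = sum(1 for t, p in zip(y_true, y_pred) if p[j] == 1 and t[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if p[j] == 1 and t[j] == 0)
        result.append(tp / (tp + fp) if (tp + fp) else 0.0)
    return result


def macro_precision(y_true: List[List[int]], y_pred: List[List[int]]) -> float:
    """Unweighted mean of the per-label precisions (assumed meaning of 'average precision')."""
    precisions = per_label_precision(y_true, y_pred)
    return sum(precisions) / len(precisions)


# Toy data invented for illustration; a tweet may carry more than one incident label.
y_true = [
    to_multi_hot(["accident", "traffic_jam"]),
    to_multi_hot(["fire"]),
    to_multi_hot(["traffic_jam"]),
    to_multi_hot(["accident", "fire"]),
]
y_pred = [
    threshold([0.9, 0.1, 0.8]),
    threshold([0.2, 0.7, 0.6]),
    threshold([0.1, 0.3, 0.9]),
    threshold([0.8, 0.4, 0.2]),
]

print(round(macro_precision(y_true, y_pred), 4))  # 0.8889
```

Each label is decided independently of the others, which is what separates multi-label from single-label (multi-class) classification: one tweet can be flagged as both an accident and a traffic jam.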