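Analysis of New Features on the Performance of the Support Vector Machine Algorithm in Classification of Natural Disaster Messages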

Khairisyah Yuliani Firlia, M. Reza Faisal, Dwi Kartini, Radityo Adi Nugroho, Friska Abadi
DOI: 10.1109/ic2ie53219.2021.9649107
Published in: 2021 4th International Conference of Computer and Informatics Engineering (IC2IE)
Publication date: 2021-09-14
URL: https://doi.org/10.1109/ic2ie53219.2021.9649107
Citations: 2

Abstract

Analysis of New Features on the Performance of the Support Vector Machine Algorithm in Classification of Natural Disaster Messages
When a natural disaster occurs, Twitter is one of the social media platforms people use to share their opinions. The classification of natural disaster messages on Twitter has been widely studied as a way to identify messages from direct eyewitnesses. Such messages are crucial because they can be used to determine the location and time of the incident. One of the essential parts of classifying natural disaster messages is feature extraction. The commonly used feature extraction technique is n-gram with TF-IDF weighting. In this research, we use structured data generated by n-gram and TF-IDF together with three additional new features: word count, the presence of images, and the presence of URLs in tweets. The classification method is a multiclass Support Vector Machine with a Gaussian Radial Basis Function kernel. The results are as follows: accuracy with the features generated by n-gram and TF-IDF is 75.43%, and accuracy after adding the three new features is 77.50%. These results indicate that the three new features can improve natural disaster message classification performance.
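The three additional features named in the abstract can be illustrated with a minimal sketch. The abstract does not say how they were computed, so everything below is an assumption: the function name `extra_features`, the rule that a `pic.twitter.com/` link signals an attached image, and the choice not to count links as words are all hypothetical.

```python
def extra_features(tweet):
    """Return [word_count, has_image, has_url] for one raw tweet.

    Assumptions (not stated in the abstract):
    - an attached image appears as a pic.twitter.com/ link in the text;
    - any other http:// or https:// token counts as a URL;
    - link tokens are not counted as words.
    """
    word_count = 0
    has_image = 0
    has_url = 0
    for token in tweet.split():
        lowered = token.lower()
        if "pic.twitter.com/" in lowered:
            has_image = 1
        elif lowered.startswith(("http://", "https://")):
            has_url = 1
        else:
            word_count += 1
    return [word_count, has_image, has_url]


def append_extra_features(tfidf_row, tweet):
    """Concatenate an existing n-gram TF-IDF row (a list of floats)
    with the three extra features, giving one combined feature vector."""
    return list(tfidf_row) + extra_features(tweet)


example = "Flood in my street right now https://t.co/abc pic.twitter.com/xyz"
features = extra_features(example)
combined = append_extra_features([0.0, 0.5], "Earthquake felt here")
```

A combined vector of this kind could then be passed to a multiclass SVM with an RBF kernel, for example scikit-learn's `SVC(kernel="rbf")`; the abstract does not specify the tooling, scaling, or hyperparameters the authors used.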