Khairisyah Yuliani Firlia, M. Reza Faisal, Dwi Kartini, Radityo Adi Nugroho, Friska Abadi
2021 4th International Conference of Computer and Informatics Engineering (IC2IE), published 2021-09-14
DOI: 10.1109/ic2ie53219.2021.9649107 (https://doi.org/10.1109/ic2ie53219.2021.9649107)
Analysis of New Features on the Performance of the Support Vector Machine Algorithm in Classification of Natural Disaster Messages
When a natural disaster occurs, Twitter is one of the social media platforms people use to share their opinions. The classification of natural disaster messages on Twitter has been widely used in research to identify messages from direct eyewitnesses. Such messages are crucial because they can be used to determine the location and time of the incident. One of the essential parts of classifying natural disaster messages is feature extraction. The feature extraction technique commonly used is n-gram with TF-IDF weighting. In this research, we use structured data generated by n-gram and TF-IDF together with three additional new features: word count, the presence of images, and the presence of URLs in tweets. The classification method used is a multiclass Support Vector Machine with a Gaussian radial basis function (RBF) kernel. The results are as follows: using only the features generated by n-gram and TF-IDF, the accuracy is 75.43%; with the three new features added, the accuracy is 77.50%. These results indicate that the three new features can improve natural disaster message classification performance.
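The feature construction described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the regular expressions, the unigram-plus-bigram range, the plain `tf * log(N / df)` weighting, and the rule that an attached image appears as a `pic.twitter.com` link are all assumptions made for the example. The Gaussian RBF kernel is shown only as the similarity function an SVM would apply to the combined vectors; no classifier is trained here.

```python
import math
import re
from collections import Counter

# Assumption: attached images appear in the tweet text as pic.twitter.com links.
IMAGE_RE = re.compile(r"(?:https?://)?pic\.twitter\.com/\S+")
URL_RE = re.compile(r"https?://\S+")


def strip_links(tweet):
    """Remove image links and other URLs so they are not counted as words."""
    return URL_RE.sub(" ", IMAGE_RE.sub(" ", tweet))


def extra_features(tweet):
    """Return [word_count, has_image, has_url] for one tweet."""
    has_image = 1.0 if IMAGE_RE.search(tweet) else 0.0
    # Look for ordinary URLs only after image links have been removed.
    has_url = 1.0 if URL_RE.search(IMAGE_RE.sub(" ", tweet)) else 0.0
    word_count = float(len(strip_links(tweet).split()))
    return [word_count, has_image, has_url]


def ngrams(tokens, n_max=2):
    """All word n-grams of length 1..n_max (hypothetical choice: up to bigrams)."""
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def tfidf_vectors(tweets, n_max=2):
    """Textbook TF-IDF: term count times log(N / document frequency)."""
    docs = [ngrams(re.findall(r"[a-z0-9']+", strip_links(t).lower()), n_max)
            for t in tweets]
    df = Counter(term for doc in docs for term in set(doc))
    vocab = sorted(df)
    n_docs = len(docs)
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append([tf[t] * math.log(n_docs / df[t]) for t in vocab])
    return vocab, vectors


def build_features(tweets):
    """Concatenate TF-IDF n-gram weights with the three extra features."""
    vocab, vectors = tfidf_vectors(tweets)
    rows = [vec + extra_features(t) for vec, t in zip(vectors, tweets)]
    return vocab, rows


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian RBF kernel: exp(-gamma * squared Euclidean distance)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))
```

Calling `build_features` on a list of tweet strings yields one row per tweet whose last three entries are the added features. In practice the word count would likely need scaling before being mixed with TF-IDF weights, since an RBF kernel is sensitive to feature magnitude; the abstract does not say how this was handled.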