Title: Classification of Abusive Thai Language Content in Social Media Using Deep Learning
Authors: Ruangsung Wanasukapunt, Suphakant Phimoltares
Venue: 2021 18th International Joint Conference on Computer Science and Software Engineering (JCSSE)
Published: 2021-06-30
DOI: 10.1109/JCSSE53117.2021.9493829 (https://doi.org/10.1109/JCSSE53117.2021.9493829)
Citation count: 0
Abstract
This paper presents binomial and multinomial models for classifying abusive Thai-language speech in social media. Whereas previous research on this task focused on traditional machine learning models for binomial classification, we show that deep learning models perform better. Our binomial and multinomial deep learning models achieved F1 scores of 0.8510 and 0.9067, respectively, significantly better than the traditional machine learning models' respective best F1 scores of 0.7452 and 0.8090. Although the bidirectional LSTM performed well, DistilBERT achieved higher accuracy and recall. Its recall advantage was especially pronounced for the “figurative” class, in which certain words are more likely to take different meanings depending on context.
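For readers unfamiliar with the metrics quoted above, the following is a minimal sketch of how per-class recall and F1 can be computed for a multinomial classifier. It is not the authors' evaluation code: the class names other than "figurative", the toy labels, and the helper names `per_class_scores` and `macro_f1` are all hypothetical, and the abstract does not state whether the reported F1 scores are macro-averaged, micro-averaged, or weighted.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gets a false positive
            fn[truth] += 1   # true class gets a false negative

    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_f1(scores):
    """Unweighted mean of the per-class F1 scores (one possible averaging)."""
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)


# Hypothetical toy data; only "figurative" is a class named in the abstract.
y_true = ["figurative", "figurative", "other_abusive",
          "non_abusive", "other_abusive", "non_abusive"]
y_pred = ["figurative", "other_abusive", "other_abusive",
          "non_abusive", "other_abusive", "figurative"]

scores = per_class_scores(y_true, y_pred)
for label, (precision, recall, f1) in scores.items():
    print(f"{label}: precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
print(f"macro F1: {macro_f1(scores):.3f}")
```

Reporting recall per class, as in this sketch, is what makes a statement such as "recall was higher for the figurative class" checkable, because a single aggregate score would hide differences between classes.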