Empowering hate speech detection: leveraging linguistic richness and deep learning
Authors: Gde Bagus, Janardana Abasan, Erwin Budi Setiawan
Journal: Bulletin of Electrical Engineering and Informatics
Publication date: 2024-04-01
DOI: https://doi.org/10.11591/eei.v13i2.6938
Social media has become a vital part of modern personal life, and Twitter is one of the platforms that emerged from advances in communication technology. Many social media platforms give users the freedom to express themselves, and some users misuse this freedom to spread hate speech, so a system that detects hate speech intelligently is needed. This study uses hybrid deep learning (HDL) and solo deep learning (SDL) approaches built on the convolutional neural network (CNN) and bidirectional gated recurrent unit (Bi-GRU) algorithms. Four models are built: CNN, Bi-GRU, CNN+Bi-GRU, and Bi-GRU+CNN. Term frequency-inverse document frequency (TF-IDF) is used for feature extraction, to obtain the linguistic features to be analyzed. FastText is used for feature expansion to minimize vocabulary mismatch. Four scenarios are run. CNN achieves an accuracy of 87.63%, Bi-GRU 87.46%, CNN+Bi-GRU 87.47%, and Bi-GRU+CNN 87.34%. These results indicate that the approach is capable of understanding context. HDL outperforms SDL with respect to n-gram type: HDL can handle sentences broken down by a hybrid n-gram scheme, namely Unigram-Bigram-Trigram, which is a complex n-gram combination.
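To make the feature-extraction step concrete, the following is a minimal sketch of TF-IDF computed over a hybrid Unigram-Bigram-Trigram vocabulary. It is an illustration only, not the authors' implementation: the whitespace tokenizer, the unsmoothed `log(N / df)` form of IDF, the function names `ngrams` and `tfidf`, and the toy sentences are all assumptions, since the abstract does not specify the preprocessing or the weighting variant.

```python
import math
from collections import Counter


def ngrams(tokens, n_min=1, n_max=3):
    """Return every n-gram (joined by spaces) for n_min <= n <= n_max."""
    out = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            out.append(" ".join(tokens[i:i + n]))
    return out


def tfidf(docs, n_min=1, n_max=3):
    """Return one {term: weight} dict per document.

    Assumed weighting (textbook form, not taken from the paper):
      tf  = term count / number of n-grams in the document
      idf = log(number of documents / documents containing the term)
    """
    # Assumed tokenizer: lowercase and split on whitespace.
    term_lists = [ngrams(d.lower().split(), n_min, n_max) for d in docs]

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for terms in term_lists:
        df.update(set(terms))

    n_docs = len(docs)
    vectors = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms)
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors


# Hypothetical toy corpus, for illustration only.
docs = ["you are so kind", "you are so awful", "so kind of you"]
vectors = tfidf(docs)
```

With this weighting, a term that occurs in every document (here `so`) receives weight zero, while a trigram unique to one document (here `are so kind`) receives the highest IDF. The hybrid n-gram vocabulary therefore lets multi-word phrases contribute features that single words cannot.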
About the journal:
Bulletin of Electrical Engineering and Informatics publishes original papers in the fields of electrical, computer, and informatics engineering, covering, but not limited to, the following scope: Computer Science, Computer Engineering and Informatics[...] Electronics[...] Electrical and Power Engineering[...] Telecommunication and Information Technology[...] Instrumentation and Control Engineering[...]