Verbal Abuse Classification Using Multiple Deep Neural Networks

Hyunju Park, Hong Kook Kim
DOI: 10.1109/ICAIIC51459.2021.9415218
Published in: 2021 International Conference on Artificial Intelligence in Information and Communication (ICAIIC)
Publication date: 2021-04-13
URL: https://doi.org/10.1109/ICAIIC51459.2021.9415218
Citations: 1

Abstract

People can be exposed to verbal abuse practically anywhere, and it is considered one of the serious issues in society. In this paper, we describe a method to classify verbal abuse into five classes by adding a convolutional neural network (CNN), a long short-term memory (LSTM) network, and a dense layer on top of bidirectional encoder representations from transformers (BERT). The data are collected from Korean dramas, movies, and YouTube. Because the data are imbalanced, a weighted random sampler and data augmentation are used so that the trained models generalize. Experiments show that BERT with a CNN, trained after data augmentation, achieves the highest accuracy among all the compared methods.
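The abstract names a weighted random sampler as one remedy for class imbalance but gives no details. The following is a minimal sketch of the general idea, not the authors' implementation. It assumes inverse-class-frequency weights, and the function names and toy data are hypothetical. PyTorch's `torch.utils.data.WeightedRandomSampler` offers this kind of sampling for real training loops; the sketch uses only the standard library to stay self-contained.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one sampling weight per example: 1 / (size of its class).

    Assumption: inverse class frequency; the paper's weighting is not stated.
    """
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


def weighted_sample(examples, labels, num_samples, seed=0):
    """Draw (example, label) pairs with replacement using per-example weights."""
    rng = random.Random(seed)
    weights = inverse_frequency_weights(labels)
    indices = rng.choices(range(len(examples)), weights=weights, k=num_samples)
    return [(examples[i], labels[i]) for i in indices]


# Hypothetical toy data: class 0 is nine times more frequent than class 1.
labels = [0] * 90 + [1] * 10
examples = [f"utterance_{i}" for i in range(len(labels))]

batch = weighted_sample(examples, labels, num_samples=1000)
drawn = Counter(label for _, label in batch)
print(drawn)  # both classes should appear in roughly equal numbers
```

With these weights every class carries the same total sampling mass, so a minority class is drawn about as often as a majority class, at the cost of repeating minority examples.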