Towards an Organically Growing Hate Speech Dataset in Hate Speech Detection Systems in a Smart Mobility Application
Ahmad Alsamman, Andreas Schmitz, M. Wimmer
Proceedings of the 24th Annual International Conference on Digital Government Research, 2023-07-11
https://doi.org/10.1145/3598469.3598473
The automatic detection of hate speech online poses several challenges. A major one is that hate speech periodically changes its targets and its format. While the lack of available training data is a general issue in many natural language processing applications, this shifting nature amplifies the problem, especially given the difficulty of producing well-labelled datasets. Building on the concepts of quarantining hate speech and of integrating a linguistics expert into a smart mobility service provided in an administrative district in Germany, this paper proposes an approach that aims to improve the training dataset both quantitatively and qualitatively in a running smart mobility app, the SWIA app. This proactive approach provides a long-term solution for hate speech detection models that rely on labelled datasets for training. The paper also discusses the technical and practical challenges that this approach leaves unanswered.