Title: Ensuring safety in digital spaces: Detecting code-mixed hate speech in social media posts
Authors: Pradeep Kumar Roy, Abhinav Kumar
Journal: Data & Knowledge Engineering, volume 156, Article 102409
Publication date: 2025-01-18
Publication type: Journal Article
DOI: 10.1016/j.datak.2025.102409
URL: https://www.sciencedirect.com/science/article/pii/S0169023X25000047
Impact factor: 2.7 (JCR Q3, Computer Science, Artificial Intelligence)
Citation count: 0
Abstract
Social networks strive to offer positive content to users, yet a considerable amount of inappropriate material, such as rumors, fake news, and hate speech, persists. Despite significant efforts to detect and prevent hate speech early, it remains widespread because of issues such as misspellings and mixed-language posts. To address these challenges, this research uses CNN, LSTM, and BERT models to develop an automated system for detecting hate speech in Telugu-English code-mixed posts. It also evaluates the effectiveness of data translation and transliteration approaches for detecting hate speech in code-mixed text. Results indicate that the transliteration approach achieves the highest accuracy, 75%, surpassing raw and translated data by 1% and 3%, respectively. The proposed system may effectively reduce hate speech and offensive content on social media platforms, resulting in an enhanced user experience. From a managerial perspective, this research presents numerous benefits, such as improved content moderation, optimized resource allocation, data-driven decision-making, enhanced user satisfaction, strengthened reputation management, and greater scalability. These advancements underscore the potential of advanced technologies to address complex challenges in social media management.
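The comparison described in the abstract (the same kind of model evaluated on different versions of the input text) can be illustrated with a small evaluation harness. The following is only a sketch under stated assumptions, not the authors' pipeline: the token-matching classifier stands in for CNN, LSTM, or BERT; the `normalize` function with its hypothetical `SPELLING_MAP` is a crude stand-in for transliteration; and the posts are invented placeholders, so the resulting numbers say nothing about the paper's results.

```python
from typing import Callable, Dict, List, Sequence

Preprocess = Callable[[str], str]
Classifier = Callable[[str], int]  # 1 = hate, 0 = not hate

# Hypothetical lookup table: a stand-in for a real transliteration step.
SPELLING_MAP = {"bdword": "badword", "baadword": "badword"}


def identity(text: str) -> str:
    """Raw variant: leave the post unchanged."""
    return text


def normalize(text: str) -> str:
    """Map known spelling variants to one canonical token."""
    return " ".join(SPELLING_MAP.get(tok, tok) for tok in text.lower().split())


def train_token_baseline(texts: Sequence[str], labels: Sequence[int]) -> Classifier:
    """Toy model: flag tokens that appear only in posts labeled as hate."""
    hate_tokens, clean_tokens = set(), set()
    for text, label in zip(texts, labels):
        target = hate_tokens if label == 1 else clean_tokens
        target.update(text.lower().split())
    flagged = hate_tokens - clean_tokens

    def classify(text: str) -> int:
        return int(any(tok in flagged for tok in text.lower().split()))

    return classify


def accuracy(predicted: Sequence[int], gold: Sequence[int]) -> float:
    """Fraction of predictions that match the gold labels."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("predicted and gold must be non-empty and equal length")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


def compare_variants(
    train_posts: List[str], train_labels: List[int],
    test_posts: List[str], test_labels: List[int],
    variants: Dict[str, Preprocess],
) -> Dict[str, float]:
    """Train and score one model per input variant on the same split."""
    scores = {}
    for name, prep in variants.items():
        model = train_token_baseline([prep(p) for p in train_posts], train_labels)
        preds = [model(prep(p)) for p in test_posts]
        scores[name] = accuracy(preds, test_labels)
    return scores


# Invented placeholder posts; "badword" marks an offensive term.
TRAIN_POSTS = ["you are a badword", "you are a good friend",
               "such a badword", "such a nice day"]
TRAIN_LABELS = [1, 0, 1, 0]
TEST_POSTS = ["what a bdword", "have a nice day", "you badword",
              "good day friend", "baadword everywhere"]
TEST_LABELS = [1, 0, 1, 0, 1]
VARIANTS = {"raw": identity, "normalized": normalize}

if __name__ == "__main__":
    print(compare_variants(TRAIN_POSTS, TRAIN_LABELS,
                           TEST_POSTS, TEST_LABELS, VARIANTS))
```

Keeping the split and the training routine fixed while varying only the preprocessing function means any difference in accuracy can be attributed to the input representation. The same structure would apply if a translation function were added as a third entry in `VARIANTS`.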
About the journal:
Data & Knowledge Engineering (DKE) stimulates the exchange of ideas and interaction between these two related fields of interest. DKE reaches a world-wide audience of researchers, designers, managers and users. The major aim of the journal is to identify, investigate and analyze the underlying principles in the design and effective use of these systems.