Homophone-aware offensive language detection via semantic-phonetic collaboration

IF 7.5 1区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Expert Systems with Applications Pub Date : 2025-09-18 DOI:10.1016/j.eswa.2025.129756

Jiahao Hu, Shanliang Pan

{"title":"Homophone-aware offensive language detection via semantic-phonetic collaboration","authors":"Jiahao Hu, Shanliang Pan","doi":"10.1016/j.eswa.2025.129756","DOIUrl":null,"url":null,"abstract":"<div><div>The increasing use of implicit and obfuscated expressions poses significant challenges to offensive language detection in Chinese online platforms. In particular, users often exploit homophone substitutions to bypass keyword-based moderation, making traditional detection systems inadequate. This study addresses the problem of detecting offensive content masked through homophonic substitutions, which retain aggressive intent while altering character representations. Existing methods fall into two main categories: (1) semantic-only models, which struggle with phonetic manipulations due to their reliance on text features alone, and (2) auxiliary-enhanced models, which incorporate phonetic or syntactic signals but lack deep integration between modalities. To overcome these limitations, we propose a lightweight dual-branch model that separately encodes textual semantics and pinyin phonetics under a multi-view learning framework. A Dual-Branch Interactive Training strategy is introduced to enable dynamic cross-modal alignment via contrastive objectives, allowing each modality to mutually refine the other and enhance robustness to adversarial inputs. We conduct experiments on two benchmark datasets, COLD and SWSR, both of which are augmented with varying levels of homophone noise to simulate real-world evasion strategies. The proposed model outperforms all baseline models, achieving an average F1-score improvement of 6.3 % under high-noise conditions, while reducing inference latency and memory usage by more than 60 %, demonstrating both effectiveness and efficiency for real-time deployment. We will release the source code for further use by the community<span><span>https://github.com/hjhhlc/DBIT</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"298 ","pages":"Article 129756"},"PeriodicalIF":7.5000,"publicationDate":"2025-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Expert Systems with Applications","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0957417425033718","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

Abstract

The increasing use of implicit and obfuscated expressions poses significant challenges to offensive language detection in Chinese online platforms. In particular, users often exploit homophone substitutions to bypass keyword-based moderation, making traditional detection systems inadequate. This study addresses the problem of detecting offensive content masked through homophonic substitutions, which retain aggressive intent while altering character representations. Existing methods fall into two main categories: (1) semantic-only models, which struggle with phonetic manipulations due to their reliance on text features alone, and (2) auxiliary-enhanced models, which incorporate phonetic or syntactic signals but lack deep integration between modalities. To overcome these limitations, we propose a lightweight dual-branch model that separately encodes textual semantics and pinyin phonetics under a multi-view learning framework. A Dual-Branch Interactive Training strategy is introduced to enable dynamic cross-modal alignment via contrastive objectives, allowing each modality to mutually refine the other and enhance robustness to adversarial inputs. We conduct experiments on two benchmark datasets, COLD and SWSR, both of which are augmented with varying levels of homophone noise to simulate real-world evasion strategies. The proposed model outperforms all baseline models, achieving an average F1-score improvement of 6.3 % under high-noise conditions, while reducing inference latency and memory usage by more than 60 %, demonstrating both effectiveness and efficiency for real-time deployment. We will release the source code for further use by the communityhttps://github.com/hjhhlc/DBIT.

Abstract Image

查看原文本刊更多论文

基于语义-语音协同的同音感知攻击性语言检测

越来越多的隐式和模糊表达对中国网络平台的攻击性语言检测提出了重大挑战。特别是，用户经常利用同音字替换来绕过基于关键字的审核，使传统的检测系统不足。本研究解决了通过同音替换掩盖的攻击性内容的检测问题，这些替换在改变字符表示的同时保留了攻击性意图。现有的方法主要分为两大类：(1)纯语义模型，由于其仅依赖文本特征而难以处理语音操作；(2)辅助增强模型，它包含语音或句法信号，但缺乏模态之间的深度集成。为了克服这些限制，我们提出了一个轻量级的双分支模型，该模型在多视图学习框架下分别编码文本语义和拼音语音。引入了双分支交互式训练策略，通过对比目标实现动态跨模态对齐，允许每个模态相互改进并增强对抗性输入的鲁棒性。我们在两个基准数据集上进行了实验，COLD和SWSR，这两个数据集都增加了不同程度的同音噪声来模拟现实世界的逃避策略。所提出的模型优于所有基线模型，在高噪声条件下平均f1分数提高6.3%，同时减少推理延迟和内存使用超过60%，证明了实时部署的有效性和效率。我们将发布源代码供社区进一步使用https://github.com/hjhhlc/DBIT。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Expert Systems with Applications 工程技术-工程：电子与电气

CiteScore

13.80

自引率

10.60%

发文量

2045

审稿时长

8.7 months

期刊介绍： Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.