Fortifying NLP models against poisoning attacks: The power of personalized prediction architectures

IF 14.7 | Zone 1 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Teddy Ferdinan, Jan Kocoń
DOI: 10.1016/j.inffus.2024.102692
Journal: Information Fusion
Published: 2024-09-19 (Journal Article)
URL: https://www.sciencedirect.com/science/article/pii/S1566253524004706
Citations: 0

Abstract

In Natural Language Processing (NLP), state-of-the-art machine learning models heavily depend on vast amounts of training data. Often, this data is sourced from third parties, such as crowdsourcing platforms, to enable swift and efficient annotation collection for supervised learning. Yet, such an approach is susceptible to poisoning attacks where malicious agents deliberately insert harmful data to skew the resulting model behavior. Current countermeasures to these attacks either come at a significant cost, lack full efficacy, or are simply non-applicable. This study introduces and evaluates the potential of personalized model architectures as a defense against these threats. By comparing two top-performing personalized model architectures, User-ID and HuBi-Medium, against a standard non-personalized baseline across two NLP tasks and various simulated attack scenarios, we found that the personalized model architectures significantly outperformed the baseline. The robustness advantage increased with the rise in malicious annotations. Notably, the User-ID model excelled in safeguarding predictions for legitimate users from the influence of malicious annotations. Our findings emphasize the benefit of adopting personalized model architectures to bolster NLP system defenses against poisoning attacks.
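
The abstract does not say how the attacks were simulated or how the User-ID and HuBi-Medium architectures are built. The sketch below is therefore only an illustration under stated assumptions: the attack is modeled as label flipping by a randomly chosen subset of annotators on a binary task, and personalization is modeled as prepending a per-annotator token to the input text. The names `Annotation`, `poison`, `baseline_input`, and `personalized_input` are hypothetical and do not come from the paper.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Annotation:
    user_id: int   # annotator who produced the label
    text: str      # annotated text
    label: int     # binary label, 0 or 1 (assumption: binary task)


def poison(annotations, malicious_fraction, seed=0):
    """Flip every label given by a random subset of annotators.

    Assumption: a simple label-flipping attack; the paper's actual
    attack scenarios may differ.
    """
    rng = random.Random(seed)
    users = sorted({a.user_id for a in annotations})
    n_bad = round(malicious_fraction * len(users))
    bad = set(rng.sample(users, n_bad))
    poisoned = [
        replace(a, label=1 - a.label) if a.user_id in bad else a
        for a in annotations
    ]
    return poisoned, bad


def baseline_input(a):
    """Non-personalized input: the model sees only the text."""
    return a.text


def personalized_input(a):
    """Personalized input (assumed format): an annotator token precedes
    the text, so a model can learn annotator-specific behavior."""
    return f"[USER_{a.user_id}] {a.text}"
```

With inputs built this way, the same classifier could be trained once on `baseline_input` and once on `personalized_input` over the poisoned annotations, then evaluated only on annotations from users outside the returned `bad` set. In the baseline setting the flipped labels are indistinguishable from legitimate ones, whereas the annotator token gives a model the chance to attribute them to specific users.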

Source journal: Information Fusion
Category: Engineering and Technology, Computer Science: Theory and Methods
CiteScore: 33.20
Self-citation rate: 4.30%
Articles published: 161
Review time: 7.9 months
Journal description: Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.