Geigh Zollicoffer, Minh Vu, Ben Nebgen, Juan Castorena, Boian Alexandrov, Manish Bhattarai
arXiv - CS - Machine Learning, 2024-09-12. DOI: arxiv-2409.08255 (https://doi.org/arxiv-2409.08255)
LoRID: Low-Rank Iterative Diffusion for Adversarial Purification
This work presents an information-theoretic examination of diffusion-based
purification methods, the state-of-the-art adversarial defenses that utilize
diffusion models to remove malicious perturbations in adversarial examples. By
theoretically characterizing the inherent purification errors associated with
Markov-based diffusion purification, we introduce LoRID, a novel Low-Rank
Iterative Diffusion purification method designed to remove adversarial
perturbations with low intrinsic purification error. LoRID centers around a
multi-stage purification process that leverages multiple rounds of
diffusion-denoising loops at the early time-steps of the diffusion models, and
the integration of Tucker decomposition, an extension of matrix factorization,
to remove adversarial noise in high-noise regimes. Consequently, LoRID
increases the effective diffusion time-steps and overcomes strong adversarial
attacks, achieving superior robustness on the CIFAR-10/100, CelebA-HQ,
and ImageNet datasets under both white-box and black-box settings.
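
The two ingredients named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The low-rank step uses a truncated higher-order SVD, which is one simple way to obtain a Tucker approximation and may differ from the algorithm used in the paper. The loop applies the standard DDPM forward-noising formula at a fixed early time-step and then calls `denoise_fn`, a hypothetical placeholder for a pretrained diffusion model's reverse process. All function names, the `ranks` argument, and the way the stages would be combined are assumptions made for illustration.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-n unfolding: bring axis `mode` to the front, flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_product(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` along axis `mode`."""
    moved = np.moveaxis(tensor, mode, 0)
    out = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(out, 0, mode)


def tucker_lowrank(tensor, ranks):
    """Low-rank Tucker approximation via truncated HOSVD (assumed variant)."""
    # One orthonormal factor per mode: leading left singular vectors
    # of the corresponding unfolding.
    factors = []
    for mode, rank in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :rank])
    # Project onto the factors to get the core tensor ...
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)
    # ... then expand the core back to the original shape.
    approx = core
    for mode, u in enumerate(factors):
        approx = mode_product(approx, u, mode)
    return approx


def forward_diffuse(x, alpha_bar_t, rng):
    """Standard DDPM forward step: sqrt(a) * x + sqrt(1 - a) * noise."""
    noise = rng.standard_normal(x.shape)
    return np.sqrt(alpha_bar_t) * x + np.sqrt(1.0 - alpha_bar_t) * noise


def iterative_purify(x, denoise_fn, alpha_bar_t, rounds, rng):
    """Repeat a diffuse-then-denoise loop `rounds` times.

    `denoise_fn(x_t, alpha_bar_t)` is a hypothetical stand-in for the
    reverse process of a pretrained diffusion model.
    """
    for _ in range(rounds):
        x = denoise_fn(forward_diffuse(x, alpha_bar_t, rng), alpha_bar_t)
    return x
```

With an image stored as a height x width x channel array, `tucker_lowrank(image, (16, 16, 3))` would keep 16 components along each spatial mode and all three channel components. An `alpha_bar_t` close to 1 corresponds to an early, low-noise time-step, so each round of `iterative_purify` perturbs the input only mildly before denoising.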