{"title":"作为强大对手的扩散模型","authors":"Xuelong Dai;Yanjie Li;Mingxing Duan;Bin Xiao","doi":"10.1109/TIP.2024.3514361","DOIUrl":null,"url":null,"abstract":"Diffusion models have demonstrated their great ability to generate high-quality images for various tasks. With such a strong performance, diffusion models can potentially pose a severe threat to both humans and deep learning models, e.g., DNNs and MLLMs. However, their abilities as adversaries have not been well explored. Among different adversarial scenarios, the no-box adversarial attack is the most practical one, as it assumes that the attacker has no access to the training dataset or the target model. Existing works still require some data from the training dataset, which may not be feasible in real-world scenarios. In this paper, we investigate the adversarial capabilities of diffusion models by conducting no-box attacks solely using data generated by diffusion models. Specifically, our attack method generates a synthetic dataset using diffusion models to train a substitute model. We then employ a classification diffusion model to fine-tune the substitute model, considering model uncertainty and incorporating noise augmentation. Finally, we sample adversarial examples from the diffusion models using the average approximation over the diffusion substitute model with multiple inferences. Extensive experiments on the ImageNet dataset demonstrate that the proposed attack method achieves state-of-the-art performance in both no-box attack and black-box attack scenarios.","PeriodicalId":94032,"journal":{"name":"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society","volume":"33 ","pages":"6734-6747"},"PeriodicalIF":0.0000,"publicationDate":"2024-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Diffusion Models as Strong Adversaries\",\"authors\":\"Xuelong Dai;Yanjie Li;Mingxing Duan;Bin Xiao\",\"doi\":\"10.1109/TIP.2024.3514361\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Diffusion models have demonstrated their great ability to generate high-quality images for various tasks. With such a strong performance, diffusion models can potentially pose a severe threat to both humans and deep learning models, e.g., DNNs and MLLMs. However, their abilities as adversaries have not been well explored. Among different adversarial scenarios, the no-box adversarial attack is the most practical one, as it assumes that the attacker has no access to the training dataset or the target model. Existing works still require some data from the training dataset, which may not be feasible in real-world scenarios. In this paper, we investigate the adversarial capabilities of diffusion models by conducting no-box attacks solely using data generated by diffusion models. Specifically, our attack method generates a synthetic dataset using diffusion models to train a substitute model. We then employ a classification diffusion model to fine-tune the substitute model, considering model uncertainty and incorporating noise augmentation. Finally, we sample adversarial examples from the diffusion models using the average approximation over the diffusion substitute model with multiple inferences. 
Extensive experiments on the ImageNet dataset demonstrate that the proposed attack method achieves state-of-the-art performance in both no-box attack and black-box attack scenarios.\",\"PeriodicalId\":94032,\"journal\":{\"name\":\"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society\",\"volume\":\"33 \",\"pages\":\"6734-6747\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-12-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10804100/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10804100/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract: Diffusion models have demonstrated a remarkable ability to generate high-quality images for various tasks. With such strong generative performance, diffusion models can potentially pose a severe threat to both humans and deep learning models, e.g., DNNs and MLLMs. However, their abilities as adversaries have not been well explored. Among different adversarial scenarios, the no-box adversarial attack is the most practical one, as it assumes that the attacker has no access to the training dataset or the target model. Existing works still require some data from the training dataset, which may not be feasible in real-world scenarios. In this paper, we investigate the adversarial capabilities of diffusion models by conducting no-box attacks solely with data generated by diffusion models. Specifically, our attack method generates a synthetic dataset using diffusion models to train a substitute model. We then employ a classification diffusion model to fine-tune the substitute model, accounting for model uncertainty and incorporating noise augmentation. Finally, we sample adversarial examples from the diffusion models using an average approximation over the diffusion substitute model with multiple inferences. Extensive experiments on the ImageNet dataset demonstrate that the proposed attack method achieves state-of-the-art performance in both no-box and black-box attack scenarios.
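The abstract outlines a three-stage pipeline: train a substitute classifier on diffusion-generated data, fine-tune it with noise augmentation while accounting for model uncertainty, and craft adversarial examples by averaging over multiple inferences. The PyTorch sketch below only illustrates that pipeline under stated assumptions; the function names, hyper-parameters, the Gaussian-noise stand-in for classification-diffusion fine-tuning, and the PGD-style update are placeholders of mine, not the authors' released implementation.

```python
# Minimal sketch of a no-box attack pipeline of the kind described in the abstract.
# Assumptions: `substitute` is any image classifier (torch.nn.Module) and
# `synthetic_loader` yields (image, label) batches produced by a conditional
# diffusion model. Hyper-parameters are illustrative, not the paper's.

import torch
import torch.nn.functional as F


def train_substitute(substitute, synthetic_loader, epochs=10, lr=1e-3, device="cuda"):
    """Stage 1: train the substitute model on diffusion-generated images."""
    substitute.to(device).train()
    opt = torch.optim.Adam(substitute.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in synthetic_loader:
            x, y = x.to(device), y.to(device)
            loss = F.cross_entropy(substitute(x), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return substitute


def finetune_with_noise(substitute, synthetic_loader, sigma=0.1, epochs=3, lr=1e-4, device="cuda"):
    """Stage 2 (simplified): fine-tune with Gaussian noise augmentation as a
    stand-in for the paper's classification-diffusion fine-tuning."""
    substitute.train()
    opt = torch.optim.Adam(substitute.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in synthetic_loader:
            x, y = x.to(device), y.to(device)
            x_noisy = x + sigma * torch.randn_like(x)   # noise augmentation
            loss = F.cross_entropy(substitute(x_noisy), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return substitute


def craft_adversarial(substitute, x, y, eps=8 / 255, steps=10, n_infer=5, sigma=0.05):
    """Stage 3 (simplified): PGD-style attack whose gradient is averaged over
    several noisy forward passes, conveying the multi-inference average
    approximation over the substitute model."""
    substitute.eval()
    x_adv = x.clone().detach()
    alpha = eps / steps
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = 0.0
        for _ in range(n_infer):                        # average over inferences
            x_noisy = x_adv + sigma * torch.randn_like(x_adv)
            loss = loss + F.cross_entropy(substitute(x_noisy), y)
        grad, = torch.autograd.grad(loss / n_infer, x_adv)
        with torch.no_grad():                           # L_inf projection step
            x_adv = x_adv + alpha * grad.sign()
            x_adv = torch.clamp(x + torch.clamp(x_adv - x, -eps, eps), 0.0, 1.0)
    return x_adv.detach()
```

Note that in the paper the adversarial examples are sampled from the diffusion models themselves rather than via pixel-space PGD; the sketch only conveys the substitute-training and multi-inference averaging ideas.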