A Privacy-Preserving Walk in the Latent Space of Generative Models for Medical Applications

Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention Pub Date : 2023-07-06 DOI:10.48550/arXiv.2307.02984

M. Pennisi, Federica Proietto Salanitri, G. Bellitto, S. Palazzo, Ulas Bagci, C. Spampinato


Citations: 0

Abstract

Generative Adversarial Networks (GANs) have demonstrated their ability to generate synthetic samples that match a target distribution. However, from a privacy perspective, using GANs as a proxy for data sharing is not a safe solution, as they tend to embed near-duplicates of real samples in the latent space. Recent works, inspired by k-anonymity principles, address this issue through sample aggregation in the latent space, with the drawback of reducing the dataset by a factor of k. Our work aims to mitigate this problem by proposing a latent space navigation strategy that generates diverse synthetic samples able to support effective training of deep models, while addressing privacy concerns in a principled way. Our approach uses an auxiliary identity classifier as a guide to walk non-linearly between points in the latent space, minimizing the risk of collision with near-duplicates of real samples. We empirically demonstrate that, given any random pair of points in the latent space, our walking strategy is safer than linear interpolation. We then test our path-finding strategy combined with k-same methods and demonstrate, on two benchmarks for tuberculosis and diabetic retinopathy classification, that training a model on samples generated by our approach mitigates drops in performance while preserving privacy.
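The two ideas in the abstract, k-same aggregation and a non-linear walk that avoids near-duplicates of real samples, can be illustrated with a toy NumPy sketch. This is not the authors' method: the paper guides the walk with an auxiliary identity classifier, whereas this sketch substitutes a simple distance-based repulsion from known real-sample latents. All function names, the `radius` parameter, and the Euclidean distance criterion are assumptions made for illustration only.

```python
import numpy as np


def k_same_aggregate(latents, k):
    """Average consecutive groups of k latent codes (k-same style).

    n input codes yield n // k outputs, which is the dataset
    reduction by a factor of k mentioned in the abstract.
    """
    n = (len(latents) // k) * k
    return latents[:n].reshape(-1, k, latents.shape[1]).mean(axis=1)


def linear_walk(z_start, z_end, steps):
    """Baseline: evenly spaced points on the straight line between two codes."""
    t = np.linspace(0.0, 1.0, steps)[:, None]
    return (1.0 - t) * z_start + t * z_end


def min_dist_to_real(points, real_latents):
    """Smallest Euclidean distance from each point to any real-sample latent."""
    diffs = points[:, None, :] - real_latents[None, :, :]
    return np.linalg.norm(diffs, axis=-1).min(axis=1)


def guided_walk(z_start, z_end, real_latents, steps, radius, n_passes=10):
    """Toy non-linear walk (assumption, NOT the paper's classifier guidance).

    Starts from the linear path, then pushes every interior waypoint that
    lies within `radius` of its nearest real latent out to that radius.
    With several nearby real latents the passes may not fully converge.
    """
    path = linear_walk(z_start, z_end, steps)
    for _ in range(n_passes):
        for i in range(1, steps - 1):  # endpoints stay fixed
            offsets = path[i] - real_latents
            dists = np.linalg.norm(offsets, axis=1)
            j = int(np.argmin(dists))
            if dists[j] < radius:
                if dists[j] < 1e-12:
                    # Waypoint sits on the real latent: pick an arbitrary axis.
                    direction = np.eye(path.shape[1])[0]
                else:
                    direction = offsets[j] / dists[j]
                path[i] = real_latents[j] + radius * direction
    return path
```

Comparing `min_dist_to_real(linear_walk(...), real)` with `min_dist_to_real(guided_walk(...), real)` for the same endpoints gives a rough feel for why a non-linear path can be "safer" than linear interpolation in the sense the abstract describes; in the paper, that safety is assessed with the identity classifier rather than a raw distance.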

Source journal

Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention

Self-citation rate

0.00%

Articles published