Privacy preserving histopathological image augmentation with Conditional Generative Adversarial Networks
Alexandra-Georgiana Andrei, Mihai Gabriel Constantin, Mara Graziani, Henning Müller, Bogdan Ionescu
Pattern Recognition Letters, Volume 188, Pages 185-192
Publication date: 2025-02-01
Publication type: Journal Article
DOI: 10.1016/j.patrec.2024.12.014
URL: https://www.sciencedirect.com/science/article/pii/S0167865524003696
Impact factor: 3.9 (JCR Q2, Computer Science, Artificial Intelligence)
Abstract
Deep learning approaches for histopathology image processing and analysis are attracting growing research interest, and with this comes a demand to extract more information from images. Pathology datasets are relatively small, mainly because of medical data confidentiality, legal questions, data complexity, and labeling costs. Typically, a large number of annotated images for different tissue subtypes are required as training samples for the learning algorithms. In this paper, we present a latent-to-image approach that applies a Conditional Deep Convolutional Generative Adversarial Network to generate synthetic images of human colorectal cancer and healthy tissue. We generate high-quality images of various tissue types that preserve the general structure and features of the source classes, and we investigate an important yet overlooked aspect of data generation: ensuring privacy-preserving capabilities. The quality of these images is evaluated through perceptual experiments with pathologists and with the Fréchet Inception Distance (FID) metric. Using the generated data to train classifiers improved MobileNet’s accuracy by 35.36% and also improved the accuracies of DenseNet, ResNet, and EfficientNet. We further validated the robustness and versatility of our model on a different dataset, yielding promising results. Additionally, we make a novel contribution by addressing security and privacy concerns in personal medical image data, ensuring that the “fingerprints” of training medical images are not contained in the synthetic images generated with the proposed model.
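For reference, the FID named in the abstract is conventionally computed by fitting a Gaussian to Inception-network features of the real images and another to features of the generated images, then taking the Fréchet distance between the two. The formula below is the standard definition, not a detail taken from this paper. The symbols are notation assumed here: \mu_r, \Sigma_r are the mean and covariance of the real-image features, and \mu_g, \Sigma_g are those of the generated-image features.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Lower values indicate that the generated feature distribution is closer to the real one; the value is zero when the two fitted Gaussians coincide.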
About the journal:
Pattern Recognition Letters aims at rapid publication of concise articles of broad interest in pattern recognition.
Subject areas include all the current fields of interest represented by the Technical Committees of the International Association of Pattern Recognition, and other developing themes involving learning and recognition.