S. Ellis, O. M. Manzanera, V. Baltatzis, Ibrahim Nawaz, A. Nair, L. L. Folgoc, S. Desai, Ben Glocker, J. Schnabel
Evaluation of 3D GANs for Lung Tissue Modelling in Pulmonary CT
The Journal of Machine Learning for Biomedical Imaging, published 2022-08-17
DOI: 10.59275/j.melba.2022-9e4b (https://doi.org/10.59275/j.melba.2022-9e4b)
Citations: 0
Abstract
Generative adversarial networks (GANs) can accurately model the distribution of complex, high-dimensional datasets such as images. This characteristic makes high-quality GANs useful for unsupervised anomaly detection in medical imaging. However, differences between training datasets, such as output image dimensionality and the appearance of semantically meaningful features, mean that GAN models from the natural image processing domain may not work 'out-of-the-box' for medical imaging applications, necessitating re-implementation and re-evaluation. In this work we adapt three GAN models to the task of modelling 3D healthy image patches from pulmonary CT and evaluate them. To the best of our knowledge, this is the first time that such a detailed evaluation has been performed. The deep convolutional GAN (DCGAN), styleGAN and bigGAN architectures were selected for investigation due to their ubiquity and high performance in natural image processing. We train different variants of these methods and assess their performance using the widely used Fréchet Inception Distance (FID). In addition, the quality of the generated images was evaluated in a human observer study, the ability of the networks to model 3D domain-specific features was investigated, and the structure of the GAN latent spaces was analysed. Results show that the 3D styleGAN approaches produce realistic-looking images with meaningful 3D structure, but suffer from mode collapse, which must be explicitly addressed during training to obtain diversity in the samples. Conversely, the 3D DCGAN models show a greater capacity for image variability, but at the cost of poor-quality images. The 3D bigGAN models provide an intermediate level of image quality, but most accurately model the distribution of selected semantically meaningful features. The results suggest that further development is required to realise a 3D GAN with sufficient representational capacity for patch-based lung CT anomaly detection, and we offer recommendations for future research, such as experimenting with other architectures and incorporating position encoding.
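
For readers unfamiliar with the FID mentioned above, the following is a minimal sketch of the Fréchet distance between two Gaussians fitted to feature embeddings of real and generated samples. It is an illustration under stated assumptions, not the authors' evaluation code: the function name `frechet_distance` and the toy data are hypothetical, and the abstract does not say which feature extractor was used for 3D CT patches, so the embeddings are simply taken as given arrays.

```python
import numpy as np


def frechet_distance(feats_real, feats_fake):
    """Frechet distance between Gaussians fitted to two feature sets.

    Each input is an (n_samples, n_features) array of embeddings.
    Assumption: the embeddings come from some fixed feature extractor;
    which one suits 3D CT patches is not specified here.
    """
    feats_real = np.asarray(feats_real, dtype=np.float64)
    feats_fake = np.asarray(feats_fake, dtype=np.float64)

    # Fit a Gaussian (mean and covariance) to each feature set.
    mu_r, mu_f = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    cov_r = np.atleast_2d(np.cov(feats_real, rowvar=False))
    cov_f = np.atleast_2d(np.cov(feats_fake, rowvar=False))

    # tr((cov_r cov_f)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_f, which are real and non-negative when
    # both covariances are positive semi-definite. Clipping guards
    # against tiny negative values caused by rounding.
    eigvals = np.linalg.eigvals(cov_r @ cov_f)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_r - mu_f
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_f) - 2.0 * tr_sqrt)


# Toy usage with synthetic 4-dimensional "features" (hypothetical data).
rng = np.random.default_rng(0)
real = rng.normal(size=(200, 4))
shifted = real + 3.0  # same covariance, mean moved by 3 in every dimension
print(frechet_distance(real, real))
print(frechet_distance(real, shifted))
```

In this toy case the two sets share a covariance, so the distance reduces to the squared distance between the means; identical sets give a value of essentially zero. Lower values indicate that the generated feature distribution is closer to the real one, though a single scalar of this kind cannot by itself reveal problems such as mode collapse.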