GI-AEE: GAN Inversion Based Attentive Expression Embedding Network For Facial Expression Editing
Yun Zhang, R. Liu, Yifan Pan, Dehao Wu, Yuesheng Zhu, Zhiqiang Bai
2021 IEEE International Conference on Image Processing (ICIP), September 2021. DOI: https://doi.org/10.1109/ICIP42928.2021.9506434

Abstract: Facial expression editing aims to modify a facial expression according to a given condition. Existing methods adopt an encoder-decoder architecture guided by an expression condition to produce the desired expression. However, these methods tend to produce artifacts and blur in expression-intensive regions, because they must simultaneously modify the expression-changed regions and keep the other attributes consistent with the source image. To address these issues, we propose a GAN inversion based Attentive Expression Embedding network (GI-AEE) for facial expression editing, which decouples the task using GAN inversion to alleviate the strong effect of the source image on the target image and produces high-quality expression editing results. Furthermore, unlike existing methods that directly embed the expression condition into the network, we propose an Attentive Expression Embedding module that embeds the corresponding expression vectors into different facial regions, producing more plausible results. Qualitative and quantitative experiments demonstrate that our method outperforms state-of-the-art expression editing methods.
Choose Settings Carefully: Comparing Action Unit Detection At Different Settings Using A Large-Scale Dataset
M. Bishay, Ahmed Ghoneim, M. Ashraf, Mohammad Mavadati
2021 IEEE International Conference on Image Processing (ICIP), September 2021. DOI: https://doi.org/10.1109/ICIP42928.2021.9506757

Abstract: In this paper, we investigate the impact of several commonly used settings for (a) preprocessing face images and (b) classification and training on Action Unit (AU) detection performance and complexity. Our investigation uses a large-scale dataset consisting of ~55K videos, collected in the wild, of participants watching commercial ads. The preprocessing settings include scaling the face to a fixed resolution, changing the color information (RGB to gray-scale), aligning the face, and cropping AU regions; the classification and training settings include the kind of classifier (multi-label vs. binary) and the amount of data used for training the models. To the best of our knowledge, no prior work has investigated the effect of these settings on AU detection. In our analysis we use CNNs as the baseline classification model.
{"title":"An Adversarial Collaborative-Learning Approach for Corneal Scar Segmentation with Ocular Anterior Segment Photography","authors":"Ke Wang, Guangyu Wang, Kang Zhang, Ting Chen","doi":"10.1109/ICIP42928.2021.9506621","DOIUrl":"https://doi.org/10.1109/ICIP42928.2021.9506621","url":null,"abstract":"Corneal scarring is a common eye disease that leads to reduced vision. An accurate diagnosis and segmentation of corneal scar is a critical in ensuring proper treatment. Deep neural networks have made great progress in medical image segmentation, but the training requires large amount of annotated data. Pixel-level corneal scar can only be annotated by experienced ophthalmologists, but eye structure annotation can be done easily by people with minimal medical knowledge. In this paper, we propose Dual-Eye-GAN Net (DEGNet), an end-to-end adversarial collaborative-learning corneal scar segmentation model. DEG-Net can improve segmentation quality with additional data that only has eye structure annotation. We collect the first corneal scar segmentation dataset in the form of anterior ocular photography. Experimental results demonstrate superiority to both supervised and semi-supervised approaches. This is the first empirical study on corneal scar segmentation with anterior ocular photography. The code and dataset can be found in https://github.com/kaisadadi/Dual-GAN-Net.","PeriodicalId":314429,"journal":{"name":"2021 IEEE International Conference on Image Processing (ICIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130978803","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Two-Phase Multimodal Image Fusion Using Convolutional Neural Networks","authors":"Kushal Kusram, S. Transue, Min-Hyung Choi","doi":"10.1109/ICIP42928.2021.9506703","DOIUrl":"https://doi.org/10.1109/ICIP42928.2021.9506703","url":null,"abstract":"The fusion of multiple imaging modalities presents an important contribution to machine vision, but remains an ongoing challenge due to the limitations in traditional calibration methods that perform a single, global alignment. For depth and thermal imaging devices, sensor and lens intrinsics (FOV, resolution, etc.) may vary considerably, making per-pixel fusion accuracy difficult. In this paper, we present AccuFusion, a two-phase non-linear registration method to fuse multimodal images at a per-pixel level to obtain an efficient and accurate image registration. The two phases: the Coarse Fusion Network (CFN) and Refining Fusion Network (RFN), are designed to learn a robust image-space fusion that provides a non-linear mapping for accurate alignment. By employing the refinement process, we obtain per-pixel displacements to minimize local alignment errors and observe an increase of 18% in average accuracy over global registration.","PeriodicalId":314429,"journal":{"name":"2021 IEEE International Conference on Image Processing (ICIP)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131678460","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"From Gradient Leakage To Adversarial Attacks In Federated Learning","authors":"Jia Qi Lim, Chee Seng Chan","doi":"10.1109/ICIP42928.2021.9506589","DOIUrl":"https://doi.org/10.1109/ICIP42928.2021.9506589","url":null,"abstract":"Deep neural networks (DNN) are widely used in real-life applications despite the lack of understanding on this technology and its challenges. Data privacy is one of the bottlenecks that is yet to be overcome and more challenges in DNN arise when researchers start to pay more attention to DNN vulnerabilities. In this work, we aim to cast the doubts towards the reliability of the DNN with solid evidence particularly in Federated Learning environment by utilizing an existing privacy breaking algorithm which inverts gradients of models to reconstruct the input data. By performing the attack algorithm, we exemplify the data reconstructed from inverting gradients algorithm as a potential threat and further reveal the vulnerabilities of models in representation learning. Pytorch implementation are provided at https://github.com/Jiaqi0602/adversarial-attack-from-leakage/","PeriodicalId":314429,"journal":{"name":"2021 IEEE International Conference on Image Processing (ICIP)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128823260","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mind The Structure: Adopting Structural Information For Deep Neural Network Compression
Homayun Afrabandpey, Anton Muravevy, H. R. Tavakoli, Honglei Zhang, Francesco Cricri, M. Gabbouj, Emre B. Aksu
2021 IEEE International Conference on Image Processing (ICIP), September 2021. DOI: https://doi.org/10.1109/ICIP42928.2021.9506102

Abstract: Deep neural networks have huge numbers of parameters and require many bits for representation. This hinders their adoption in decentralized environments where model transfer among different parties is characteristic of the environment while communication bandwidth is limited. Parameter quantization is a compression approach to this challenge that reduces the number of bits required to represent a model, e.g., a neural network. However, the majority of existing neural network quantization methods do not exploit the structural information of layers and parameters during quantization. In this paper, focusing on Convolutional Neural Networks (CNNs), we present a novel quantization approach that employs the structural information of neural network layers and their corresponding parameters. Starting from a pre-trained CNN, we categorize network parameters into groups based on the similarity of their layers and their spatial structure. The parameters of each group are independently clustered, and the centroid of each cluster serves as the representative for all parameters in the cluster. Finally, the centroids and the cluster indexes of the parameters are used as a compact representation of the parameters. Experiments with two different tasks, i.e., acoustic scene classification and image compression, demonstrate the effectiveness of the proposed approach.
Augmenting 3D Ultrasound Strain Elastography by Combining Bayesian Inference with Local Polynomial Fitting in Region-Growing-Based Motion Tracking
Shuojie Wen, Bo Peng, Hao Jiang, Junkai Cao, Jingfeng Jiang
2021 IEEE International Conference on Image Processing (ICIP), September 2021. DOI: https://doi.org/10.1109/ICIP42928.2021.9506520

Abstract: Accurately tracking large tissue motion over a sequence of ultrasound images is critically important to several clinical applications including, but not limited to, elastography, flow imaging, and ultrasound-guided motion compensation. However, tracking large in vivo tissue deformation in 3D is a challenging problem that requires further development. In this study, we explore a novel tracking strategy that combines Bayesian inference with local polynomial fitting. Since this strategy is incorporated into a region-growing block-matching motion tracking framework, we call it the Bayesian region-growing motion tracking with local polynomial fitting (BRGMT-LPF) algorithm. More specifically, unlike a conventional block-matching algorithm, we use a maximum a posteriori probability criterion to determine the “correct” three-dimensional displacement vector. The proposed BRGMT-LPF algorithm was evaluated using a tissue-mimicking phantom and ultrasound data acquired from a pathologically confirmed human breast tumor. The in vivo ultrasound data were acquired using a 3D whole-breast ultrasound scanner, while the tissue-mimicking phantom data were acquired using an experimental CMUT ultrasound transducer. To demonstrate the effectiveness of combining Bayesian inference with local polynomial fitting, the proposed method was compared to the original region-growing motion tracking algorithm (RGMT), region growing with Bayesian inference only (BRGMT), and region growing with local polynomial fitting (RGMT-LPF). Our preliminary data demonstrate that the proposed BRGMT-LPF algorithm improves the accuracy of motion tracking.
CMID: A New Dataset for Copy-Move Forgeries on ID Documents
Gaël Mahfoudi, F. Morain-Nicolier, F. Retraint, M. Pic
2021 IEEE International Conference on Image Processing (ICIP), September 2021. DOI: https://doi.org/10.1109/ICIP42928.2021.9506723

Abstract: Copy-move forgery has been widely studied, as it is a very common forgery. It is also the easiest forgery to create, and it poses serious security-related threats, in particular for remote ID onboarding, where companies ask their customers to send a photo of their ID document. It is then easy for a counterfeiter to alter the information on the document by copying and pasting letters within the photo. On the other hand, copy-move detection algorithms are known to perform worse in the presence of similar but genuine objects, preventing their use in practical situations like remote ID onboarding. In this article, we propose a novel public copy-move dataset containing forged ID documents and evaluate the current state of the art on this dataset to assess its potential use in practical situations.
{"title":"GSLD: A Global Scanner with Local Discriminator Network for Fast Detection of Sparse Plasma Cell in Immunohistochemistry","authors":"Qi Zhang, Zhu Meng, Zhicheng Zhao, Fei Su","doi":"10.1109/ICIP42928.2021.9506782","DOIUrl":"https://doi.org/10.1109/ICIP42928.2021.9506782","url":null,"abstract":"Compared with abundant application of deep learning on hematoxylin and eosin (H&E) images, the study on immunohistochemical (IHC) images is almost blank, while the diagnosis of chronic endometritis mainly relies on the detection of plasma cells in IHC images. In this paper, a novel framework named Global Scanner with Local Discriminator (GSLD) is proposed to detect plasma cells with highly sparse distribution in IHC whole slide images (WSI) effectively and efficiently. Firstly, input an IHC image, the Global Scanner subnetwork (GSNet) predicts a distribution map, where the candidate plasma cells are localized quickly. Secondly, based on the distribution map, the Local Discriminator subnetwork (LDNet)discriminates true plasma cells by adopting only local information, which greatly speeds up the detection. Moreover, a novel grid-oversampling strategy for WSI preprocessing is proposed to relieve sample imbalance problem. Experimentas show that the proposed framework outperforms the representative object detection networks in both speed and accuracy.","PeriodicalId":314429,"journal":{"name":"2021 IEEE International Conference on Image Processing (ICIP)","volume":"PP 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126684572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Knowledge Transferred Fine-Tuning for Anti-Aliased Convolutional Neural Network in Data-Limited Situation
Satoshi Suzuki, Shoichiro Takeda, Ryuichi Tanida, H. Kimata, Hayaru Shouno
2021 IEEE International Conference on Image Processing (ICIP), September 2021. DOI: https://doi.org/10.1109/ICIP42928.2021.9506696

Abstract: Anti-aliased convolutional neural networks (CNNs) introduce blur filters into intermediate representations in CNNs to achieve high accuracy. A promising way to build a new anti-aliased CNN is to fine-tune a pre-trained CNN, which can easily be found online, with blur filters. However, blur filters drastically degrade the pre-trained representation, so fine-tuning needs to rebuild the representation using massive training data. Therefore, if the training data is limited, fine-tuning cannot work well because it induces overfitting to the limited training data. To tackle this problem, this paper proposes “knowledge transferred fine-tuning”. On the basis of the idea of knowledge transfer, our method transfers knowledge from intermediate representations in the pre-trained CNN to the anti-aliased CNN while fine-tuning. We transfer only essential knowledge, using a pixel-level loss that transfers detailed knowledge and a global-level loss that transfers coarse knowledge. Experimental results demonstrate that our method significantly outperforms simple fine-tuning.