{"title":"Recurrent Attentive Decomposition Network for Low-Light Image Enhancement","authors":"Haoyu Gao, Lin Zhang, Shunli Zhang","doi":"10.1109/ICIP46576.2022.9897342","DOIUrl":"https://doi.org/10.1109/ICIP46576.2022.9897342","url":null,"abstract":"This paper aims to solve the problems of Low-light image enhancement based on classical method RetinexNet. Given the problems of original results with lots of noise and color distortion, this paper proposes a novel recurrent attentive decomposition network, which combines spatial attention mechanism and Encoder-Decoder structure to better capture the key information of images and make a thorough image decomposition process. Furthermore, another network based on attention mechanism is added to denoise the reflection image and improve the restoration effect of image details. Compared with RetinexNet and other popular methods, the overall style of images processed by our method is more consistent with that of the real scene. Both visual comparison and quantity comparison of Structural Similarity(SSIM) and Peak Signal to Noise Ratio(PSNR) demonstrate that our method is with superiority to several state-of-the-art methods.","PeriodicalId":387035,"journal":{"name":"2022 IEEE International Conference on Image Processing (ICIP)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132973434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Yolo-SG: Salience-Guided Detection Of Small Objects In Medical Images","authors":"Rong Han, Xiaohong Liu, Ting Chen","doi":"10.1109/ICIP46576.2022.9898077","DOIUrl":"https://doi.org/10.1109/ICIP46576.2022.9898077","url":null,"abstract":"Object detection, a crucial component of medical image analysis, provides physicians with an interpretable auxiliary diagnostic basis. Although existing object detection models have had great success with natural images, the growing resolution of medical images makes the problem especially challenging because of the increased expectations to exploit the image details and discover small targets in images. For instance, lesions are occasionally diminutive relative to high-resolution medical images. To address this problem, we present YOLO-SG, a salience-guided (SG) deep learning model that improves small object detection by attending to detailed regions via a generated salience map. YOLO-SG performs two rounds of detection: coarse detection and salience-guided detection. In the first round of coarse detection, YOLO-SG detects objects using a deep convolutional detection model and proposes a salience map utilizing the context surrounding objects to guide the subsequent round of detection. In the second round, YOLO-SG extracts salient regions from the original input image based on the generated salience map and combines local detail with global context information to improve the object detection performance. The experimental results demonstrate that YOLO-SG outperforms the state-of-the-art models, especially when detecting small objects.","PeriodicalId":387035,"journal":{"name":"2022 IEEE International Conference on Image Processing (ICIP)","volume":"100 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132791364","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Coronary Artery Centerline Tracking with the Morphological Skeleton Loss","authors":"Mario Viti, H. Talbot, B. Abdallah, E. Perot, N. Gogin","doi":"10.1109/ICIP46576.2022.9897385","DOIUrl":"https://doi.org/10.1109/ICIP46576.2022.9897385","url":null,"abstract":"Coronary computed tomography angiography (CCTA) provides a non-invasive imaging solution that reliably depicts the anatomy of coronary arteries. Diagnosing coronary artery diseases (CAD) entails a clinical evaluation of stenosis and plaques, which is in turn essential for obtaining a reliable coronary-artery centerline from CCTA 3D imaging. This work proposes a centerline extraction algorithm by combining local semantic segmentation and recursive tracking. To this end we propose a Morphological Skeleton Loss (MS_Loss) suited for 3D centerline segmentation based on an improved morphological skeleton algorithm coupled with a resource-efficient back-propagation scheme. This work employs 225 CCTA examinations paired with manually annotated coronary-artery centerlines. This method is compared against the deep-learning state of the art in the literature using a standardized evaluation method for coronary-artery tracking.","PeriodicalId":387035,"journal":{"name":"2022 IEEE International Conference on Image Processing (ICIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130967232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Human-Centric Image Retrieval with Gaze-Based Image Captioning","authors":"Yuhu Feng, Keisuke Maeda, Takahiro Ogawa, M. Haseyama","doi":"10.1109/ICIP46576.2022.9897949","DOIUrl":"https://doi.org/10.1109/ICIP46576.2022.9897949","url":null,"abstract":"This paper presents human-centric image retrieval with gaze-based image captioning. Although the development of cross-modal embedding techniques has enabled advanced image retrieval, many methods have focused only on the information obtained from the contents such as image and text. For further extending the image retrieval, it is necessary to construct retrieval techniques that directly reflect human intentions. In this paper, we propose a new retrieval approach via image captioning based on gaze information by focusing on the fact that the gaze information obtained from humans contains semantic information. Specifically, we construct a transformer, connect caption and gaze trace (CGT) model that learns the relationship among images, captioning provided by humans and gaze traces. Our CGT model enables transformer-based learning by dividing the gaze traces into several bounding boxes, and thus, gaze-based image captioning becomes feasible. By using the obtained captioning for cross-modal retrieval, we can achieve human-centric image retrieval. The technical contribution of this paper is transforming the gaze trace into the captioning via the transformer-based encoder. In the experiments, by comparing the cross-modal embedding method, the effectiveness of the proposed method is proved.","PeriodicalId":387035,"journal":{"name":"2022 IEEE International Conference on Image Processing (ICIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131005523","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hierarchical Training for Distributed Deep Learning Based on Multimedia Data over Band-Limited Networks","authors":"Siyu Qi, Lahiru D. Chamain, Zhi Ding","doi":"10.1109/ICIP46576.2022.9897383","DOIUrl":"https://doi.org/10.1109/ICIP46576.2022.9897383","url":null,"abstract":"Distributed deep learning (DL) plays a critical role in many wireless Internet of Things (IoT) applications including remote camera deployment. This work addresses three practical challenges in cyber-deployment of distributed DL over band-limited channels. Specifically, many IoT systems consist of sensor nodes for raw data collection and encoding, and servers for learning and inference tasks. Adaptation of DL over band-limited network data links has only been scantly addressed. A second challenge is the need for pre-deployed encoders being compatible with flexible decoders that can be upgraded or retrained. The third challenge is the robustness against erroneous training labels. Addressing these three challenges, we develop a hierarchical learning strategy to improve image classification accuracy over band-limited links between sensor nodes and servers. Experimental results show that our hierarchically-trained models can improve link spectrum efficiency without performance loss, reduce storage and computational complexity, and achieve robustness against training label corruption.","PeriodicalId":387035,"journal":{"name":"2022 IEEE International Conference on Image Processing (ICIP)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133486730","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Subjective Quality Evaluation of Point Clouds with 3D Stereoscopic Visualization","authors":"João Prazeres, Manuela Pereira, A. Pinheiro","doi":"10.1109/ICIP46576.2022.9897937","DOIUrl":"https://doi.org/10.1109/ICIP46576.2022.9897937","url":null,"abstract":"In this paper, a subjective evaluation of static point clouds encoded with several codecs is described. Unlike other studies, a stereoscopic 3D display was used to visualize the 3D representation. A set of six point clouds were encoded using a set of state of the art point cloud coding solutions, notably the two MPEG codecs V-PCC and G-PCC, a deep learning solution RS-DLPCC that was the response to a call for evidence on point cloud coding of JPEG Pleno, and the popular DRACO codec. The results of this subjective quality evaluation using a 3D representation visualized in a stereoscopic display were compared with a previous subjective study that used the same content visualized in a 2D display. For that, the results of both tests were compared with the Pearson correlation, Spearman rank order correlation, the root mean square error and the outlier ratio. Moreover, the two subjective evaluation results were statistically analysed to seek for any statistical difference. The two subjective evaluations reveal a very high level of similarity.","PeriodicalId":387035,"journal":{"name":"2022 IEEE International Conference on Image Processing (ICIP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133625393","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards Model Quantization on the Resilience Against Membership Inference Attacks","authors":"C. Kowalski, Azadeh Famili, Yingjie Lao","doi":"10.1109/ICIP46576.2022.9897681","DOIUrl":"https://doi.org/10.1109/ICIP46576.2022.9897681","url":null,"abstract":"As neural networks get deeper and more computationally intensive, model quantization has emerged as a promising compression tool offering lower computational costs with limited performance degradation, enabling deployment on edge devices. Meanwhile, recent studies have shown that neural network models are vulnerable to various security and privacy threats. Among these, membership inference attacks (MIAs) are capable of breaching user privacy by identifying training data from neural network models. This paper investigates the impact of model quantization on the resistance of neural networks against MIA through empirical studies. We demonstrate that quantized models are less likely to leak private information of training data than their full precision counterparts. Our experimental results show that the precision MIA attack on quantized models is 7 to 9 points lower than their counterparts when the recall is the same. To the best of our knowledge, this paper is the first work to study the implication of model quantization on the resistance of neural network models against MIA.","PeriodicalId":387035,"journal":{"name":"2022 IEEE International Conference on Image Processing (ICIP)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133663461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Face Reconstruction from Deep Facial Embeddings using a Convolutional Neural Network","authors":"Hatef Otroshi-Shahreza, Vedrana Krivokuća Hahn, S. Marcel","doi":"10.1109/ICIP46576.2022.9897535","DOIUrl":"https://doi.org/10.1109/ICIP46576.2022.9897535","url":null,"abstract":"State-of-the-art (SOTA) face recognition systems generally use deep convolutional neural networks (CNNs) to extract deep features, called embeddings, from face images. The face embeddings are stored in the system’s database and are used for recognition of the enrolled system users. Hence, these features convey important information about the user’s identity, and therefore any attack using the face embeddings jeopardizes the user’s security and privacy. In this paper, we propose a CNN-based structure to reconstruct face images from face embeddings and we train our network with a multi-term loss function. In our experiments, our network is trained to reconstruct face images from SOTA face recognition models (ArcFace and ElasticFace) and we evaluate our face reconstruction network on the MOBIO and LFW datasets. The source code of all the experiments presented in this paper is publicly available so our work can be fully reproduced.","PeriodicalId":387035,"journal":{"name":"2022 IEEE International Conference on Image Processing (ICIP)","volume":"120 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132386370","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep Neural Network-Based Noisy Pixel Estimation for Breast Ultrasound Segmentation","authors":"Songbai Jin, Wen-kai Lu, P. Monkam","doi":"10.1109/ICIP46576.2022.9898006","DOIUrl":"https://doi.org/10.1109/ICIP46576.2022.9898006","url":null,"abstract":"The success of modern deep learning algorithms for image segmentation heavily relies on the availability of high-quality labels for training. However, obtaining accurate labels is time-consuming and tedious, and requires expertise. If directly trained with dataset with noisy annotations, networks can easily overfit to noisy labels and result in poor performance, which might lead to serious misinterpretation. To this end, we propose a noisy pixel estimation approach based on deep neural network, which helps correct the noisy annotations resulting in better prediction performance. First, a deep neural network is trained to detect noisy pixels from image annotations. Then, the estimated noisy pixels are used to correct the noisy annotations. Finally, the corrected annotations are used to train the deep learning model. Our proposed framework is validated on the breast tumor segmentation task. The obtained experimental results show that our proposed method can improve the robustness of deep learning model under noisy annotations while achieving favorable performance against existing noisy label correction methods.","PeriodicalId":387035,"journal":{"name":"2022 IEEE International Conference on Image Processing (ICIP)","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132429326","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Downsampling Based Light Field Video Coding with Restoration Network Using Joint Spatio-Angular and Epipolar Information","authors":"V. V. Duong, T. N. Huu, Jonghoon Yim, B. Jeon","doi":"10.1109/ICIP46576.2022.9897948","DOIUrl":"https://doi.org/10.1109/ICIP46576.2022.9897948","url":null,"abstract":"This paper proposes a new downsampling-based light field video coding (D-LFVC) framework whose success relies on how to design an effective restoration method that can remove artifacts brought by both downsampling and compression. Since light field (LF) video is of high dimensionality data, the restoration methods designed for conventional 2D video are sub-optimal solutions for our D-LFVC. In this regard, we design a new restoration network, named \"LF-QEN,\" for our D-LFVC framework. Specifically, the network contains three different feature extractor modules, allowing us to simultaneously exploit information from different kinds of 4D LF representation: spatial, angular, and epipolar image information. Our experimental results show that, compared to compression by HEVC-SCC standard, the proposed framework can obtain not only nearly 50% bitrate savings but also can significantly enhance the quality of decoded LF video.","PeriodicalId":387035,"journal":{"name":"2022 IEEE International Conference on Image Processing (ICIP)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133123008","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}