{"title":"Frequency domain-based reversible adversarial attacks for privacy protection in Internet of Things","authors":"Yang Lu, Tianfeng Ma, Zilong Pang, Xiuli Chai, Zhen Chen, Zongwei Tang","doi":"10.1117/1.jei.33.4.043049","DOIUrl":"https://doi.org/10.1117/1.jei.33.4.043049","url":null,"abstract":"Images shared on social networks often contain a large amount of private information. Bad actors can use deep learning technology to analyze private information from these images, thus causing user privacy leakage. To protect the privacy of users, reversible adversarial examples (RAEs) are proposed, and they may keep malignant models from accessing the image data while ensuring that the authorized model can recover the source data. However, existing RAEs have shortcomings in imperceptibility and attack capability. We utilize frequency domain information to generate RAEs. To improve the attack capability, the RAEs are generated by discarding the discriminant information of the original class and adding specific perturbation information. For imperceptibility, we propose to embed the perturbation in the wavelet domain of the image. Also, we design low-frequency constraints to distribute the perturbations in the high-frequency region and to ensure the similarity between the original examples and RAEs. In addition, the momentum pre-processing method is proposed to ensure that the direction of the gradient is consistent in each iteration by pre-converging the gradient before the formal iteration, thus accelerating the convergence speed of the gradient, which can be applied to the generation process of RAEs to speed up the generation of RAEs. Experimental results on the ImageNet, Caltech-256, and CIFAR-10 datasets show that the proposed method exhibits the best attack capability and visual quality compared with existing RAE generation schemes. The attack success rate and peak signal-to-noise ratio exceed 99% and 42 dB, respectively. 
In addition, the generated RAEs demonstrate good transferability and robustness.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"44 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142202444","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Real-world image denoising via efficient diffusion model with controllable noise generation","authors":"Cheng Yang, Cong Wang, Lijing Liang, Zhixun Su","doi":"10.1117/1.jei.33.4.043003","DOIUrl":"https://doi.org/10.1117/1.jei.33.4.043003","url":null,"abstract":"Real-world image denoising is a critical task in image processing, aiming to restore clean images from their noisy counterparts captured in natural environments. While diffusion models have demonstrated remarkable success in image generation, surpassing traditional generative models, their application to image denoising has been limited due to challenges in controlling noise generation effectively. We present a general denoising method inspired by diffusion models. Specifically, our approach employs a diffusion process with linear interpolation, enabling control of noise generation. By interpolating the intermediate noisy image between the original clean image and the corresponding real-world noisy one, our model is able to achieve controllable noise generation. Moreover, we introduce two sampling algorithms for this diffusion model: a straightforward procedure aligned with the diffusion process and an enhanced version that addresses the shortcomings of the former. 
Experimental results demonstrate that our proposed method, utilizing simple convolutional neural networks such as UNet, achieves denoising performance comparable to that of the transformer architecture.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"203 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141518512","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-scale point pair normal encoding for local feature description and 3D object recognition","authors":"Chu’ai Zhang, Yating Wang, Qiao Wu, Jiangbin Zheng, Jiaqi Yang, Siwen Quan, Yanning Zhang","doi":"10.1117/1.jei.33.4.043005","DOIUrl":"https://doi.org/10.1117/1.jei.33.4.043005","url":null,"abstract":"Recognizing three-dimensional (3D) objects based on local feature descriptors is a highly challenging task. Existing 3D local feature descriptors rely on single-scale surface normals, which are susceptible to noise and outliers, significantly compromising their effectiveness and robustness. A multi-scale point pair normal encoding (M-POE) method for 3D object recognition is proposed. First, we introduce the M-POE descriptor, which encodes voxelized features with multi-scale normals to describe local surfaces, exhibiting strong distinctiveness and robustness against various interferences. Second, we present guided sample consensus in second-order graphs (GSAC-SOG), an extension of RANSAC that incorporates geometric constraints and reduces sampling randomness, enabling accurate estimation of the object’s six-degree-of-freedom (6-DOF) pose. Finally, a 3D object recognition method based on the M-POE descriptor is proposed. The proposed method is evaluated on five standard datasets with state-of-the-art comparisons. 
The results demonstrate that (1) M-POE is robust, discriminative, and efficient; (2) GSAC-SOG is robust to outliers; (3) the proposed 3D object recognition method achieves high accuracy and robustness against clutter and occlusion, with recognition rates of 99.45%, 94.21%, and 97.88% on the U3OR, Queen, and CFV datasets, respectively.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"41 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141549669","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PGDIG-YOLO: a lightweight method for airport runway foreign object detection","authors":"Liushuai Zheng, Xinyu Chen, Liuchuang Zheng","doi":"10.1117/1.jei.33.4.043014","DOIUrl":"https://doi.org/10.1117/1.jei.33.4.043014","url":null,"abstract":"Aiming at the frequent misdetection and omission in the detection process of airport runway foreign object debris (FOD) and the difficulty of deploying the detection algorithm to embedded devices, we propose a lightweight FOD detection method called PGDIG-YOLO based on the improvement of YOLOv8n. First, a detection layer for detecting small-size objects is added and a large target detection layer is deleted to enhance the network’s ability to sense small-sized objects. Second, a dilation-wise residual module is introduced in the segmentation domain, and the C2FD module is proposed, which effectively solves the problem of misdetection and missed detection of FOD on airport runways. Third, the inner-WMPDIoUv3 is designed to replace the CIoU as a loss function to improve the regression accuracy of the detection frame. Finally, the model is pruned using the Group_sl method, which reduces the amount of computation, compresses the model size, and improves the model inference speed. The experimental results on the homemade dataset FOD-Z show that, compared with the benchmark model YOLOv8n, the model volume and computation of the PGDIG-YOLO network are only 6.6% and 44.4% of the original network, and the accuracy and recall are improved by 1.1% and 3.8%, respectively. Meanwhile, the mAP@0.5, mAP@0.75, and mAP@0.5:0.95 are increased to 99.1%, 93.7%, and 85.6%, respectively. 
Deploying PGDIG-YOLO to the NVIDIA Jetson Xavier NX 16 GB embedded device, the detection speed reaches 42 FPS, which can realize real-time FOD detection.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"34 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141584768","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep unsupervised nonconvex optimization for edge-preserving image smoothing","authors":"Yiwen Xiong, Yang Yang, Lanling Zeng, Xinyu Wang, Zhigeng Pan, Lei Jiang","doi":"10.1117/1.jei.33.4.043001","DOIUrl":"https://doi.org/10.1117/1.jei.33.4.043001","url":null,"abstract":"Edge-preserving image smoothing plays a vital role in the field of computational imaging. It is a valuable technique that has applications in various tasks. However, different tasks have specific requirements for edge preservation. Existing filters do not take into account the task-dependent smoothing behavior, resulting in visually distracting artifacts. We propose a flexible edge-preserving image filter based on a nonconvex Welsch penalty. Compared with the convex models, our model can better handle complex data and capture nonlinear relationships, thus providing better results. We combine deep unsupervised learning and graduated nonconvexity to solve our nonconvex objective function, where the main network structure is designed as a Swin transformer complemented with the locally enhanced feed-forward network. Experimental results show that the proposed method achieves excellent performance in various applications, including image smoothing, high dynamic range tone mapping, detail enhancement, and edge extraction.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"111 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141530025","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Copy-move forgery detection algorithm based on binarized statistical image features and principal component analysis","authors":"Azzedine Bensaad, Khaled Loukhaoukha, Said Sadoudi, Aissa Snani","doi":"10.1117/1.jei.33.4.043004","DOIUrl":"https://doi.org/10.1117/1.jei.33.4.043004","url":null,"abstract":"The most common form of image forgery is copy-move, which arises when an image region is duplicated and pasted onto another region of the same image. An effective algorithm for copy-move forgery detection based on binarized statistical image features (BSIF) and principal component analysis (PCA) is presented. Initially, the suspicious image is converted to grayscale and is subsequently partitioned into overlapping blocks. Feature vectors are extracted from these blocks using BSIF, followed by dimensionality reduction using PCA. Next, as a precursor to the matching step, the feature vectors are sorted lexicographically. Additionally, a morphological opening operation is applied to eliminate outliers. This algorithm offers not just forgery detection but also the ability to localize and identify duplicated regions. The proposed algorithm was assessed using three datasets: CoMoFoD, GRIP, and UNIPA. The experimental results show that this algorithm is fast and has high accuracy for forgery detection and localization. Moreover, it has high robustness under various postprocessing operations, such as brightness, contrast adjustments, and blurring. 
Furthermore, the proposed algorithm outperforms some recent approaches in overall performance.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"11 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141549670","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ResRetinaFace: an efficient face detection network based on RetinaFace and residual structure","authors":"Xuanyu Liu, Shuliang Zhang, Junjie Hu, Peiyu Mao","doi":"10.1117/1.jei.33.4.043012","DOIUrl":"https://doi.org/10.1117/1.jei.33.4.043012","url":null,"abstract":"The detection of multiple faces in unconstrained environment in deep learning suffers from insufficient detection accuracy and inefficiency; at the same time, the detection of blurred, occluded, and very small faces is even more unsatisfactory. The detection of blurred, occluded, and very small faces in multiple face detection in unconstrained environment is a hard problem in face detection nowadays. It is difficult to balance the detection accuracy and real-time efficiency in face detection with the improved RetinaFace chosen in this study. Therefore, in order to improve the efficiency of detecting blurred, occluded, and very small faces among multiple faces in unconstrained environments, we introduce deformable convolution, feature pyramid networks (FPN), and coordinate attention (CA) attention mechanism based on RetinaFace algorithm. Deformable convolution can be dynamically adjusted according to the shape and deformation of the recognized object and is no longer limited to a fixed-size square receptive field to improve the image feature extraction capability of the convolutional layer. FPN enhances the feature semantic information of the lower layers with a small increase in computational effort and improves the robustness of the detection algorithm to detect targets of different sizes. CA is a novel, lightweight, and efficient attention mechanism module for improving model performance, which can be easily integrated into mobile networks to improve accuracy with little additional computational overhead. 
The improved ResRetinaFace algorithm does not increase the computational overhead too much while improving the recognition accuracy, and it can better combine the characteristics of multiple postures and deformations of faces in complex scenes, adapt to the deformation state of faces’ postures, and provide more effective features for face detection, so as to pay better attention to the detection target and enhance the network characterization ability. Meanwhile, the improved algorithm combines the feature pyramid with the context module, which improves the detection effect in the case of blurred, occluded, and very small faces. The experimental outcomes demonstrate that, in contrast to the method before enhancement, the accuracy rates for easy, medium, and hard classification scenarios on the WIDER FACE dataset, utilizing the ResNet50 backbone network, are 94.83%, 93.28%, and 84.99%, respectively. Accompanied by a frames-per-second rate of 7.704, this meets the precision and real-time criteria for face measurement tasks. Validation on the WIDER FACE dataset further affirms that ResRetinaFace consistently achieves reliable face detection while maintaining high detection efficiency.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"48 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141568279","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving the deblurring method of D2Net network for infrared videos","authors":"Jia Zhang, Yanzhu Zhang, Fan Yang, Tingxue Li, Yuhai Li, He Zhao, Jixiong Pu","doi":"10.1117/1.jei.33.4.043013","DOIUrl":"https://doi.org/10.1117/1.jei.33.4.043013","url":null,"abstract":"When facing motion and complex environmental conditions, infrared videos captured by thermal imaging devices often suffer from blurring, leading to unclear or missing details and positional information about the targets. To improve this problem, this work proposes an improved deblurring method suitable for infrared videos based on a deep learning-based deblurring network originally designed for visible light images. This method is built upon the D2Net network by introducing a spatial and channel reconstruction convolution for feature redundancy, enhancing the network’s capability for image feature learning. In terms of the encoder-decoder module, a triple attention mechanism and fast Fourier transform are introduced to further improve the network’s deblurring performance. Through ablative experiments on infrared datasets, the results demonstrate a significant improvement in deblurring performance compared to the original D2Net. Specifically, the improved network achieved a 1.42 dB increase in peak signal-to-noise ratio and a 0.02 dB increase in structural similarity compared to the original network. 
In summary, this paper achieves promising results in infrared video deblurring tasks, demonstrating the effectiveness of the proposed method.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"25 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141568280","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Progressive reversible data hiding in encrypted images based on polynomial secret sharing and Chinese remainder theorem","authors":"Chao Jiang, Minqing Zhang, Zongbao Jiang, Yongjun Kong, Fuqiang Di","doi":"10.1117/1.jei.33.4.043008","DOIUrl":"https://doi.org/10.1117/1.jei.33.4.043008","url":null,"abstract":"In the current distributed environment, reversible data hiding in encrypted images has the disadvantages of low security and nonprogressivity. To address this problem, a homomorphic embedding algorithm is proposed based on polynomial secret sharing (PSS) and Chinese remainder theorem. First, the image owner encrypts the carrier image in streaming encryption and sends it to the data hider. Then, the data hider utilizes PSS to split the carrier image into n shares. At the same time, extra secrets after SS are embedded into the carrier shares using homomorphism. After splitting by Chinese remainder theorem, every share of the embedded data is divided into some sub-shares and then distributed to the participants. The participants that satisfy the threshold condition provide part or all of the sub-shares according to the authority of the data extractor. If each participant provides all sub-shares, the secrets and carrier image can be reconstructed completely. If each participant provides part of the sub-shares, the secrets and carrier image can be reconstructed partly. The experimental results show that the proposed scheme has progressivity, high security, and a large embedding rate (ER). 
Meanwhile, the ER is not affected by the carrier image.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"40 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141568282","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Highly compressed image encryption algorithm via fractal and semi-tensor product compressed sensing","authors":"Lin Fan, Meng Li","doi":"10.1117/1.jei.33.4.043026","DOIUrl":"https://doi.org/10.1117/1.jei.33.4.043026","url":null,"abstract":"Storage space and security concerns on multimedia images have emerged as a global issue in recent years. Image encryption algorithm via compressed sensing (CS) is an effective method for data security and reducing storage space. However, the existing CS-based image encryption still faces problems, such as weak resistance to attacks and extensive data storage. We design a high-compression image encryption algorithm that combines fractal and semi-tensor product compressed sensing. First, a measurement matrix required for CS is generated using fractal blocks combined with the semi-tensor product method, which enhances security while reducing the size of the measurement matrix. Then, the measurements obtained from the sampling are used to define the product features of their mean and standard deviation. Exclusion criteria are set, and fractal codes are obtained through matched searching. Finally, the fractal code undergoes scrambling and diffusion, providing triple-layer protection and further improving the security of the secret image. In comparison to conventional methods, our proposed method has greatly improved the compression efficiency through compressed sampling and has the advantages of better concealment and enhanced robustness. 
Experiments show that we substantiate the effectiveness and superior performance of our method, all while upholding image quality and security.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"122 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141771014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}