{"title":"Light field salient object detection network based on feature enhancement and mutual attention","authors":"Xi Zhu, Huai Xia, Xucheng Wang, Zhenrong Zheng","doi":"10.1117/1.jei.33.5.053001","DOIUrl":"https://doi.org/10.1117/1.jei.33.5.053001","url":null,"abstract":"Light field salient object detection (SOD) is an essential research topic in computer vision, but robust saliency detection in complex scenes is still very challenging. We propose a new method for accurate and robust light field SOD via convolutional neural networks containing feature enhancement modules. First, the light field dataset is extended by geometric transformations such as stretching, cropping, flipping, and rotating. Next, two feature enhancement modules are designed to extract features from RGB images and depth maps, respectively. The obtained feature maps are fed into a two-stream network to train the light field SOD. We propose a mutual attention approach in this process, extracting and fusing features from RGB images and depth maps. Therefore, our network can generate an accurate saliency map from the input light field images after training. The obtained saliency map can provide reliable a priori information for tasks such as semantic segmentation, target recognition, and visual tracking. Experimental results show that the proposed method achieves excellent detection performance in public benchmark datasets and outperforms the state-of-the-art methods. We also verify the generalization and stability of the method in real-world experiments.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"8 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142202397","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Small space target detection using megapixel resolution CeleX-V camera","authors":"Yuanyuan Lv, Liang Zhou, Zhaohui Liu, Wenlong Qiao, Haiyang Zhang","doi":"10.1117/1.jei.33.5.053002","DOIUrl":"https://doi.org/10.1117/1.jei.33.5.053002","url":null,"abstract":"An event camera (EC) is a bioinspired vision sensor with the advantages of a high temporal resolution, high dynamic range, and low latency. Due to the inherent sparsity of space target imaging data, EC becomes an ideal imaging sensor for space target detection. In this work, we conduct detection of small space targets using a CeleX-V camera with a megapixel resolution. We propose a target detection method based on field segmentation, utilizing the event output characteristics of an EC. This method enables real-time monitoring of the spatial positions of space targets within the camera’s field of view. The effectiveness of this approach is validated through experiments involving real-world observations of space targets. Using the proposed method, real-time observation of space targets with a megapixel resolution EC becomes feasible, demonstrating substantial practical potential in the field of space target detection.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"3 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142202398","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Video anomaly detection based on frame memory bank and decoupled asymmetric convolutions","authors":"Min Zhao, Chuanxu Wang, Jiajiong Li, Zitai Jiang","doi":"10.1117/1.jei.33.5.053006","DOIUrl":"https://doi.org/10.1117/1.jei.33.5.053006","url":null,"abstract":"Video anomaly detection (VAD) is essential for monitoring systems. The prediction-based methods identify anomalies by comparing differences between the predicted and real frames. We propose an unsupervised VAD method based on frame memory bank (FMB) and decoupled asymmetric convolution (DAConv), which addresses three problems encountered with auto-encoders (AE) in VAD: (1) how to mitigate the noise resulting from jittering between frames, which is ignored; (2) how to alleviate the insufficient utilization of temporal information by traditional two-dimensional (2D) convolution and the burden for more computing resources in three-dimensional (3D) convolution; and (3) how to make full use of normal data to improve the reliability of anomaly discrimination. Specifically, we initially design a separate network to calibrate video frames within the dataset. Second, we design DAConv to extract features from the video, addressing the absence of temporal dimension information in 2D convolutions and the high computational complexity of 3D convolutions. Concurrently, the interval-frame mechanism mitigates the problem of information redundancy caused by data reuse. Finally, we embed an FMB to store features of normal events, amplifying the contrast between normal and abnormal frames. We conduct extensive experiments on the UCSD Ped2, CUHK Avenue, and ShanghaiTech datasets, achieving AUC values of 98.7%, 90.4%, and 74.8%, respectively, which fully demonstrates the rationality and effectiveness of the proposed method.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"105 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142202395","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Attention-injective scale aggregation network for crowd counting","authors":"Haojie Zou, Yingchun Kuang, Jianqiang Luo, Mingwei Yao, Haoyu Zhou, Sha Yang","doi":"10.1117/1.jei.33.5.053008","DOIUrl":"https://doi.org/10.1117/1.jei.33.5.053008","url":null,"abstract":"Crowd counting has gained widespread attention in the fields of public safety management, video surveillance, and emergency response. Currently, background interference and scale variation of the head are still intractable problems. We propose an attention-injective scale aggregation network (ASANet) to cope with the above problems. ASANet consists of three parts: shallow feature attention network (SFAN), multi-level feature aggregation (MLFA) module, and density map generation (DMG) network. SFAN effectively overcomes the noise impact of a cluttered background by cross-injecting the attention module in the truncated VGG16 structure. To fully utilize the multi-scale crowd information embedded in the feature layers at different positions, we densely connect the multi-layer feature maps in the MLFA module to solve the scale variation problem. In addition, to capture large-scale head information, the DMG network introduces successive dilated convolutional layers to further expand the receptive field of the model, thus improving the accuracy of crowd counting. We conduct extensive experiments on five public datasets (ShanghaiTech Part_A, ShanghaiTech Part_B, UCF_QNRF, UCF_CC_50, JHU-Crowd++), and the results show that ASANet outperforms most of the existing methods in terms of counting and at the same time demonstrates satisfactory superiority in dealing with background noise in different scenes.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"94 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142202400","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Infrared and visible image fusion based on global context network","authors":"Yonghong Li, Yu Shi, Xingcheng Pu, Suqiang Zhang","doi":"10.1117/1.jei.33.5.053016","DOIUrl":"https://doi.org/10.1117/1.jei.33.5.053016","url":null,"abstract":"Thermal radiation and texture data from two different sensor types are usually combined in the fusion of infrared and visible images for generating a single image. In recent years, convolutional neural network (CNN) based on deep learning has become the mainstream technology for many infrared and visible image fusion methods, which often extracts shallow features and ignores the role of long-range dependencies in the fusion task. However, due to its local perception characteristics, CNN can only obtain global contextual information by continuously stacking convolutional layers, which leads to low network efficiency and difficulty in optimization. To address this issue, we proposed a global context fusion network (GCFN) to model context using a global attention pool, which adopts a two-stage strategy. First, a GCFN-based autoencoder network is trained for extracting multi-scale local and global contextual features. To effectively incorporate the complementary information of the input image, a dual branch fusion network combining CNN and transformer is designed in the second step. Experimental results on a publicly available dataset demonstrate that the proposed method outperforms nine advanced methods in fusion performance on both subjective and objective metrics.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"23 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142258310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generative object separation in X-ray images","authors":"Xiaolong Zheng, Yu Zhou, Jia Yao, Liang Zheng","doi":"10.1117/1.jei.33.5.053004","DOIUrl":"https://doi.org/10.1117/1.jei.33.5.053004","url":null,"abstract":"X-ray imaging is essential for security inspection; nevertheless, the penetrability of X-rays can cause objects within a package to overlap in X-ray images, leading to reduced accuracy in manual inspection and increased difficulty in auxiliary inspection techniques. Existing methods mainly focus on object detection to enhance the detection ability of models for overlapping regions by augmenting image features, including color, texture, and semantic information. However, these approaches do not address the underlying issue of overlap. We propose a novel method for separating overlapping objects in X-ray images from the perspective of image inpainting. Specifically, the separation method involves using a vision transformer (ViT) to construct a generative adversarial network (GAN) model that requires a hand-created trimap as input. In addition, we present an end-to-end approach that integrates Mask Region-based Convolutional Neural Network with the separation network to achieve fully automated separation of overlapping objects. Given the lack of datasets appropriate for training separation networks, we created MaskXray, a collection of X-ray images that includes overlapping images, trimap, and individual object images. Our proposed generative separation network was tested in experiments and demonstrated its ability to accurately separate overlapping objects in X-ray images. These results demonstrate the efficacy of our approach and make significant contributions to the field of X-ray image analysis.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"4 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142202393","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Vis-YOLO: a lightweight and efficient image detector for unmanned aerial vehicle small objects","authors":"Xiangyu Deng, Jiangyong Du","doi":"10.1117/1.jei.33.5.053003","DOIUrl":"https://doi.org/10.1117/1.jei.33.5.053003","url":null,"abstract":"Yolo series models are extensive within the domain of object detection. Aiming at the challenge of small object detection, we analyze the limitations of existing detection models and propose a Vis-YOLO object detection algorithm based on YOLOv8s. First, the down-sampling times are reduced to retain more features, and the detection head is replaced to adapt to the small object. Then, deformable convolutional networks are used to improve the C2f module, improving its feature extraction ability. Finally, the separation and enhancement attention module is introduced to the model to give more weight to the useful information. Experiments show that the improved Vis-YOLO model outperforms the YOLOv8s model on the visdrone-2019 dataset. The precision improved by 5.4%, the recall by 6.3%, and the mAP50 by 6.8%. Moreover, Vis-YOLO models are smaller and suitable for mobile deployment. This research provides a new method and idea for small object detection, which has excellent potential application value.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"37 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142202396","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Double-level deep multi-view collaborative learning for image clustering","authors":"Liang Xiao, Wenzhe Liu","doi":"10.1117/1.jei.33.5.053012","DOIUrl":"https://doi.org/10.1117/1.jei.33.5.053012","url":null,"abstract":"Multi-view clustering has garnered significant attention due to its ability to explore shared information from multiple views. Applications of multi-view clustering include image and video analysis, bioinformatics, and social network analysis, in which integrating diverse data sources enhances data understanding and insights. However, existing multi-view models suffer from the following limitations: (1) directly extracting latent representations from raw data using encoders is susceptible to interference from noise and other factors and (2) complementary information among different views is often overlooked, resulting in the loss of crucial unique information from each view. Therefore, we propose a distinctive double-level deep multi-view collaborative learning approach. Our method further processes the latent representations learned by the encoder through multiple layers of perceptrons to obtain richer semantic information. In addition, we introduce dual-path guidance at both the feature and label levels to facilitate the learning of complementary information across different views. Furthermore, we introduce pre-clustering methods to guide mutual learning among different views through pseudo-labels. Experimental results on four image datasets (Caltech-5V, STL10, Cifar10, Cifar100) demonstrate that our method achieves state-of-the-art clustering performance, evaluated using standard metrics, including accuracy, normalized mutual information, and purity. We compare our proposed method with existing clustering algorithms to validate its effectiveness.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"6 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142202402","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"USDAP: universal source-free domain adaptation based on prompt learning","authors":"Xun Shao, Mingwen Shao, Sijie Chen, Yuanyuan Liu","doi":"10.1117/1.jei.33.5.053015","DOIUrl":"https://doi.org/10.1117/1.jei.33.5.053015","url":null,"abstract":"Universal source-free domain adaptation (USFDA) aims to explore transferring domain-consistent knowledge in the presence of domain shift and category shift, without access to a source domain. Existing works mainly rely on prior domain-invariant knowledge provided by the source model, ignoring the significant discrepancy between the source and target domains. However, directly utilizing the source model will generate noisy pseudo-labels on the target domain, resulting in erroneous decision boundaries. To alleviate the aforementioned issue, we propose a two-stage USFDA approach based on prompt learning, named USDAP. Primarily, to reduce domain differences, during the prompt learning stage, we introduce a learnable prompt designed to align the target domain distribution with the source. Furthermore, for more discriminative decision boundaries, in the feature alignment stage, we propose an adaptive global-local clustering strategy. This strategy utilizes one-versus-all clustering globally to separate different categories and neighbor-to-neighbor clustering locally to prevent incorrect pseudo-label assignments at cluster boundaries. Based on the above two-stage method, target data are adapted to the classification network under the prompt’s guidance, forming more compact category clusters, thus achieving excellent migration performance for the model. We conduct experiments on various datasets with diverse category shift scenarios to illustrate the superiority of our USDAP.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"105 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142202404","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Appearance flow based structure prior guided image inpainting","authors":"Weirong Liu, Zhijun Li, Changhong Shi, Xiongfei Jia, Jie Liu","doi":"10.1117/1.jei.33.5.053011","DOIUrl":"https://doi.org/10.1117/1.jei.33.5.053011","url":null,"abstract":"Image inpainting techniques based on deep learning have shown significant improvements by introducing structure priors, but still generate structure distortion or textures fuzzy for large missing areas. This is mainly because series networks have inherent disadvantages: employing unreasonable structural priors will inevitably lead to severe mistakes in the second stage of cascade inpainting framework. To address this issue, an appearance flow-based structure prior (AFSP) guided image inpainting is proposed. In the first stage, a structure generator regards edge-preserved smooth images as global structures of images and then appearance flow warps small-scale features in input and flows to corrupted regions. In the second stage, a texture generator using contextual attention is designed to yield image high-frequency details after obtaining reasonable structure priors. Compared with state-of-the-art approaches, the proposed AFSP achieved visually more realistic results. Compared on the Places2 dataset, the most challenging with 1.8 million high-resolution images of 365 complex scenes, shows that AFSP was 1.1731 dB higher than the average peak signal-to-noise ratio for EdgeConnect.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"7 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142202399","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}