Journal of Electronic Imaging: Latest Articles

Receptive field enhancement and attention feature fusion network for underwater object detection
IF 1.1 | CAS Q4 | Computer Science
Journal of Electronic Imaging | Pub Date: 2024-05-01 | DOI: 10.1117/1.jei.33.3.033007
Huipu Xu, Zegang He, Shuo Chen
Abstract: Underwater environments have characteristics such as unclear imaging and complex backgrounds that lead to poor performance when mainstream object detection models are applied directly. To improve the accuracy of underwater object detection, we propose an object detection model, RF-YOLO, which uses a receptive field enhancement (RFE) module in the backbone network to enlarge the receptive field and extract more effective features. We design the free-channel iterative attention feature fusion module to reconstruct the neck network and fuse feature layers of different scales, achieving cross-channel attention feature fusion. We use Scylla intersection over union (SIoU) as the loss function of the model, which steers training toward the optimum through the angle cost, distance cost, shape cost, and IoU cost. Because the network parameters increase after adding modules and the model does not converge easily to an optimal state, we propose a training method that effectively mines the performance of the detection network. Experiments show that the proposed RF-YOLO achieves a mean average precision of 87.56% and 86.39% on the URPC2019 and URPC2020 datasets, respectively. Comparative and ablation experiments verify that the proposed network model has higher detection accuracy in complex underwater environments.
Citations: 0
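The SIoU loss named in this abstract combines angle, distance, shape, and IoU costs. Below is a minimal sketch of an SIoU-style loss for one box pair, following the published SIoU formulation (Gevorgyan, 2022); the abstract does not state which variant RF-YOLO uses, so the hyperparameter theta and the exact cost definitions should be read as assumptions.

```python
import math

def siou_loss(pred, target, theta=4.0, eps=1e-7):
    """Sketch of an SIoU-style loss: 1 - IoU plus averaged distance and
    shape costs, with the distance cost modulated by an angle cost."""
    px1, py1, px2, py2 = pred    # boxes as (x1, y1, x2, y2)
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # IoU term.
    inter = max(0.0, min(px2, tx2) - max(px1, tx1)) * \
            max(0.0, min(py2, ty2) - max(py1, ty1))
    union = pw * ph + tw * th - inter + eps
    iou = inter / union

    # Box centers and enclosing-box extents.
    pcx, pcy = (px1 + px2) / 2, (py1 + py2) / 2
    tcx, tcy = (tx1 + tx2) / 2, (ty1 + ty2) / 2
    cw = max(px2, tx2) - min(px1, tx1) + eps
    ch = max(py2, ty2) - min(py1, ty1) + eps

    # Angle cost: 1 when the centers sit at 45 degrees, 0 when axis-aligned.
    sigma = math.hypot(tcx - pcx, tcy - pcy) + eps
    sin_alpha = min(abs(tcy - pcy) / sigma, 1.0)
    angle = math.sin(2 * math.asin(sin_alpha))

    # Distance cost, modulated by the angle cost via gamma = 2 - angle.
    gamma = 2 - angle
    rho_x = ((tcx - pcx) / cw) ** 2
    rho_y = ((tcy - pcy) / ch) ** 2
    dist = (1 - math.exp(-gamma * rho_x)) + (1 - math.exp(-gamma * rho_y))

    # Shape cost: penalizes mismatched widths and heights.
    omega_w = abs(pw - tw) / max(pw, tw)
    omega_h = abs(ph - th) / max(ph, th)
    shape = (1 - math.exp(-omega_w)) ** theta + (1 - math.exp(-omega_h)) ** theta

    return 1 - iou + (dist + shape) / 2

print(siou_loss((0, 0, 4, 4), (1, 1, 5, 5)))
```

The angle cost lowers the distance penalty once the centers are near 45 degrees, which in the SIoU formulation encourages the predicted box to first align with an axis and then close the remaining distance.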
Posture-guided part learning for fine-grained image categorization
IF 1.1 | CAS Q4 | Computer Science
Journal of Electronic Imaging | Pub Date: 2024-05-01 | DOI: 10.1117/1.jei.33.3.033013
Wei Song, Dongmei Chen
Abstract: The challenge in fine-grained image classification lies in distinguishing subtle differences among fine-grained images. Existing image classification methods often explore information only in isolated regions, without considering the relationships among these parts, which yields incomplete information and a tendency to focus on individual parts. Posture information is hidden among these parts, so it plays a crucial role in differentiating similar categories. We therefore propose a posture-guided part learning framework capable of extracting the posture information hidden among regions. In this framework, the dual-branch feature enhancement module (DBFEM) highlights discriminative information related to fine-grained objects by extracting attention information between the feature space and channels. The part selection module selects multiple discriminative parts based on the attention information from the DBFEM. Building upon this, the posture feature fusion module extracts semantic features from the discriminative parts and constructs posture features among different parts from these semantic features. Finally, fusing the part semantic features with the posture features yields a comprehensive representation of fine-grained object features, aiding the differentiation of similar categories. Extensive evaluations on three benchmark datasets demonstrate the competitiveness of the proposed framework compared with state-of-the-art methods.
Citations: 0
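As a rough illustration of attention-guided part selection of the kind the part selection module performs, the sketch below picks the k strongest responses of an attention map and cuts patch boxes around them. The function name, patch size, and overlap-suppression rule are hypothetical, not taken from the paper.

```python
import torch

def select_parts(attn, k=4, patch=7):
    """Pick the k highest-response locations of a (H, W) attention map
    and return patch boxes around them, suppressing nearby duplicates."""
    H, W = attn.shape
    a = attn.clone()
    boxes = []
    for _ in range(k):
        idx = torch.argmax(a)               # flattened index of the peak
        y, x = divmod(idx.item(), W)
        y0, x0 = max(0, y - patch // 2), max(0, x - patch // 2)
        y1, x1 = min(H, y0 + patch), min(W, x0 + patch)
        boxes.append((y0, x0, y1, x1))
        a[y0:y1, x0:x1] = float("-inf")     # suppress overlapping picks
    return boxes

print(select_parts(torch.rand(14, 14)))
```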
Efficient and expressive high-resolution image synthesis via variational autoencoder-enriched transformers with sparse attention mechanisms
IF 1.1 | CAS Q4 | Computer Science
Journal of Electronic Imaging | Pub Date: 2024-05-01 | DOI: 10.1117/1.jei.33.3.033002
Bingyin Tang, Fan Feng
Abstract: We introduce a method for efficient and expressive high-resolution image synthesis, harnessing the power of variational autoencoders (VAEs) and transformers with sparse attention (SA) mechanisms. Using VAEs, we establish a context-rich vocabulary of image constituents, capturing intricate image features more effectively than traditional techniques. We then employ SA mechanisms within our transformer model, improving computational efficiency when dealing with the long sequences inherent to high-resolution images. Extending beyond traditional conditional synthesis, our model integrates both nonspatial and spatial information and incorporates temporal dynamics, enabling sequential image synthesis. Through rigorous experiments, we demonstrate the method's effectiveness in semantically guided synthesis of megapixel images. Our findings substantiate this method as a significant contribution to the field of high-resolution image synthesis.
Citations: 0
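Sparse attention can take many forms, and the abstract does not specify the pattern used here. A minimal sketch of one common variant, fixed non-overlapping local windows, is shown below: each query attends only to keys in its own window, cutting the cost from O(n^2) to roughly O(n * window).

```python
import torch
import torch.nn.functional as F

def windowed_attention(q, k, v, window=64):
    """Local-window sparse attention over (n, d) query/key/value
    matrices: queries attend only within their own window."""
    n, d = q.shape
    out = torch.empty_like(v)
    for start in range(0, n, window):
        end = min(start + window, n)
        # Scaled dot-product scores restricted to the window.
        scores = q[start:end] @ k[start:end].T / d ** 0.5
        out[start:end] = F.softmax(scores, dim=-1) @ v[start:end]
    return out

out = windowed_attention(torch.randn(256, 32), torch.randn(256, 32), torch.randn(256, 32))
print(out.shape)
```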
Test-time adaptation via self-training with future information
IF 1.1 | CAS Q4 | Computer Science
Journal of Electronic Imaging | Pub Date: 2024-05-01 | DOI: 10.1117/1.jei.33.3.033012
Xin Wen, Hao Shen, Zhongqiu Zhao
Abstract: Test-time adaptation (TTA) aims to address potential differences in data distribution between the training and testing phases by modifying a pretrained model based on each specific test sample. This is especially crucial for deep learning models, which often encounter frequent changes in the testing environment. Popular TTA methods currently rely primarily on pseudo-labels (PLs) as supervision signals and fine-tune the model through backpropagation, so the success of adaptation depends directly on the quality of the PLs: high-quality PLs enhance performance, whereas low-quality ones may lead to poor adaptation. Intuitively, if the PLs the model predicts for a sample remain consistent in both the current and future states, the prediction carries higher confidence, and using such consistent PLs as supervision signals greatly benefits long-term adaptation. Nevertheless, this approach may induce overconfidence in the model's predictions, so we introduce a regularization term that penalizes overly confident predictions. The proposed method is highly versatile and can be seamlessly integrated with various TTA strategies, making it very practical. We investigate different TTA methods on three widely used datasets (CIFAR10C, CIFAR100C, and ImageNetC) under different scenarios and show that our method achieves competitive or state-of-the-art accuracy on all of them.
Citations: 0
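As a hedged sketch of the idea described above: the loop below keeps only pseudo-labels that agree across two stochastic forward passes (a stand-in for the paper's current/future consistency check) and adds a negative-entropy term that penalizes overconfident predictions. The threshold, the consistency proxy, and the weighting are illustrative, not the paper's values.

```python
import torch
import torch.nn.functional as F

def tta_step(model, optimizer, x, tau=0.9, lam=0.1):
    """One TTA update on a test batch x, supervised by consistent,
    confident pseudo-labels with an anti-overconfidence penalty."""
    p_now = F.softmax(model(x), dim=1)
    p_alt = F.softmax(model(x), dim=1)   # differs if, e.g., dropout is active
    pl_now, pl_alt = p_now.argmax(1), p_alt.argmax(1)
    # Keep samples whose PLs are consistent and sufficiently confident.
    keep = (pl_now == pl_alt) & (p_now.max(1).values > tau)
    if keep.any():
        ce = F.cross_entropy(model(x[keep]), pl_now[keep])
        # Negative entropy: minimizing it pushes predictions away from
        # overconfidence (toward higher entropy).
        neg_entropy = (p_now[keep] * p_now[keep].clamp_min(1e-8).log()).sum(1).mean()
        loss = ce + lam * neg_entropy
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```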
Motion trajectory reconstruction degree: a key frame selection criterion for surveillance video
IF 1.1 | CAS Q4 | Computer Science
Journal of Electronic Imaging | Pub Date: 2024-05-01 | DOI: 10.1117/1.jei.33.3.033009
Yunzuo Zhang, Yameng Liu, Jiayu Zhang, Shasha Zhang, Shuangshuang Wang, Yu Cheng
Abstract: Key frame extraction focuses on capturing changes in motion state from surveillance videos and treating them as crucial content. However, existing key frame evaluation indicators cannot accurately assess whether an algorithm captures such changes. We therefore assess key frame extraction methods from the viewpoint of target trajectory reconstruction and put forth the motion trajectory reconstruction degree (MTRD), a key frame selection criterion based on preserving the target's global and local motion information. This evaluation indicator first extracts key frames with various key frame extraction methods and reconstructs the motion trajectory from these key frames using linear interpolation. The original target motion trajectories are then quantified and compared with the reconstructed set of motion trajectories. The smaller the MTRD discrepancy, the better the trajectory overlap, and the more accurately the key frames extracted by the method describe the video content. Finally, inspired by the novel MTRD criterion, we develop an MTRD-oriented key frame extraction method for surveillance video. Simulation results demonstrate that MTRD captures variations in global and local motion states more accurately and is more compatible with human visual perception.
Citations: 0
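The MTRD computation lends itself to a compact sketch: reconstruct the trajectory from the key frames by linear interpolation and measure the mean deviation from the original trajectory. The exact normalization the paper uses may differ; the discrepancy shown here is a plain per-frame Euclidean mean.

```python
import numpy as np

def mtrd(trajectory, key_idx):
    """Mean deviation between a (T, D) trajectory and its linear
    reconstruction from the frames listed in key_idx (smaller = better)."""
    t = np.arange(len(trajectory))
    key_idx = np.asarray(sorted(key_idx))
    recon = np.column_stack([
        np.interp(t, key_idx, trajectory[key_idx, d])
        for d in range(trajectory.shape[1])
    ])
    return np.linalg.norm(trajectory - recon, axis=1).mean()

# Example: a 100-frame (x, y) trajectory reconstructed from 5 key frames.
traj = np.column_stack([np.linspace(0, 9, 100), np.sin(np.linspace(0, 3, 100))])
print(mtrd(traj, [0, 30, 55, 80, 99]))
```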
HMNNet: research on exposure-based nighttime semantic segmentation
IF 1.1 | CAS Q4 | Computer Science
Journal of Electronic Imaging | Pub Date: 2024-05-01 | DOI: 10.1117/1.jei.33.3.033015
Yang Yang, Changjiang Liu, Hao Li, Chuan Liu
Abstract: In recent years, various segmentation models have been developed. However, due to the limited availability of nighttime datasets and the complexity of nighttime scenes, high-performance nighttime semantic segmentation models remain scarce. Analysis of nighttime scenes reveals that the primary challenges are overexposure and underexposure. In view of this, our proposed Histogram Multi-scale Retinex with Color Restoration and No-Exposure Semantic Segmentation Network (HMNNet) performs semantic segmentation of nighttime scenes and consists of three modules and a multi-head decoder. The three modules are Histogram, Multi-Scale Retinex with Color Restoration (MSRCR), and No Exposure (N-EX); they aim to enhance the robustness of image segmentation under different lighting conditions. The Histogram module prevents over-fitting to well-lit images; the MSRCR module enhances insufficiently lit images, improving object recognition and facilitating segmentation; and the N-EX module uses a dark channel prior to remove excess light covering the surface of an object. Extensive experiments show that the three modules suit different network models, can be inserted freely, and significantly improve segmentation of nighttime images while generalizing well. When added to the multi-head decoder network, mean intersection over union increases by 6.2% on the nighttime dataset Rebecca and by 1.5% on the daytime dataset CamVid.
Citations: 0
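The dark channel prior used by the N-EX module has a standard form: a per-pixel channel minimum followed by a local minimum filter (originally proposed for haze removal). A minimal sketch follows; the patch size is an assumption, and how HMNNet applies the resulting map to remove excess light is not shown here.

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(img, patch=15):
    """Dark channel of an (H, W, 3) image in [0, 1]: channel-wise
    minimum per pixel, then a patch-wise minimum filter."""
    per_pixel_min = img.min(axis=2)
    return minimum_filter(per_pixel_min, size=patch)

dc = dark_channel(np.random.rand(64, 64, 3))
print(dc.shape, dc.max())
```

Bright regions keep high dark-channel values, which is what makes the map useful for locating overexposed areas in a night scene.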
Deep metric learning method for open-set iris recognition
IF 1.1 | CAS Q4 | Computer Science
Journal of Electronic Imaging | Pub Date: 2024-05-01 | DOI: 10.1117/1.jei.33.3.033016
Guang Huo, Ruyuan Li, Jianlou Lou, Xiaolu Yu, Jiajun Wang, Xinlei He, Yue Wang
Abstract: Existing iris recognition methods offer excellent performance on known classes but perform poorly when faced with unknown classes; the process of identifying unknown classes is referred to as open-set recognition. To improve the robustness of iris recognition systems, this work integrates hash centers into a deep metric learning method for open-set iris recognition, called central similarity based deep hash. It first maps each iris category to a defined hash center using a hash center generation algorithm. OiNet is then trained so that each iris texture clusters around its corresponding hash center. At test time, cosine similarity is calculated for each pair of iris textures to estimate their similarity. In experiments on public datasets, with performance evaluated both within a dataset and across datasets, our method demonstrates substantial advantages over other open-set iris recognition algorithms.
Citations: 0
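A sketch of the open-set decision step described above: compare a probe embedding against enrolled templates by cosine similarity and reject below-threshold probes as unknown. The threshold and function names are illustrative, and the hash-center training of OiNet itself is not reproduced here.

```python
import numpy as np

def open_set_match(embedding, gallery, threshold=0.8):
    """Return (best_index, similarity) for a probe embedding against an
    (N, d) gallery, or (None, similarity) if no match clears the threshold."""
    e = embedding / np.linalg.norm(embedding)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = g @ e                      # cosine similarity to each template
    best = int(np.argmax(sims))
    if sims[best] >= threshold:
        return best, float(sims[best])
    return None, float(sims[best])    # rejected as an unknown class

probe, gallery = np.random.rand(128), np.random.rand(10, 128)
print(open_set_match(probe, gallery))
```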
KT-NeRF: multi-view anti-motion blur neural radiance fields
IF 1.1 | CAS Q4 | Computer Science
Journal of Electronic Imaging | Pub Date: 2024-05-01 | DOI: 10.1117/1.jei.33.3.033006
Yining Wang, Jinyi Zhang, Yuxi Jiang
Abstract: In three-dimensional (3D) reconstruction, neural radiance fields (NeRF) can implicitly represent high-quality 3D scenes. However, traditional NeRF places very high demands on the quality of the input images: with motion-blurred inputs, NeRF's multi-view consistency requirement cannot be met, which significantly degrades reconstruction quality. To address this problem, we propose KT-NeRF, which extends NeRF to motion-blurred scenes. Starting from the principle of motion blur, the method is derived from two-dimensional (2D) motion-blurred images to 3D space. A Gaussian process regression model is then introduced to estimate the camera's motion trajectory for each motion-blurred image, with the aim of learning accurate camera poses at key time stamps within the exposure time. The camera poses at the key time stamps are fed to the NeRF so that it can learn the blur information embedded in the images. Finally, the parameters of the Gaussian process regression model and the NeRF are jointly optimized to achieve multi-view anti-motion blur. Experiments show that KT-NeRF achieves a peak signal-to-noise ratio of 29.4 and a structural similarity index of 0.85, increases of 3.5% and 2.4%, respectively, over existing advanced methods; the learned perceptual image patch similarity is also reduced by 7.1% to 0.13.
Citations: 0
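Gaussian process regression over a camera trajectory can be sketched compactly: condition an RBF-kernel GP on poses at a few known time stamps and predict positions at the key time stamps inside the exposure window. The kernel choice and hyperparameters below are assumptions; KT-NeRF additionally optimizes the GP jointly with the NeRF, which is omitted here.

```python
import numpy as np

def gp_predict(t_train, y_train, t_query, length=0.1, noise=1e-4):
    """GP posterior mean: given (n,) times and (n, d) poses, predict
    poses at t_query using an RBF kernel with a small noise term."""
    def rbf(a, b):
        return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)
    K = rbf(t_train, t_train) + noise * np.eye(len(t_train))
    return rbf(t_query, t_train) @ np.linalg.solve(K, y_train)

# Example: camera positions sampled at five time stamps in [0, 1],
# queried at two key time stamps inside the exposure window.
t = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
poses = np.column_stack([np.sin(t), np.cos(t), t])
print(gp_predict(t, poses, np.array([0.1, 0.6])))
```

A GP is a natural fit here because camera motion during a single exposure is smooth, so a smooth kernel interpolates the in-exposure poses plausibly from sparse anchors.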
Three-dimensional shape estimation of wires from three-dimensional X-ray computed tomography images of electrical cables
IF 1.1 | CAS Q4 | Computer Science
Journal of Electronic Imaging | Pub Date: 2024-04-01 | DOI: 10.1117/1.jei.33.3.031209
Shiori Ueda, Kanon Sato, Hideo Saito, Yutaka Hoshina
Abstract: Electrical cables consist of numerous wires whose three-dimensional (3D) shape significantly impacts the cables' overall properties, such as bending stiffness. Although X-ray computed tomography (CT) provides a non-destructive way to assess these properties, accurately determining the 3D shape of individual wires from CT images is challenging due to the large number of wires, low image resolution, and the indistinguishable appearance of the wires. Previous research lacked quantitative evaluation of wire tracking, and its overall accuracy relied heavily on the accuracy of wire detection. In this study, we present a long short-term memory (LSTM)-based approach to wire tracking that improves robustness against detection errors: the method predicts wire positions in subsequent frames from previous frames. We evaluate its performance using both actual annotated cables and artificially noised annotations. Our method exhibits greater tracking accuracy and robustness to detection errors than the previous method.
Citations: 0
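The LSTM-based tracking idea admits a small sketch: feed a wire's positions in the previous CT slices to an LSTM and regress its position in the next slice, which can then be matched to the nearest detection. The layer sizes and the two-coordinate input below are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class WireTracker(nn.Module):
    """Predict a wire's (x, y) position in the next CT slice from its
    positions in the previous slices."""

    def __init__(self, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=2, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)

    def forward(self, prev_positions):          # (batch, frames, 2)
        out, _ = self.lstm(prev_positions)
        return self.head(out[:, -1])            # (batch, 2) next-slice position

# Usage: predict from the last 8 slices, then match the prediction to
# the nearest detection in the new slice.
model = WireTracker()
pred = model(torch.randn(1, 8, 2))
print(pred.shape)
```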
Flexible machine/deep learning microservice architecture for industrial vision-based quality control on a low-cost device
IF 1.1 | CAS Q4 | Computer Science
Journal of Electronic Imaging | Pub Date: 2024-03-01 | DOI: 10.1117/1.jei.33.3.031208
Stefano Toigo, Brendon Kasi, Daniele Fornasier, Angelo Cenedese
Abstract: This paper delineates a comprehensive method that integrates machine vision and deep learning for quality control in an industrial setting. The proposed approach leverages a microservice architecture that ensures adaptability and flexibility to different scenarios while focusing on affordable, compact hardware, and it achieves exceptionally high accuracy on the quality control task while keeping computation time minimal. The developed system operates entirely on a portable smart camera, eliminating the need for additional sensors such as photocells and for external computation, which simplifies the setup and commissioning phases and reduces the overall impact on the production line. By integrating the embedded system with the machinery, the approach offers real-time monitoring and analysis, facilitating swift detection of defects and deviations from desired standards. Moreover, the low cost of the solution makes it accessible to a wider range of manufacturing enterprises, democratizing quality processes in Industry 5.0. The system was successfully implemented and is fully operational in a real industrial environment, and the experimental results obtained from this implementation are presented in this work.
Citations: 0
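As a loose illustration of the microservice style described above, the sketch below exposes one inference service as a REST endpoint. The framework, endpoint name, and placeholder model are all assumptions; the paper does not specify its service interfaces.

```python
from fastapi import FastAPI, File, UploadFile

app = FastAPI()

def run_model(data: bytes) -> bool:
    # Placeholder for the on-camera inference step; a real deployment
    # would decode the frame and run the trained defect classifier.
    return len(data) > 0

@app.post("/inspect")
async def inspect(image: UploadFile = File(...)):
    """Hypothetical endpoint: receive a frame, return a pass/fail verdict."""
    verdict = run_model(await image.read())
    return {"pass": verdict}
```

Keeping each function behind its own small HTTP endpoint is what gives this kind of architecture its flexibility: services can be swapped or retrained independently of the rest of the line.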