Journal of Electronic Imaging: Latest Articles

Receptive field enhancement and attention feature fusion network for underwater object detection
IF 1.1 | CAS Q4 | Computer Science
Journal of Electronic Imaging | Pub Date: 2024-05-01 | DOI: 10.1117/1.jei.33.3.033007
Huipu Xu, Zegang He, Shuo Chen
Abstract: Underwater environments have characteristics such as unclear imaging and complex backgrounds that lead to poor performance when mainstream object detection models are applied directly. To improve the accuracy of underwater object detection, we propose an object detection model, RF-YOLO, which uses a receptive field enhancement (RFE) module in the backbone network to enlarge the receptive field and extract more effective features. We design the free-channel iterative attention feature fusion module to reconstruct the neck network and fuse feature layers of different scales, achieving cross-channel attention feature fusion. We use Scylla intersection over union (SIoU) as the loss function of the model, which steers training toward the optimum through the angle cost, distance cost, shape cost, and IoU cost. Because the network parameters increase after adding modules and the model does not converge easily to an optimal state, we propose a training method that effectively mines the performance of the detection network. Experiments show that the proposed RF-YOLO achieves a mean average precision of 87.56% and 86.39% on the URPC2019 and URPC2020 datasets, respectively. Comparative and ablation experiments verify that the proposed network model has higher detection accuracy in complex underwater environments.
Citations: 0
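The SIoU loss named in this abstract combines angle, distance, shape, and IoU costs. Below is a minimal sketch of an SIoU-style loss for one box pair, following the published SIoU formulation (Gevorgyan, 2022); the abstract does not state which variant RF-YOLO uses, so the hyperparameter theta and the exact cost definitions should be read as assumptions.

```python
import math

def siou_loss(pred, target, theta=4.0, eps=1e-7):
    """Sketch of an SIoU-style loss: 1 - IoU plus averaged distance and
    shape costs, with the distance cost modulated by an angle cost."""
    px1, py1, px2, py2 = pred    # boxes as (x1, y1, x2, y2)
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # IoU term.
    inter = max(0.0, min(px2, tx2) - max(px1, tx1)) * \
            max(0.0, min(py2, ty2) - max(py1, ty1))
    union = pw * ph + tw * th - inter + eps
    iou = inter / union

    # Box centers and enclosing-box extents.
    pcx, pcy = (px1 + px2) / 2, (py1 + py2) / 2
    tcx, tcy = (tx1 + tx2) / 2, (ty1 + ty2) / 2
    cw = max(px2, tx2) - min(px1, tx1) + eps
    ch = max(py2, ty2) - min(py1, ty1) + eps

    # Angle cost: 1 when the centers sit at 45 degrees, 0 when axis-aligned.
    sigma = math.hypot(tcx - pcx, tcy - pcy) + eps
    sin_alpha = min(abs(tcy - pcy) / sigma, 1.0)
    angle = math.sin(2 * math.asin(sin_alpha))

    # Distance cost, modulated by the angle cost via gamma = 2 - angle.
    gamma = 2 - angle
    rho_x = ((tcx - pcx) / cw) ** 2
    rho_y = ((tcy - pcy) / ch) ** 2
    dist = (1 - math.exp(-gamma * rho_x)) + (1 - math.exp(-gamma * rho_y))

    # Shape cost: penalizes mismatched widths and heights.
    omega_w = abs(pw - tw) / max(pw, tw)
    omega_h = abs(ph - th) / max(ph, th)
    shape = (1 - math.exp(-omega_w)) ** theta + (1 - math.exp(-omega_h)) ** theta

    return 1 - iou + (dist + shape) / 2

print(siou_loss((0, 0, 4, 4), (1, 1, 5, 5)))
```

The angle cost lowers the distance penalty once the centers are near 45 degrees, which in the SIoU formulation encourages the predicted box to first align with an axis and then close the remaining distance.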
Posture-guided part learning for fine-grained image categorization
IF 1.1 | CAS Q4 | Computer Science
Journal of Electronic Imaging | Pub Date: 2024-05-01 | DOI: 10.1117/1.jei.33.3.033013
Wei Song, Dongmei Chen
Abstract: The challenge in fine-grained image classification lies in distinguishing subtle differences among fine-grained images. Existing image classification methods often explore information only in isolated regions, without considering the relationships among these parts, which yields incomplete information and a tendency to focus on individual parts. Posture information is hidden among these parts, so it plays a crucial role in differentiating similar categories. We therefore propose a posture-guided part learning framework capable of extracting the posture information hidden among regions. In this framework, the dual-branch feature enhancement module (DBFEM) highlights discriminative information related to fine-grained objects by extracting attention information between the feature space and channels. The part selection module selects multiple discriminative parts based on the attention information from the DBFEM. Building upon this, the posture feature fusion module extracts semantic features from the discriminative parts and constructs posture features among different parts from these semantic features. Finally, fusing the part semantic features with the posture features yields a comprehensive representation of fine-grained object features, aiding the differentiation of similar categories. Extensive evaluations on three benchmark datasets demonstrate the competitiveness of the proposed framework compared with state-of-the-art methods.
Citations: 0
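As a rough illustration of attention-guided part selection of the kind the part selection module performs, the sketch below picks the k strongest responses of an attention map and cuts patch boxes around them. The function name, patch size, and overlap-suppression rule are hypothetical, not taken from the paper.

```python
import torch

def select_parts(attn, k=4, patch=7):
    """Pick the k highest-response locations of a (H, W) attention map
    and return patch boxes around them, suppressing nearby duplicates."""
    H, W = attn.shape
    a = attn.clone()
    boxes = []
    for _ in range(k):
        idx = torch.argmax(a)               # flattened index of the peak
        y, x = divmod(idx.item(), W)
        y0, x0 = max(0, y - patch // 2), max(0, x - patch // 2)
        y1, x1 = min(H, y0 + patch), min(W, x0 + patch)
        boxes.append((y0, x0, y1, x1))
        a[y0:y1, x0:x1] = float("-inf")     # suppress overlapping picks
    return boxes

print(select_parts(torch.rand(14, 14)))
```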
Efficient and expressive high-resolution image synthesis via variational autoencoder-enriched transformers with sparse attention mechanisms
IF 1.1 | CAS Q4 | Computer Science
Journal of Electronic Imaging | Pub Date: 2024-05-01 | DOI: 10.1117/1.jei.33.3.033002
Bingyin Tang, Fan Feng
Abstract: We introduce a method for efficient and expressive high-resolution image synthesis, harnessing the power of variational autoencoders (VAEs) and transformers with sparse attention (SA) mechanisms. Using VAEs, we establish a context-rich vocabulary of image constituents, capturing intricate image features more effectively than traditional techniques. We then employ SA mechanisms within our transformer model, improving computational efficiency when dealing with the long sequences inherent to high-resolution images. Extending beyond traditional conditional synthesis, our model integrates both nonspatial and spatial information and incorporates temporal dynamics, enabling sequential image synthesis. Through rigorous experiments, we demonstrate the method's effectiveness in semantically guided synthesis of megapixel images. Our findings substantiate this method as a significant contribution to the field of high-resolution image synthesis.
Citations: 0
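Sparse attention can take many forms, and the abstract does not specify the pattern used here. A minimal sketch of one common variant, fixed non-overlapping local windows, is shown below: each query attends only to keys in its own window, cutting the cost from O(n^2) to roughly O(n * window).

```python
import torch
import torch.nn.functional as F

def windowed_attention(q, k, v, window=64):
    """Local-window sparse attention over (n, d) query/key/value
    matrices: queries attend only within their own window."""
    n, d = q.shape
    out = torch.empty_like(v)
    for start in range(0, n, window):
        end = min(start + window, n)
        # Scaled dot-product scores restricted to the window.
        scores = q[start:end] @ k[start:end].T / d ** 0.5
        out[start:end] = F.softmax(scores, dim=-1) @ v[start:end]
    return out

out = windowed_attention(torch.randn(256, 32), torch.randn(256, 32), torch.randn(256, 32))
print(out.shape)
```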
Test-time adaptation via self-training with future information
IF 1.1 | CAS Q4 | Computer Science
Journal of Electronic Imaging | Pub Date: 2024-05-01 | DOI: 10.1117/1.jei.33.3.033012
Xin Wen, Hao Shen, Zhongqiu Zhao
Abstract: Test-time adaptation (TTA) aims to address potential differences in data distribution between the training and testing phases by modifying a pretrained model based on each specific test sample. This is especially crucial for deep learning models, which often encounter frequent changes in the testing environment. Popular TTA methods currently rely primarily on pseudo-labels (PLs) as supervision signals and fine-tune the model through backpropagation, so the success of adaptation depends directly on the quality of the PLs: high-quality PLs enhance performance, whereas low-quality ones may lead to poor adaptation. Intuitively, if the PLs the model predicts for a sample remain consistent in both the current and future states, the prediction carries higher confidence, and using such consistent PLs as supervision signals greatly benefits long-term adaptation. Nevertheless, this approach may induce overconfidence in the model's predictions, so we introduce a regularization term that penalizes overly confident predictions. The proposed method is highly versatile and can be seamlessly integrated with various TTA strategies, making it very practical. We investigate different TTA methods on three widely used datasets (CIFAR10C, CIFAR100C, and ImageNetC) under different scenarios and show that our method achieves competitive or state-of-the-art accuracy on all of them.
Citations: 0
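As a hedged sketch of the idea described above: the loop below keeps only pseudo-labels that agree across two stochastic forward passes (a stand-in for the paper's current/future consistency check) and adds a negative-entropy term that penalizes overconfident predictions. The threshold, the consistency proxy, and the weighting are illustrative, not the paper's values.

```python
import torch
import torch.nn.functional as F

def tta_step(model, optimizer, x, tau=0.9, lam=0.1):
    """One TTA update on a test batch x, supervised by consistent,
    confident pseudo-labels with an anti-overconfidence penalty."""
    p_now = F.softmax(model(x), dim=1)
    p_alt = F.softmax(model(x), dim=1)   # differs if, e.g., dropout is active
    pl_now, pl_alt = p_now.argmax(1), p_alt.argmax(1)
    # Keep samples whose PLs are consistent and sufficiently confident.
    keep = (pl_now == pl_alt) & (p_now.max(1).values > tau)
    if keep.any():
        ce = F.cross_entropy(model(x[keep]), pl_now[keep])
        # Negative entropy: minimizing it pushes predictions away from
        # overconfidence (toward higher entropy).
        neg_entropy = (p_now[keep] * p_now[keep].clamp_min(1e-8).log()).sum(1).mean()
        loss = ce + lam * neg_entropy
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```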
Motion trajectory reconstruction degree: a key frame selection criterion for surveillance video
IF 1.1 | CAS Q4 | Computer Science
Journal of Electronic Imaging | Pub Date: 2024-05-01 | DOI: 10.1117/1.jei.33.3.033009
Yunzuo Zhang, Yameng Liu, Jiayu Zhang, Shasha Zhang, Shuangshuang Wang, Yu Cheng
Abstract: Key frame extraction focuses on capturing changes in motion state from surveillance videos and treating them as crucial content. However, existing key frame evaluation indicators cannot accurately assess whether an algorithm captures such changes. We therefore assess key frame extraction methods from the viewpoint of target trajectory reconstruction and put forth the motion trajectory reconstruction degree (MTRD), a key frame selection criterion based on preserving the target's global and local motion information. This evaluation indicator first extracts key frames with various key frame extraction methods and reconstructs the motion trajectory from these key frames using linear interpolation. The original target motion trajectories are then quantified and compared with the reconstructed set of motion trajectories. The smaller the MTRD discrepancy, the better the trajectory overlap, and the more accurately the key frames extracted by the method describe the video content. Finally, inspired by the novel MTRD criterion, we develop an MTRD-oriented key frame extraction method for surveillance video. Simulation results demonstrate that MTRD captures variations in global and local motion states more accurately and is more compatible with human visual perception.
Citations: 0
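The MTRD computation lends itself to a compact sketch: reconstruct the trajectory from the key frames by linear interpolation and measure the mean deviation from the original trajectory. The exact normalization the paper uses may differ; the discrepancy shown here is a plain per-frame Euclidean mean.

```python
import numpy as np

def mtrd(trajectory, key_idx):
    """Mean deviation between a (T, D) trajectory and its linear
    reconstruction from the frames listed in key_idx (smaller = better)."""
    t = np.arange(len(trajectory))
    key_idx = np.asarray(sorted(key_idx))
    recon = np.column_stack([
        np.interp(t, key_idx, trajectory[key_idx, d])
        for d in range(trajectory.shape[1])
    ])
    return np.linalg.norm(trajectory - recon, axis=1).mean()

# Example: a 100-frame (x, y) trajectory reconstructed from 5 key frames.
traj = np.column_stack([np.linspace(0, 9, 100), np.sin(np.linspace(0, 3, 100))])
print(mtrd(traj, [0, 30, 55, 80, 99]))
```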
HMNNet: research on exposure-based nighttime semantic segmentation
IF 1.1 | CAS Q4 | Computer Science
Journal of Electronic Imaging | Pub Date: 2024-05-01 | DOI: 10.1117/1.jei.33.3.033015
Yang Yang, Changjiang Liu, Hao Li, Chuan Liu
Abstract: In recent years, various segmentation models have been developed. However, due to the limited availability of nighttime datasets and the complexity of nighttime scenes, high-performance nighttime semantic segmentation models remain scarce. Analysis of nighttime scenes reveals that the primary challenges are overexposure and underexposure. In view of this, our proposed Histogram Multi-scale Retinex with Color Restoration and No-Exposure Semantic Segmentation Network (HMNNet) performs semantic segmentation of nighttime scenes and consists of three modules and a multi-head decoder. The three modules are Histogram, Multi-Scale Retinex with Color Restoration (MSRCR), and No Exposure (N-EX); they aim to enhance the robustness of image segmentation under different lighting conditions. The Histogram module prevents over-fitting to well-lit images; the MSRCR module enhances insufficiently lit images, improving object recognition and facilitating segmentation; and the N-EX module uses a dark channel prior to remove excess light covering the surface of an object. Extensive experiments show that the three modules suit different network models, can be inserted freely, and significantly improve segmentation of nighttime images while generalizing well. When added to the multi-head decoder network, mean intersection over union increases by 6.2% on the nighttime dataset Rebecca and by 1.5% on the daytime dataset CamVid.
Citations: 0
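The dark channel prior used by the N-EX module has a standard form: a per-pixel channel minimum followed by a local minimum filter (originally proposed for haze removal). A minimal sketch follows; the patch size is an assumption, and how HMNNet applies the resulting map to remove excess light is not shown here.

```python
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(img, patch=15):
    """Dark channel of an (H, W, 3) image in [0, 1]: channel-wise
    minimum per pixel, then a patch-wise minimum filter."""
    per_pixel_min = img.min(axis=2)
    return minimum_filter(per_pixel_min, size=patch)

dc = dark_channel(np.random.rand(64, 64, 3))
print(dc.shape, dc.max())
```

Bright regions keep high dark-channel values, which is what makes the map useful for locating overexposed areas in a night scene.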
Deep metric learning method for open-set iris recognition
IF 1.1 | CAS Q4 | Computer Science
Journal of Electronic Imaging | Pub Date: 2024-05-01 | DOI: 10.1117/1.jei.33.3.033016
Guang Huo, Ruyuan Li, Jianlou Lou, Xiaolu Yu, Jiajun Wang, Xinlei He, Yue Wang
Abstract: Existing iris recognition methods offer excellent performance on known classes but perform poorly when faced with unknown classes; the process of identifying unknown classes is referred to as open-set recognition. To improve the robustness of iris recognition systems, this work integrates hash centers into a deep metric learning method for open-set iris recognition, called central similarity based deep hash. It first maps each iris category to a defined hash center using a hash center generation algorithm. OiNet is then trained so that each iris texture clusters around its corresponding hash center. At test time, cosine similarity is calculated for each pair of iris textures to estimate their similarity. In experiments on public datasets, with performance evaluated both within a dataset and across datasets, our method demonstrates substantial advantages over other open-set iris recognition algorithms.
Citations: 0
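A sketch of the open-set decision step described above: compare a probe embedding against enrolled templates by cosine similarity and reject below-threshold probes as unknown. The threshold and function names are illustrative, and the hash-center training of OiNet itself is not reproduced here.

```python
import numpy as np

def open_set_match(embedding, gallery, threshold=0.8):
    """Return (best_index, similarity) for a probe embedding against an
    (N, d) gallery, or (None, similarity) if no match clears the threshold."""
    e = embedding / np.linalg.norm(embedding)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    sims = g @ e                      # cosine similarity to each template
    best = int(np.argmax(sims))
    if sims[best] >= threshold:
        return best, float(sims[best])
    return None, float(sims[best])    # rejected as an unknown class

probe, gallery = np.random.rand(128), np.random.rand(10, 128)
print(open_set_match(probe, gallery))
```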
KT-NeRF: multi-view anti-motion blur neural radiance fields
IF 1.1 | CAS Q4 | Computer Science
Journal of Electronic Imaging | Pub Date: 2024-05-01 | DOI: 10.1117/1.jei.33.3.033006
Yining Wang, Jinyi Zhang, Yuxi Jiang
Abstract: In three-dimensional (3D) reconstruction, neural radiance fields (NeRF) can implicitly represent high-quality 3D scenes. However, traditional NeRF places very high demands on the quality of the input images: with motion-blurred inputs, NeRF's multi-view consistency requirement cannot be met, which significantly degrades reconstruction quality. To address this problem, we propose KT-NeRF, which extends NeRF to motion-blurred scenes. Starting from the principle of motion blur, the method is derived from two-dimensional (2D) motion-blurred images to 3D space. A Gaussian process regression model is then introduced to estimate the camera's motion trajectory for each motion-blurred image, with the aim of learning accurate camera poses at key time stamps within the exposure time. The camera poses at the key time stamps are fed to the NeRF so that it can learn the blur information embedded in the images. Finally, the parameters of the Gaussian process regression model and the NeRF are jointly optimized to achieve multi-view anti-motion blur. Experiments show that KT-NeRF achieves a peak signal-to-noise ratio of 29.4 and a structural similarity index of 0.85, increases of 3.5% and 2.4%, respectively, over existing advanced methods; the learned perceptual image patch similarity is also reduced by 7.1% to 0.13.
Citations: 0
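Gaussian process regression over a camera trajectory can be sketched compactly: condition an RBF-kernel GP on poses at a few known time stamps and predict positions at the key time stamps inside the exposure window. The kernel choice and hyperparameters below are assumptions; KT-NeRF additionally optimizes the GP jointly with the NeRF, which is omitted here.

```python
import numpy as np

def gp_predict(t_train, y_train, t_query, length=0.1, noise=1e-4):
    """GP posterior mean: given (n,) times and (n, d) poses, predict
    poses at t_query using an RBF kernel with a small noise term."""
    def rbf(a, b):
        return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)
    K = rbf(t_train, t_train) + noise * np.eye(len(t_train))
    return rbf(t_query, t_train) @ np.linalg.solve(K, y_train)

# Example: camera positions sampled at five time stamps in [0, 1],
# queried at two key time stamps inside the exposure window.
t = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
poses = np.column_stack([np.sin(t), np.cos(t), t])
print(gp_predict(t, poses, np.array([0.1, 0.6])))
```

A GP is a natural fit here because camera motion during a single exposure is smooth, so a smooth kernel interpolates the in-exposure poses plausibly from sparse anchors.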
Three-dimensional shape estimation of wires from three-dimensional X-ray computed tomography images of electrical cables
IF 1.1 | CAS Q4 | Computer Science
Journal of Electronic Imaging | Pub Date: 2024-04-01 | DOI: 10.1117/1.jei.33.3.031209
Shiori Ueda, Kanon Sato, Hideo Saito, Yutaka Hoshina
Abstract: Electrical cables consist of numerous wires whose three-dimensional (3D) shape significantly impacts the cables' overall properties, such as bending stiffness. Although X-ray computed tomography (CT) provides a non-destructive way to assess these properties, accurately determining the 3D shape of individual wires from CT images is challenging due to the large number of wires, low image resolution, and the indistinguishable appearance of the wires. Previous research lacked quantitative evaluation of wire tracking, and its overall accuracy relied heavily on the accuracy of wire detection. In this study, we present a long short-term memory (LSTM)-based approach to wire tracking that improves robustness against detection errors: the method predicts wire positions in subsequent frames from previous frames. We evaluate its performance using both actual annotated cables and artificially noised annotations. Our method exhibits greater tracking accuracy and robustness to detection errors than the previous method.
Citations: 0
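The LSTM-based tracking idea admits a small sketch: feed a wire's positions in the previous CT slices to an LSTM and regress its position in the next slice, which can then be matched to the nearest detection. The layer sizes and the two-coordinate input below are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class WireTracker(nn.Module):
    """Predict a wire's (x, y) position in the next CT slice from its
    positions in the previous slices."""

    def __init__(self, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=2, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)

    def forward(self, prev_positions):          # (batch, frames, 2)
        out, _ = self.lstm(prev_positions)
        return self.head(out[:, -1])            # (batch, 2) next-slice position

# Usage: predict from the last 8 slices, then match the prediction to
# the nearest detection in the new slice.
model = WireTracker()
pred = model(torch.randn(1, 8, 2))
print(pred.shape)
```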
Flexible machine/deep learning microservice architecture for industrial vision-based quality control on a low-cost device
IF 1.1 | CAS Q4 | Computer Science
Journal of Electronic Imaging | Pub Date: 2024-03-01 | DOI: 10.1117/1.jei.33.3.031208
Stefano Toigo, Brendon Kasi, Daniele Fornasier, Angelo Cenedese
Abstract: This paper delineates a comprehensive method that integrates machine vision and deep learning for quality control in an industrial setting. The proposed approach leverages a microservice architecture that ensures adaptability and flexibility to different scenarios while focusing on affordable, compact hardware, and it achieves exceptionally high accuracy on the quality control task while keeping computation time minimal. The developed system operates entirely on a portable smart camera, eliminating the need for additional sensors such as photocells and for external computation, which simplifies the setup and commissioning phases and reduces the overall impact on the production line. By integrating the embedded system with the machinery, the approach offers real-time monitoring and analysis, facilitating swift detection of defects and deviations from desired standards. Moreover, the low cost of the solution makes it accessible to a wider range of manufacturing enterprises, democratizing quality processes in Industry 5.0. The system was successfully implemented and is fully operational in a real industrial environment, and the experimental results obtained from this implementation are presented in this work.
Citations: 0
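As a loose illustration of the microservice style described above, the sketch below exposes one inference service as a REST endpoint. The framework, endpoint name, and placeholder model are all assumptions; the paper does not specify its service interfaces.

```python
from fastapi import FastAPI, File, UploadFile

app = FastAPI()

def run_model(data: bytes) -> bool:
    # Placeholder for the on-camera inference step; a real deployment
    # would decode the frame and run the trained defect classifier.
    return len(data) > 0

@app.post("/inspect")
async def inspect(image: UploadFile = File(...)):
    """Hypothetical endpoint: receive a frame, return a pass/fail verdict."""
    verdict = run_model(await image.read())
    return {"pass": verdict}
```

Keeping each function behind its own small HTTP endpoint is what gives this kind of architecture its flexibility: services can be swapped or retrained independently of the rest of the line.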