{"title":"Squeeze-and-excitation attention and bi-directional feature pyramid network for filter screens surface detection","authors":"Junpeng Xu, Xiangbo Zhu, Lei Shi, Jin Li, Ziman Guo","doi":"10.1117/1.jei.33.4.043044","DOIUrl":"https://doi.org/10.1117/1.jei.33.4.043044","url":null,"abstract":"Based on the enhanced YOLOv5, a deep learning defect detection technique is presented to deal with the problem of inadequate effectiveness in manually detecting problems on the surface of filter screens. In the last layer of the backbone network, the method combines the squeeze-and-excitation attention mechanism module, the method assigns weights to image locations based on the channel domain perspective to obtain more feature information. It also compares the results with a simple, parameter-free attention model (SimAM), which is an attention mechanism without the channel domain, and the results are higher than SimAM 0.7%. In addition, the neck network replaces the basic PANet structure with the bi-directional feature pyramid network module, which introduces multi-scale feature fusion. The experimental results show that the improved YOLOv5 algorithm has an average defect detection accuracy of 97.7% on the dataset, which is 11.3%, 12.8%, 2%, 7.8%, 5.1%, and 1.3% higher than YOLOv3, faster R-CNN, YOLOv5, SSD, YOLOv7, and YOLOv8, respectively. It can quickly and accurately identify various defects on the surface of the filter, which has an outstanding contribution to the filter manufacturing industry.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"12 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142202451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fusion 3D object tracking method based on region and point cloud registration","authors":"Yixin Jin, Jiawei Zhang, Yinhua Liu, Wei Mo, Hua Chen","doi":"10.1117/1.jei.33.4.043048","DOIUrl":"https://doi.org/10.1117/1.jei.33.4.043048","url":null,"abstract":"Tracking rigid objects in three-dimensional (3D) space and 6DoF pose estimating are essential tasks in the field of computer vision. In general, the region-based 3D tracking methods have emerged as the optimal solution for weakly textured objects tracking within intricate scenes in recent years. However, tracking robustness in situations such as partial occlusion and similarly colored backgrounds is relatively poor. To address this issue, an improved region-based tracking method is proposed for achieving accurate 3D object tracking in the presence of partial occlusion and similarly colored backgrounds. First, a regional cost function based on the correspondence line is adopted, and a step function is proposed to alleviate the misclassification of sampling points in scenes. Afterward, in order to reduce the influence of similarly colored background and partial occlusion on the tracking performance, a weight function that fuses color and distance information of the object contour is proposed. Finally, the transformation matrix of the inter-frame motion obtained by the above region-based tracking method is used to initialize the model point cloud, and an improved point cloud registration method is adopted to achieve accurate registration between the model point cloud and the object point cloud to further realize accurate object tracking. The experiments are conducted on the region-based object tracking (RBOT) dataset and the real scenes, respectively. The results demonstrate that the proposed method outperforms the state-of-the-art region-based 3D object tracking method. On the RBOT dataset, the average tracking success rate is improved by 0.5% across five image sequences. In addition, in real scenes with similarly colored backgrounds and partial occlusion, the average tracking accuracy is improved by 0.28 and 0.26 mm, respectively.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"8 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142202442","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spatio-temporal enhancement method based on dense connection structure for compressed video","authors":"Hongyao Li, Xiaohai He, Xiaodong Bi, Shuhua Xiong, Honggang Chen","doi":"10.1117/1.jei.33.4.043054","DOIUrl":"https://doi.org/10.1117/1.jei.33.4.043054","url":null,"abstract":"Under limited bandwidth conditions, video transmission often employs lossy compression to reduce the data volume, inevitably introducing compression noise. Quality enhancement of compressed videos can effectively recover the information loss incurred during the compression process. Currently, multi-frame quality enhancement of compressed videos has shown performance advantages compared to single-frame methods, as it utilizes the temporal correlation of videos. Methods based on deformable convolutions obtain spatio-temporal fusion features for reconstruction through multi-frame alignment. However, due to the limited utilization of deep information and sensitivity to alignment accuracy, these methods yield suboptimal results, especially in scenarios with scene changes and intense motion. To overcome these limitations, we propose a dense network-based quality enhancement method to obtain more accurate spatio-temporal fusion features. Specifically, the deep spatial features are first extracted from the to-be-enhanced frames using dense connections, then combined with the aligned features obtained from deformable convolution through the convolution and attention mechanism to make the network more attentive to useful branches in an adaptive way, and finally, the enhanced frames are obtained through the quality enhancement module of the dense connection structure. The experimental results show that when the quantization parameter is 37, the proposed method can improve the average peak signal-to-noise ratio by 0.99 dB in the lowdelay_P configuration.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"20 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142202443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast and robust object region segmentation with self-organized lattice Boltzmann based active contour method","authors":"Fatema A. Albalooshi, Vijayan K. Asari","doi":"10.1117/1.jei.33.4.043050","DOIUrl":"https://doi.org/10.1117/1.jei.33.4.043050","url":null,"abstract":"We propose an approach leveraging the power of self-organizing maps (SOMs) in conjunction with a multiscale local image fitting (LIF) level-set function to enhance the capabilities of the region-based active contour model (ACM). In addition, we employ the lattice Boltzmann method (LBM) to ensure efficient convergence during the segmentation process. The SOM learns the underlying patterns and structures of both the background region and the object of interest region in an image, allowing for more accurate and robust segmentation results. Our multiscale LIF level-set approach influences image-specific fitting criteria into the energy functional, considering the features extracted by the SOM. Finally, the LBM is utilized to solve the level set equation and evolve the contour, allowing for a faster contour evolution. To evaluate the effectiveness of our approach, we performed our experiments on the challenging Pascal Visual Object Classes Challenge 2012 dataset. This dataset consists of images containing objects with diverse characteristics, such as illumination variations, shadows, occlusions, scale changes, and cluttered backgrounds. Our experimental results highlight the efficiency and robustness of our proposed method in achieving accurate segmentation. In terms of accuracy, our approach outperforms state-of-the-art learning-based ACMs, reaching a precision value of up to 93%. Moreover, our approach also demonstrates improvements in terms of computation time, leading to a reduction in computational time of 76% compared with the state-of-the-art methods. By integrating SOMs and the LBM, we enhance the efficiency of the segmentation process. This enables us to achieve accurate segmentation within reasonable time frames, making our method practical for real-world applications. Furthermore, we conducted experiments on medical imagery and thermal imagery, which yielded precise results.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"7 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142202448","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DeepLab-Rail: semantic segmentation network for railway scenes based on encoder-decoder structure","authors":"Qingsong Zeng, Linxuan Zhang, Yuan Wang, Xiaolong Luo, Yannan Chen","doi":"10.1117/1.jei.33.4.043038","DOIUrl":"https://doi.org/10.1117/1.jei.33.4.043038","url":null,"abstract":"Understanding the perimeter objects and environment changes in railway scenes is crucial for ensuring the safety of train operation. Semantic segmentation is the basis of intelligent perception and scene understanding. Railway scene categories are complex and effective features are challenging to extract. This work proposes a semantic segmentation network DeepLab-Rail based on classic yet effective encoder-decoder structure. It contains a lightweight feature extraction backbone embedded with channel attention (CA) mechanism to keep computational complexity low. To enrich the receptive fields of convolutional modules, we design a parallel and cascade convolution module called compound-atrous spatial pyramid pooling and a combination of dilated convolution ratio is selected through experiments to obtain multi-scale features. To fully use the shallow features and the high-level features, efficient CA mechanism is introduced and also the mixed loss function is designed for the problem of unbalanced label categories of the dataset. Finally, the experimental results on the RailSem19 railway dataset show that the mean intersection over union reaches 65.52% and the PA reaches 88.48%. The segmentation performance of railway confusing facilities, such as signal lights and catenary pillars, has been significantly improved and surpasses other advanced methods to our best knowledge.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"44 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141968924","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Settlement detection from satellite imagery using fully convolutional network","authors":"Tayaba Anjum, Ahsan Ali, Muhammad Tahir Naseem","doi":"10.1117/1.jei.33.4.043056","DOIUrl":"https://doi.org/10.1117/1.jei.33.4.043056","url":null,"abstract":"Geospatial information is essential for development planning, like in the context of land and resource management. Existing research mainly focuses on multi-spectral or panchromatic images with specific sensor details. Incorporating multi-sensory panchromatic images at different scales makes the segmentation problem challenging. In this work, we propose a pixel-based globally trained model with a deep learning network to improve the segmentation results over existing patch-based networks. The proposed model consists of the encoder-decoder mechanism for semantic segmentation. Convolution and pooling layers are used at the encoding phase and transposed convolution and convolution layers are used for the decoding phase. Experiments show about 98.95% correct detection rate and 0.07% false detection rate of our proposed methodology on benchmark images. We prove the effectiveness of the proposed methodology by doing comparisons with previous work.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"37 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142202436","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Coded target recognition algorithm for vision measurement","authors":"Peng Zhang, Qing Liu, Shengpeng Li, Fei Liu, Wenjing Liu","doi":"10.1117/1.jei.33.4.043058","DOIUrl":"https://doi.org/10.1117/1.jei.33.4.043058","url":null,"abstract":"Circularly coded targets are widely used in 3D measurement, target tracking, augmented reality, and other fields as feature points to be measured. The traditional coded target recognition algorithm is easily affected by illumination changes and excessive shooting angles, and the recognition accuracy is significantly reduced. Therefore, a new coded target recognition algorithm is required to reduce the effects of illumination and angle on the recognition process. The influence of illumination on the recognition of coding targets was analyzed in depth, and the advantages and disadvantages of traditional algorithms are discussed. A new adaptive threshold image segmentation method was designed, which, in contrast to traditional algorithms, incorporates the feature information of coding targets in the determination of the image segmentation threshold. The experimental results show that this method significantly reduces the influence of illumination variations and cluttered backgrounds on image segmentation. Similarly, the influence of different angles on the recognition process of coding targets was studied. The coding target is decoded by radial sampling of the dense point network, which can effectively reduce the influence of angle on the recognition process and improve the recognition accuracy of coding targets and the robustness of the algorithm. In addition, further experiments verified that the proposed detection and recognition algorithm can better extract and identify with high positioning accuracy and decoding success rate. It can achieve accurate positioning even in complex environments and meet the needs of industrial measurements.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"11 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142202437","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep inner-knuckle-print recognition using lightweight Siamese network","authors":"Hongxia Wang, Hongwu Yuan","doi":"10.1117/1.jei.33.4.043034","DOIUrl":"https://doi.org/10.1117/1.jei.33.4.043034","url":null,"abstract":"Texture features and stability have attracted much attention in the field of biometric recognition. The inner-knuckle print is unique and not easy to forge, so it is widely used in personal identity authentication, criminal detection, and other fields. In recent years, the rapid development of deep learning technology has brought new opportunities for internal-knuckle recognition. We propose a deep inner-knuckle print recognition method named LSKNet network. By establishing a lightweight Siamese network model and combining it with a robust cost function, we can realize efficient and accurate recognition of the inner-knuckle print. Compared to traditional methods and other deep learning methods, the network has lower model complexity and computational resource requirements, which enables it to run under lower hardware configurations. In addition, this paper also uses all the knuckle prints of four fingers for concatenated fusion recognition. Experimental results demonstrate that this method has achieved satisfactory results in the task of internal-knuckle print recognition.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"15 12 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141885521","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fine-tuned Siamese neural network–based multimodal vein biometric system with hybrid firefly–particle swarm optimization","authors":"Gurunathan Velliangiri, Sudhakar Radhakrishnan","doi":"10.1117/1.jei.33.4.043035","DOIUrl":"https://doi.org/10.1117/1.jei.33.4.043035","url":null,"abstract":"Recent advancements in biometric recognition focus on vein pattern–based person authentication systems. We present a multimodal biometric system using dorsal and finger vein images. By combining Siamese neural networks (SNNs) with hybrid firefly–particle swarm optimization (FF-PSO), we optimize finger and dorsal vein identification and classification. Using FF-PSO to tune SNN parameters is an innovative hybrid optimization approach designed to address the complexities of vein pattern recognition. The proposed system is tested with two public databases: the SDUMLA-HMT finger vein dataset and the Dr. Badawi hand vein dataset. The efficacy of the method is assessed using performance measures such as recall, accuracy, precision, F1 score, false acceptance rate, false rejection rate, and equal error rate. The experimental findings demonstrate that the proposed system achieves an accuracy of 99.5% with the fine-tune SNN and FF-PSO techniques and preprocessing module. The proposed system is also compared with various existing state-of-the-art techniques.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"42 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141885653","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint merging and pruning: adaptive selection of better token compression strategy","authors":"Wei Peng, Liancheng Zeng, Lizhuo Zhang, Yue Shen","doi":"10.1117/1.jei.33.4.043045","DOIUrl":"https://doi.org/10.1117/1.jei.33.4.043045","url":null,"abstract":"Vision transformer (ViT) is widely used to handle artificial intelligence tasks, making significant advances in a variety of computer vision tasks. However, due to the secondary interaction between tokens, the ViT model is inefficient, which greatly limits the application of the ViT model in real scenarios. In recent years, people have noticed that not all tokens contribute equally to the final prediction of the model, so token compression methods have been proposed, which are mainly divided into token pruning and token merging. Yet, we believe that neither pruning only to reduce non-critical tokens nor merging to reduce similar tokens are optimal strategies for token compression. To overcome this challenge, this work proposes a token compression framework: joint merging and pruning (JMP), which adaptively selects a better token compression strategy based on the similarity between critical tokens and non-critical tokens in each sample. JMP effectively reduces computational complexity while maintaining model performance and does not require the introduction of additional trainable parameters, achieving a good balance between efficiency and performance. Taking DeiT-S as an example, JMP reduces floating point operations by 35% and increases throughput by more than 45% while only decreasing accuracy by 0.2% on ImageNet.","PeriodicalId":54843,"journal":{"name":"Journal of Electronic Imaging","volume":"405 1","pages":""},"PeriodicalIF":1.1,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142202449","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}