Signal Processing-Image Communication最新文献_第8页

Camera calibration using property of asymptotes with application to sports scenes 摄像机的渐近线标定及其在运动场景中的应用

IF 3.4 3区工程技术

Signal Processing-Image Communication Pub Date : 2025-04-12 DOI: 10.1016/j.image.2025.117331

Fengli Yang, Xuechun Wang, Yue Zhao

引用次数: 0

Hidden dangerous object detection for terahertz body security check images based on adaptive multi-scale decomposition convolution 基于自适应多尺度分解卷积的太赫兹人体安检图像危险目标检测

IF 3.4 3区工程技术

Signal Processing-Image Communication Pub Date : 2025-04-10 DOI: 10.1016/j.image.2025.117323

Zijie Guo , Heng Wu , Shaojuan Luo , Genping Zhao , Chunhua He , Tao Wang

{"title":"Hidden dangerous object detection for terahertz body security check images based on adaptive multi-scale decomposition convolution","authors":"Zijie Guo , Heng Wu , Shaojuan Luo , Genping Zhao , Chunhua He , Tao Wang","doi":"10.1016/j.image.2025.117323","DOIUrl":"10.1016/j.image.2025.117323","url":null,"abstract":"<div><div>Recently, detecting hidden dangerous objects with the terahertz technique has attracted extensive attention. Many convolutional neural network-based object detection methods can achieve excellent results in common object detection. However, the existing object detection methods generally have low detection accuracy and large model parameter issues for hidden dangerous objects in terahertz body security check images due to the blurring and poor quality of terahertz images and ignoring the global context information. To address these issues, we propose an enhanced You Only Look Once network (YOLO-AMDC), which is integrated with an adaptive multi-scale large-kernel decomposition convolution (AMDC) module. Specifically, we design an AMDC module to enhance the feature expression ability of the YOLO framework. Moreover, we develop the Bi-Level Routing Attention (BRA) mechanism and a simple parameter-free attention module (SimAM) to integrate and utilize contextual information to improve the performance of dangerous object detection. Additionally, we adopt a model pruning approach to reduce the number of model parameters. The experimental results show that YOLO-AMDC outperforms other state-of-the-art methods. Compared with YOLOv8s, YOLO-AMDC reduces the parameters by 3.9 M and improves mAP@50 by 5 %. The detection performance is still competitive when the number of parameters is significantly reduced by model pruning.</div></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"137 ","pages":"Article 117323"},"PeriodicalIF":3.4,"publicationDate":"2025-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143834656","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

ORB-SLAM3 and dense mapping algorithm based on improved feature matching 基于改进特征匹配的ORB-SLAM3和密集映射算法

IF 3.4 3区工程技术

Signal Processing-Image Communication Pub Date : 2025-04-10 DOI: 10.1016/j.image.2025.117322

Delin Zhang , Guangxiang Yang , Guangling Yu , Baofeng Yang , Xiaoheng Wang

{"title":"ORB-SLAM3 and dense mapping algorithm based on improved feature matching","authors":"Delin Zhang , Guangxiang Yang , Guangling Yu , Baofeng Yang , Xiaoheng Wang","doi":"10.1016/j.image.2025.117322","DOIUrl":"10.1016/j.image.2025.117322","url":null,"abstract":"<div><div>ORB-SLAM3 is currently the mainstream visual SLAM system, which uses feature matching based on ORB keypoints. However, ORB-SLAM3 faces two main issues: Firstly, feature matching is time-consuming, and the insufficient number of feature point matches results in lower algorithmic localization accuracy. Secondly, it lacks the capability to construct dense point cloud maps, therefore limiting its applicability in high-demand scenarios such as path planning. To address these issues, this paper proposes an ORB-SLAM3 and dense mapping algorithm based on improved feature matching. In the feature matching process of ORB-SLAM3, motion smoothness constraints are introduced and the image is gridded. The feature points that are at the edge of the grid are divided into multiple adjacent grids to solve the problems, which are unable to correctly partition the feature points to the corresponding grid and algorithm time consumption. This reduces matched time and increases the number of matched pairs, improving the positioning accuracy of ORB-SLAM3. Moreover, a dense mapping construction thread has been added to construct dense point cloud maps in real-time using keyframes and corresponding poses filtered from the feature matching stage. Finally, simulation experiments were conducted using the TUM dataset for validation. The results demonstrate that the improved algorithm reduced feature matching time by 75.71 % compared to ORB-SLAM3, increased the number of feature point matches by 88.69 %, and improved localization accuracy by 9.44 %. Furthermore, the validation confirmed that the improved algorithm is capable of constructing dense maps in real-time. In conclusion, the improved algorithm demonstrates excellent performance in terms of localization accuracy and dense mapping.</div></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"137 ","pages":"Article 117322"},"PeriodicalIF":3.4,"publicationDate":"2025-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143829426","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Camouflaged instance segmentation based on multi-scale feature contour fusion swin transformer 基于多尺度特征轮廓融合的旋转变压器伪装实例分割

IF 3.4 3区工程技术

Signal Processing-Image Communication Pub Date : 2025-04-09 DOI: 10.1016/j.image.2025.117328

Yin-Fu Huang, Feng-Yen Jen

引用次数: 0

Self-adaptive and learnable detail enhancement network for efficient image super resolution 自适应可学习的图像细节增强网络

IF 3.4 3区工程技术

Signal Processing-Image Communication Pub Date : 2025-04-04 DOI: 10.1016/j.image.2025.117319

Wenbo Zhang, Lulu Pan, Ke Xu, Guo Li, Yanheng Lv, Lingxiao Li, Le Lei

{"title":"Self-adaptive and learnable detail enhancement network for efficient image super resolution","authors":"Wenbo Zhang, Lulu Pan, Ke Xu, Guo Li, Yanheng Lv, Lingxiao Li, Le Lei","doi":"10.1016/j.image.2025.117319","DOIUrl":"10.1016/j.image.2025.117319","url":null,"abstract":"<div><div>In recent years, single image super-resolution (SISR) methods based on deep learning have advanced significantly. However, their high computational complexity and memory demands hinder deployment on resource-constrained devices. Although numerous lightweight super-resolution (SR) methods have been proposed to address this issue, most fail to distinguish between flat and detailed regions in images, treating them uniformly. This lack of targeted design for detailed regions, which are critical to SR performance, results in redundancy and inefficiency in existing lightweight methods. To address these challenges, we propose a simple yet effective network Self-adaptive and Learnable Detail Enhancement Network (LDEN) that specifically focuses on the reconstruction of detailed regions. Firstly, we present two designs for the reconstruction of detailed regions: (1) we design the Learnable Detail Extraction Block (LDEB), which can pay special attention to detailed regions and employ a larger convolution kernel in LDEB to obtain a larger receptive field; (2) we design a lightweight attention mechanism called Detail-oriented Spatial Attention (DSA) to enhance the network's ability to reconstruct detailed regions. Secondly, we design a hierarchical refinement mechanism named Efficient Hierarchical Refinement Block (EHRB) which can reduce the inadequate information extraction and integration caused by rough single-layer refinement. Extensive experiments demonstrate that LDEN achieves state-of-the-art performance on all benchmark datasets. Notably, for 4 × magnification tasks, LDEN outperforms BSRN - the champion of the model complexity track of NTIRE 2022 Efficient SR Challenge - by achieving gains of 0.11 dB and 0.12 dB while reducing parameters by nearly 10 %.</div></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"137 ","pages":"Article 117319"},"PeriodicalIF":3.4,"publicationDate":"2025-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143817018","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Multimodal style aggregation network for art image classification 基于多模态风格聚合网络的艺术图像分类

IF 3.4 3区工程技术

Signal Processing-Image Communication Pub Date : 2025-03-30 DOI: 10.1016/j.image.2025.117309

Quan Wang, Guorui Feng

{"title":"Multimodal style aggregation network for art image classification","authors":"Quan Wang, Guorui Feng","doi":"10.1016/j.image.2025.117309","DOIUrl":"10.1016/j.image.2025.117309","url":null,"abstract":"<div><div>A large number of paintings are digitized, the automatic recognition and retrieval of artistic image styles become very meaningful. Because there is no standard definition and quantitative description of characteristics of artistic style, the representation of style is still a difficult problem. Recently, some work have used deep correlation features in neural style transfer to describe the texture characteristics of paintings and have achieved exciting results. Inspired by this, this paper proposes a multimodal style aggregation network that incorporates three modalities of texture, structure and color information of artistic images. Specifically, the group-wise Gram aggregation model is proposed to capture multi-level texture styles. The global average pooling (GAP) and histogram operation are employed to perform distillation of the high-level structural style and the low-level color style, respectively. Moreover, an improved deep correlation feature calculation method called learnable Gram (L-Gram) is proposed to enhance the ability to express style. Experiments show that our method outperforms several state-of-the-art methods in five style datasets.</div></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"137 ","pages":"Article 117309"},"PeriodicalIF":3.4,"publicationDate":"2025-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143760469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Few-shot image generation based on meta-learning and generative adversarial network 基于元学习和生成对抗网络的少镜头图像生成

IF 3.4 3区工程技术

Signal Processing-Image Communication Pub Date : 2025-03-28 DOI: 10.1016/j.image.2025.117307

Bowen Gu, Junhai Zhai

{"title":"Few-shot image generation based on meta-learning and generative adversarial network","authors":"Bowen Gu, Junhai Zhai","doi":"10.1016/j.image.2025.117307","DOIUrl":"10.1016/j.image.2025.117307","url":null,"abstract":"<div><div>Generative adversarial network (GAN) learns the latent distribution of samples through the adversarial training between discriminator and generator, then uses the learned probability distribution to generate realistic samples. Training a vanilla GAN requires a large number of samples and a significant amount of time. However, in practical applications, obtaining a large dataset and dedicating extensive time to model training can be very costly. Training a GAN with a small number of samples to generate high-quality images is a pressing research problem. Although this area has seen limited exploration, FAML (Fast Adaptive Meta-Learning) stands out as a notable approach. However, FAML has the following shortcomings: (1) The training time on complex datasets, such as VGGFaces and MiniImageNet, is excessively long. (2) It exhibits poor generalization performance and produces low-quality images across different datasets. (3) The generated samples lack diversity. To address the three shortcomings, we improved FAML in two key areas: model structure and loss function. The improved model effectively overcomes all three limitations of FAML. We conducted extensive experiments on four datasets to compare our model with the baseline FAML across seven evaluation metrics. The results demonstrate that our model is both more efficient and effective, particularly on the two complex datasets, VGGFaces and MiniImageNet. Our model outperforms FAML on six of the seven evaluation metrics, with only a slight underperformance on one metric. Our code is available at <span><span>https://github.com/BTGWS/FSML-GAN</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"137 ","pages":"Article 117307"},"PeriodicalIF":3.4,"publicationDate":"2025-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143760468","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

OTPL: A novel measurement method of structural parallelism based on orientation transformation and geometric constraints OTPL：一种基于方位变换和几何约束的结构平行度测量方法

IF 3.4 3区工程技术

Signal Processing-Image Communication Pub Date : 2025-03-25 DOI: 10.1016/j.image.2025.117310

Weili Ding , Zhiyu Wang , Shuo Hu

{"title":"OTPL: A novel measurement method of structural parallelism based on orientation transformation and geometric constraints","authors":"Weili Ding , Zhiyu Wang , Shuo Hu","doi":"10.1016/j.image.2025.117310","DOIUrl":"10.1016/j.image.2025.117310","url":null,"abstract":"<div><div>Detecting parallel geometric structures from images is a significant step for computer vision tasks. In this paper, an algorithm called Orientation Transformation-based Parallelism Measurement (OTPL) is proposed in this paper to measure the parallelism of structures including both line structures and curve structures. The task is decomposed into measurements of parallel straight line and parallel curve structures due to the inherent geometric differences between them, where the parallelism between curve structures can be further transformed into a matching problem. For parallel straight lines, the angle constraints and the rate of overlapping projection are considered as the parallel relationship selection rules for the candidate lines. For the parallel curves, the approximate vertical growing (AVG) algorithm is proposed to accelerate the search of adjacent curves and each smooth curve is coded as a vector with different angle values. The matching pairs are extracted through cosine similarity transformation and convexity consistency. Finally, the parallel curves are extracted by a decision-making process. The proposed algorithm is evaluated in a comprehensive manner, encompassing both qualitative and quantitative approaches, with the objective of achieving a more robust assessment. The results demonstrate the algorithm's efficacy in identifying parallel structures in both synthetic and natural images.</div></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"137 ","pages":"Article 117310"},"PeriodicalIF":3.4,"publicationDate":"2025-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143760467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Bidirectional interactive multi-scale network using Wave-Conv ViT for single image deraining 基于波变换ViT的单幅图像双向交互多尺度网络

IF 3.4 3区工程技术

Signal Processing-Image Communication Pub Date : 2025-03-23 DOI: 10.1016/j.image.2025.117311

Siyan Fang, Bin Liu

{"title":"Bidirectional interactive multi-scale network using Wave-Conv ViT for single image deraining","authors":"Siyan Fang, Bin Liu","doi":"10.1016/j.image.2025.117311","DOIUrl":"10.1016/j.image.2025.117311","url":null,"abstract":"<div><div>To address the limitations of high-frequency information capture by Vision Transformer (ViT) and the loss of fine details in existing image deraining methods, we introduce a Bidirectional Interactive Multi-Scale Network (BIMNet) that employs newly developed Wave-Conv ViT (WCV). The WCV utilizes a wavelet transform to enable self-attention in both low-frequency and high-frequency domains, significantly enhancing ViT's capacity for diverse frequency-domain feature modeling. Additionally, by incorporating convolutional operations, WCV enhances the extraction and integration of local features across various spatial windows. BIMNet injects rainy images into deep network layers, enabling bidirectional propagation with shallow layer features that enrich skip connections with detailed and complementary information, thus improving the fidelity of detail recovery. Moreover, we present the CORain1000 dataset, tailored for the dual challenges of image deraining and object detection, which offers more diversity in rain patterns, image sizes, and volumes than the commonly used COCO350 dataset. Extensive experiments demonstrate the superiority of BIMNet over advanced methods. The code and CORain1000 dataset are available at <span><span>https://github.com/fashyon/BIMNet</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"137 ","pages":"Article 117311"},"PeriodicalIF":3.4,"publicationDate":"2025-03-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143792164","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Hybrid Feature Extraction and Knowledge Distillation Based Deep Learning Model for Human Activity Recognition System 基于混合特征提取和知识蒸馏的人体活动识别深度学习模型

IF 3.4 3区工程技术

Signal Processing-Image Communication Pub Date : 2025-03-21 DOI: 10.1016/j.image.2025.117308

Hetal Shah , Mehfuza S. Holia

引用次数: 0