{"title":"One-bit distributed compressed sensing with partial gaussian circulant matrices","authors":"Yuke Leng, Jingyao Hou, Xinling Liu, Jianjun Wang","doi":"10.1007/s10489-025-06599-8","DOIUrl":"10.1007/s10489-025-06599-8","url":null,"abstract":"<div><p>One-bit distributed compressed sensing has been widely used in multi-node networks and many other fields. Conventional approaches often employ random Gaussian measurement matrices, but these unstructured matrices demand significant memory and computational resources. To address this limitation, we propose the use of structured partial Gaussian circulant matrices. This kind of matrix facilitates faster matrix operations and permits low storage, making it more practical. To the best of our knowledge, we are the first to theoretically prove that these matrices satisfy the <span>(ell _1/ell _{2,1})</span>-RIP in one-bit distributed compressed sensing. We prove that the required number of measurements under partial Gaussian circulant measurements enjoys the same order with that of Gaussian, which, however, is more computational efficient. Furthermore, numerical experiments confirm that partial Gaussian circulant matrices and random Gaussian matrices exhibit comparable reconstruction performance. Additionally, partial Gaussian circulant matrices spend less recovery time, offering higher computational efficiency.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 10","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143908860","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Selective output smoothing regularization: Regularize neural networks by softening output distributions","authors":"Xuan Cheng, Tianshu Xie, Xiaomin Wang, Meiyi Yang, Jiali Deng, Minghui Liu, Ming Liu","doi":"10.1007/s10489-025-06539-6","DOIUrl":"10.1007/s10489-025-06539-6","url":null,"abstract":"<div><p>Convolutional neural networks (CNNs) often exhibit overfitting due to overconfident predictions, which limits the effective utilization of training samples. Inspired by the diverse effects of training from different samples, we propose selective output smoothing regularization(SOSR) that improves model performance by encouraging the generation of equal logits on incorrect classes when handling samples that are correctly and overconfidently classified. This plug-and-play approach integrates seamlessly into diverse CNN architectures without altering their core design. SOSR demonstrates consistent improvements on various benchmarks, such as a 1.1% accuracy gain on ImageNet with ResNet-50 (77.30%). It synergizes effectively with several widely used techniques, such as CutMix and label smoothing, achieving incremental benefits, highlighting its potential as a foundational tool in advancing deep learning applications. 
Overall, SOSR effectively alleviates underutilization of high-confidence samples, enhances the generalizability of CNNs, and emerges as a robust tool for improving deep learning applications.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 10","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10489-025-06539-6.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143908855","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
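The core idea of the abstract — selecting samples that are both correct and overconfident, then pushing their incorrect-class logits toward equal values — can be sketched as a penalty term. This is our illustrative reading, not the paper's exact loss: the threshold `tau` and the variance-style penalty are assumptions.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def sosr_penalty(logits, labels, tau=0.9):
    """Illustrative SOSR-style term (tau and the exact penalty form are our
    assumptions): for samples that are correctly AND overconfidently
    classified, penalize the spread of the logits on the incorrect classes,
    pushing them toward equal values (a softer output distribution)."""
    p = softmax(logits)
    pred, conf = p.argmax(axis=1), p.max(axis=1)
    mask = (pred == labels) & (conf > tau)       # selective: only these samples
    if not mask.any():
        return 0.0
    sel = logits[mask]
    wrong = np.ones_like(sel, dtype=bool)
    wrong[np.arange(sel.shape[0]), labels[mask]] = False
    w = sel[wrong].reshape(sel.shape[0], -1)     # incorrect-class logits
    return float(np.mean((w - w.mean(axis=1, keepdims=True)) ** 2))
```

In training this term would be added to the usual cross-entropy loss; low-confidence or misclassified samples contribute nothing, which is what makes the smoothing "selective".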
{"title":"Dual-decoupling inter-correction multitemporal framework for high-, medium-, and low-resolution optical remote sensing image reconstruction","authors":"Weiling Liu, Changqing Huang, Yonghua Jiang, Jingyin Wang, Guo Zhang, Huaibo Song, Xinghua Li","doi":"10.1007/s10489-025-06522-1","DOIUrl":"10.1007/s10489-025-06522-1","url":null,"abstract":"<div><p>Reconstructing missing information due to cloud occlusion is an effective means of enhancing the utilization of low-, medium-, and high-resolution optical remote sensing images. However, singletemporal-based methods have limitations regarding the demand for cloud-free reference data and the applicability of specific datadriven models to real-world scenarios. It is more unable to realize mutitemporal reconstruction. To address this, we propose the Dual-Decoupling Inter-correction Multitemporal Reconstruction network (DDIM-RecNet), a unified framework designed for single- and multitemporal cloud occlusion reconstruction of low-, medium-, and high-resolution images. DDIM-RecNet innovatively decouples remote sensing images into ground object and imaging environment components using dedicated inter-correction modules, coupled with targeted loss functions. Additionally, an imaging environment enhancement module ensures spatial consistency between reconstructed and original regions. Compared with classical models, such as U-Net, RFR-Net, STGAN, PSTCR, BSN, GLDF-RecNet, and IDF-CR, DDIM-RecNet achieved excellent visual reconstruction results and the best quantitative evaluation indicators under Gaofen-1 (2 m), Sentinel-2 (10 m), Landsat-8 (30 m) single/multitemporal images. 
Taking Gaofen-1 (2 m) as an example, compared with the suboptimal model, the clarity of the DDIM-RecNet model in the three bands was improved by 0.44, 0.70, and 0.85 respectively under singletemporal reconstruction; the clarity of DDIM-RecNet was improved by 0.55, 0.43, and 0.35 respectively under mutitemporal cloud occlusion.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 10","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10489-025-06522-1.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143908817","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep RGB-guided generative network for unsupervised hyperspectral image super-resolution","authors":"Xian-Hua Han, Zhe Liu","doi":"10.1007/s10489-025-06595-y","DOIUrl":"10.1007/s10489-025-06595-y","url":null,"abstract":"<div><p>Hyperspectral image (HSI) super-resolution (SR) aims to mathematically generate a high spatial resolution hyperspectral (HR-HS) image by merging the degraded observations: a low spatial resolution hyperspectral (LR-HS) image and a high spatial resolution multispectral or RGB (HR-MS/RGB) image. Currently, deep convolution network-based paradigms have been extensively explored to automatically learn the inherent priors of the latent HR-HS images and have shown remarkable performance progress. However, existing methods usually are realized in a fully supervised manner and have to previously prepare a large external dataset containing the degraded observations: the LR-HS/HR-RGB image and its corresponding HR-HS ground truth, which are difficult to collect, especially in the HSI SR scenario. To this end, this study proposes a novel unsupervised HSI SR method by using only the observed degradation data without any other external sample. Specifically, we use a deep RGB-guided generative network to generate the target HR-HS image with an encoder-decoder-based network. Since the observed HR-RGB image has a more detailed spatial structure, which may have better compatibility with the 2D convolution operation, we take the observed HR-RGB image as the network input to serve as the conditional guidance, while using the degraded observations to construct the loss function to guide the network learning. 
Experimental results on several benchmark HS image datasets demonstrate that the proposed unsupervised method achieves superior performance over various SoTA paradigms.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 10","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-05-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143904687","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
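The unsupervised training signal described above — compare the generated HR-HS cube against the two observations through their degradations, with no ground truth — can be sketched as a loss. The operators here (average-pool spatial downsampling, a spectral response matrix `R`) are common modeling assumptions, not necessarily the paper's exact degradation models.

```python
import numpy as np

def degradation_loss(Z, lr_hs, hr_rgb, R, scale):
    """Sketch of an unsupervised degradation-consistency loss: the candidate
    HR-HS cube Z (H, W, B) is spatially downsampled to match the LR-HS
    observation and spectrally projected (B bands -> 3 channels via R) to
    match the HR-RGB observation; both residuals drive the learning."""
    H, W, B = Z.shape
    # spatial degradation: average-pool downsampling by `scale`
    spat = Z.reshape(H // scale, scale, W // scale, scale, B).mean(axis=(1, 3))
    # spectral degradation: project the B bands down to 3 RGB channels
    spec = Z @ R
    return float(np.mean((spat - lr_hs) ** 2) + np.mean((spec - hr_rgb) ** 2))
```

Since both degradations are differentiable, the same two terms transfer directly to an autograd framework when Z is the output of the RGB-guided generator.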
{"title":"Quadruped robot locomotion via soft actor-critic with muti-head critic and dynamic policy gradient","authors":"Yanan Fan, Zhongcai Pei, Hongbing Shi, Meng Li, Tianyuan Guo, Zhiyong Tang","doi":"10.1007/s10489-025-06584-1","DOIUrl":"10.1007/s10489-025-06584-1","url":null,"abstract":"<div><p>Quadruped robots’ nonlinear complexity makes traditional modeling challenging, while deep reinforcement learning (DRL) learns effectively through direct environmental interaction without explicit kinematic and dynamic models, becoming an efficient approach for quadruped locomotion across diverse terrains. Conventional reinforcement learning methods typically combine multiple reward criteria into a single scalar function, limiting information representation and complicating the balance between multiple control objectives. We propose a novel multi-head critic and dynamic policy gradient SAC (MHD-SAC) algorithm, innovatively combining a multi-head critic architecture that independently evaluates distinct reward components and a dynamic policy gradient method that adaptively adjusts weights based on current performance. Through simulations on both flat and uneven terrains comparing three approaches (Soft Actor-Critic (SAC), multi-head critic SAC (MH-SAC), and MHD-SAC), we demonstrate that the MHD-SAC algorithm achieves significantly faster learning convergence and higher cumulative rewards than conventional methods. Performance analysis across different reward components reveals MHD-SAC’s superior ability to balance multiple objectives. 
The results validate that our approach effectively addresses the challenges of multi-objective optimization in quadruped locomotion control, providing a promising foundation for developing more versatile and robust legged robots capable of traversing complex environments.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 10","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-05-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143908760","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
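The dynamic policy gradient idea above — adaptively reweighting the per-head critics based on current performance — can be illustrated with a small weighting rule. This is a plausible sketch only; the paper's actual adjustment rule, target values, and temperature are not given in the abstract.

```python
import numpy as np

def dynamic_weights(returns, targets, temp=1.0):
    """Illustrative dynamic weighting over reward heads (the exact rule in
    MHD-SAC may differ): objectives that currently lag their target receive
    a larger weight in the policy gradient, via a softmax over normalized
    performance deficits."""
    deficits = 1.0 - np.clip(np.asarray(returns) / np.asarray(targets), 0.0, 1.0)
    e = np.exp(deficits / temp)
    return e / e.sum()                # weights sum to 1 across heads
```

With per-head critic values Q_k and weights w_k, the policy update would then optimize the weighted sum over heads instead of a single scalar critic, which is what lets the multi-head architecture keep the reward components informative.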
{"title":"Real-numbered singular matrix transformation for non-invertible and cancelable biometric templates","authors":"Onkar Singh, Ajay Jaiswal, Naveen Kumar","doi":"10.1007/s10489-025-06534-x","DOIUrl":"10.1007/s10489-025-06534-x","url":null,"abstract":"<div><p>Cancelable biometrics mitigate privacy and security concerns in biometric-based user authentication by transforming biometric data into non-invertible templates. However, achieving non-invertibility often comes at the cost of reduced discriminability. This paper presents RP-SmXOR, a novel approach for generating cancelable biometric templates, leveraging person-specific real-numbered singular matrices for non-invertible transformation. By combining random permutation, Bitwise-XOR, and the Hadamard product, RP-SmXOR retains and enhances the discriminative information in the templates while addressing the privacy and security concerns associated with traditional biometric authentication. The proposed method was extensively evaluated on seven diverse biometric databases, demonstrating superior performance compared to state-of-the-art random permutation-based techniques. A thorough privacy and security analysis, including brute-force, false acceptance, Attack via Record Multiplicity (ARM), and inverse attacks, along with similarity metrics, confirms the non-invertibility, security, and robustness of the generated templates. 
Thus, RP-SmXOR adheres to the key principles of cancelable biometrics while significantly improving recognition accuracy, establishing itself as a promising solution for secure biometric authentication.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 10","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-05-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143904688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
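The ingredients named in the abstract — random permutation, Bitwise-XOR, Hadamard product, and a singular matrix for non-invertibility — can be combined into a loose sketch. Every detail here (the quantization for the XOR step, using one row of the singular matrix, the rank-1 construction) is our guess at how such a pipeline might fit together, not the paper's RP-SmXOR algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)

def cancelable_template(feat, perm, key_bits, sm_row):
    """Loose sketch of a cancelable transform: permute the feature vector,
    flip signs via Bitwise-XOR of a user key with the features' sign bits,
    then take the Hadamard (elementwise) product with a row drawn from a
    rank-deficient (singular) matrix. The singular factor makes the map
    many-to-one, hence non-invertible; revoking the key or permutation
    yields a fresh, unlinkable template."""
    x = feat[perm]                                   # random permutation
    signs = np.where(key_bits ^ (x > 0), 1.0, -1.0)  # Bitwise-XOR step
    return signs * x * sm_row                        # Hadamard product

d = 8
u = rng.standard_normal(d)
sm = np.outer(u, u)                  # rank-1, hence singular, d x d matrix
perm = rng.permutation(d)
key = rng.integers(0, 2, size=d).astype(bool)
t = cancelable_template(rng.standard_normal(d), perm, key, sm[0])
```

Cancelability shows up directly: issuing a new key or permutation produces a completely different template from the same biometric features.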
{"title":"Adaptive unsupervised deep learning denoising for medical imaging with unbiased estimation and Hessian-based regularization","authors":"Cheng Zhang, Kin Sam Yen","doi":"10.1007/s10489-025-06591-2","DOIUrl":"10.1007/s10489-025-06591-2","url":null,"abstract":"<div><p>This paper introduces an adaptive, unsupervised deep learning model for denoising Gaussian noise in Magnetic Resonance Imaging (MRI) images. The model is combined with Deep Image Prior (DIP) and Stein's Unbiased Risk Estimate (SURE) and incorporates a regularization term based on the Frobenius norm of the Hessian matrix. Leveraging the SURE criterion, the observed noisy image is used as the network input, significantly accelerating convergence speed and achieving more than a tenfold improvement over DIP. The real-time, adaptive adjustment of regularization intensity, driven by SURE, ensures robust performance across varying noise levels while effectively balancing the preservation of fine image details with noise elimination. The Hessian-based regularization captures second-order variations, promoting smoothness while preserving critical structural details. Experimental results demonstrate the model's superiority, with an average 8.7% increase in PSNR and a 10.1% increase in SSIM compared to DIP achieved. Furthermore, by the 25<i>th</i> iteration, the SSIM value of the proposed method had already surpassed the peak value reached by DIP at the 700<i>th</i> iteration and by DIP variants at the 2000<i>th</i> iteration. 
These advantages, coupled with the adaptive regularization strength adjustment, demonstrate the model's potential to enhance diagnostic accuracy and efficiency in medical applications.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 10","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-05-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143904686","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
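The Hessian-based regularizer described above has a standard discrete form that can be sketched with finite differences; the stencil choices below are common defaults, not necessarily those used in the paper.

```python
import numpy as np

def hessian_frobenius(u):
    """Finite-difference sketch of the Hessian-based regularizer: the mean
    squared Frobenius norm of the per-pixel 2x2 Hessian. Penalizing
    second-order variation (curvature) rather than first-order gradients
    leaves smooth intensity ramps untouched while suppressing oscillatory
    noise, which is the detail-preservation property claimed above."""
    uxx = u[2:, 1:-1] - 2 * u[1:-1, 1:-1] + u[:-2, 1:-1]      # d2u/dx2
    uyy = u[1:-1, 2:] - 2 * u[1:-1, 1:-1] + u[1:-1, :-2]      # d2u/dy2
    uxy = (u[2:, 2:] - u[2:, :-2] - u[:-2, 2:] + u[:-2, :-2]) / 4.0
    return float(np.mean(uxx**2 + 2 * uxy**2 + uyy**2))
```

A linear ramp scores (near) zero under this term while pure noise scores high, which is exactly the behavior that distinguishes Hessian regularization from total-variation-style first-order penalties.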
{"title":"Small sample pipeline DR defect detection based on smooth variational autoencoder and enhanced detection head faster RCNN","authors":"Ting Zhang, Tianyang You, Zhaoying Liu, Sadaqat Ur Rehman, Yanan Shi, Amr Munshi","doi":"10.1007/s10489-025-06590-3","DOIUrl":"10.1007/s10489-025-06590-3","url":null,"abstract":"<div><p>The safe operation of gas pipelines is crucial for the safety of residents’ lives and property. However, accurately detecting defects within these gas pipelines is a challenging task. To improve the accuracy of defect detection in pipeline DR images with small sample sizes, we propose an enhanced Faster RCNN model based on a Smooth Variational Autoencoder and Enhanced Detection Head (S-EDH-Faster RCNN). This model leverages a smooth variational autoencoder to reconstruct features and enhances classification scores through an improved detection head, thereby boosting overall detection accuracy. In detail, to address the issue of scarce training samples for new categories, we design a smooth variational autoencoder to reconstruct features that better fit the distribution of training data. Furthermore, to refine classification precision, we present an enhanced detection head that incorporates a convolutional block attention-based center point classification calibration module, which strengthens classification-related portions of the RoI features and adjusts classification scores accordingly. Finally, to effectively learn characteristics of novel class samples, we introduce an adaptive fine-tuning method that adaptively updates key convolutional kernels during the fine-tuning stage, enabling the model to generalize better to novel classes. 
Experimental results show that our approach achieves superior detection performance over state-of-the-art models on both the self-built PIP-DET dataset and the publicly available NEU-DET dataset, demonstrating its effectiveness.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 10","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10489-025-06590-3.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143900834","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep metric learning-based side-channel analysis with improved robustness and efficiency","authors":"Kaibin Li, Yihuai Liang, Hua Meng, Zhengchun Zhou","doi":"10.1007/s10489-025-06586-z","DOIUrl":"10.1007/s10489-025-06586-z","url":null,"abstract":"<div><p>Side-channel analysis (SCA) is one of the widely studied approaches for assessing vulnerabilities in cryptographic algorithm implementations. Existing deep learning (DL)-based SCA approaches are commonly dataset-specific, and their attack performance heavily depends on optimal hyperparameters and effective neural network architectures. Searching such hyperparameters and architectures could be very time-consuming. In addition, traditional machine learning (ML)-based SCA methods often require manual feature engineering, leading to information loss and limiting attack performance. To address these challenges, we propose a profiled SCA model based on deep metric learning (DML) with template attacks (TA). This novel approach improves dataset generalization, enhances feature extraction, and reduces the reliance on hyperparameters. Specifically, a normalized lifted structured (NLS) loss is designed for the proposed attack model. Then, a label-informed hybrid distance is subtly integrated into the model to enhance the model’s ability for capturing relationships between embeddings and labels, thereby improving the attack performance and robustness. Next, a similarity learning method is designed by evaluating all pairwise distances within a mini-batch, reducing sensitivity to triplet selection and improving training efficiency. Experimental results show that the proposed model significantly outperforms the state-of-the-art DL-based SCA methods. It achieves attack performance improvements of up to 50.0% and an average improvement of 37.9% on public datasets, while being 30.8% faster in network training. 
Comprehensive evaluations show that the proposed model provides high efficiency, robust performance, and strong generalization across diverse datasets and leakage models.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 10","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143896640","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"VFE: A large-scale video future event description dataset for evaluating video temporal prediction","authors":"Chenghang Lai, Haibo Wang","doi":"10.1007/s10489-025-06547-6","DOIUrl":"10.1007/s10489-025-06547-6","url":null,"abstract":"<div><p>Given a video, humans can predict subsequent events in the video and generate reasonable descriptions based on the acquired information and prior knowledge. This ability requires in-depth analysis of dynamic visual information in videos and the comprehensive use of extensive world knowledge for logical reasoning and prediction. However, current visual systems have not yet reached a satisfactory level regarding similar temporal prediction capability. To evaluate this new application, we construct a dataset called VFE (Video Future Event Description), a large-scale dataset for subsequent video event prediction. The VFE dataset contains over 84K video clips, and each clip is equipped with a video and description of the premise event and a predicted description of the subsequent events. To evaluate video temporal prediction, we propose a task, video future event prediction, to generate possible future event descriptions for subsequent unseen video clips based on the premise video. In this paper, we also propose a baseline model for evaluating the VFE dataset. The experimental results indicate the challenge of this task, and the ability of the visual system in complex video temporal prediction needs to be further explored. 
The dataset and code are available at https://github.com/keyancaigou/VFE.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 10","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143896639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}