{"title":"Deep metric learning-based side-channel analysis with improved robustness and efficiency","authors":"Kaibin Li, Yihuai Liang, Hua Meng, Zhengchun Zhou","doi":"10.1007/s10489-025-06586-z","DOIUrl":"10.1007/s10489-025-06586-z","url":null,"abstract":"<div><p>Side-channel analysis (SCA) is one of the widely studied approaches for assessing vulnerabilities in cryptographic algorithm implementations. Existing deep learning (DL)-based SCA approaches are commonly dataset-specific, and their attack performance heavily depends on optimal hyperparameters and effective neural network architectures. Searching such hyperparameters and architectures could be very time-consuming. In addition, traditional machine learning (ML)-based SCA methods often require manual feature engineering, leading to information loss and limiting attack performance. To address these challenges, we propose a profiled SCA model based on deep metric learning (DML) with template attacks (TA). This novel approach improves dataset generalization, enhances feature extraction, and reduces the reliance on hyperparameters. Specifically, a normalized lifted structured (NLS) loss is designed for the proposed attack model. Then, a label-informed hybrid distance is subtly integrated into the model to enhance the model’s ability for capturing relationships between embeddings and labels, thereby improving the attack performance and robustness. Next, a similarity learning method is designed by evaluating all pairwise distances within a mini-batch, reducing sensitivity to triplet selection and improving training efficiency. Experimental results show that the proposed model significantly outperforms the state-of-the-art DL-based SCA methods. It achieves attack performance improvements of up to 50.0% and an average improvement of 37.9% on public datasets, while being 30.8% faster in network training. 
Comprehensive evaluations show that the proposed model provides high efficiency, robust performance, and strong generalization across diverse datasets and leakage models.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 10","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143896640","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"VFE: A large-scale video future event description dataset for evaluating video temporal prediction","authors":"Chenghang Lai, Haibo Wang","doi":"10.1007/s10489-025-06547-6","DOIUrl":"10.1007/s10489-025-06547-6","url":null,"abstract":"<div><p>Given a video, humans can predict subsequent events in the video and generate reasonable descriptions based on the acquired information and prior knowledge. This ability requires in-depth analysis of dynamic visual information in videos and the comprehensive use of extensive world knowledge for logical reasoning and prediction. However, current visual systems have not yet reached a satisfactory level regarding similar temporal prediction capability. To evaluate this new application, we construct a dataset called VFE (Video Future Event Description), a large-scale dataset for subsequent video event prediction. The VFE dataset contains over 84K video clips, and each clip is equipped with a video and description of the premise event and a predicted description of the subsequent events. To evaluate video temporal prediction, we propose a task, video future event prediction, to generate possible future event descriptions for subsequent unseen video clips based on the premise video. In this paper, we also propose a baseline model for evaluating the VFE dataset. The experimental results indicate the challenge of this task, and the ability of the visual system in complex video temporal prediction needs to be further explored. 
The dataset and code are available at https://github.com/keyancaigou/VFE.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 10","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143896639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning unified denoised representations for sequential recommendation","authors":"Runsen Jiang, Jiajin Huang, Yadong Xiao, Jian Yang","doi":"10.1007/s10489-025-06597-w","DOIUrl":"10.1007/s10489-025-06597-w","url":null,"abstract":"<div><p>Sequential Recommendation (SR) has been widely used in many internet applications, such as e-commerce, social platforms and hot news. User behavior sequential data in SR typically contains complex patterns of short-term dependencies, long-term dependencies and noise, which may lead SR models to misinterpret user intentions and overfit noisy patterns, but existing methods cannot solve them simultaneously. To address this problem, we propose a model to learn <b>Uni</b>fied <b>D</b>enoised <b>R</b>epresentations (<b>UniDR</b>) for the SR task, which consists of three modules. The first module employs Graph Neural Networks (GNNs) with adaptive learning mechanisms to capture short-term dependencies by dynamically weighting item transitions. The second module utilizes self-attention mechanisms to effectively model long-term dependencies across the entire item sequence. The third module focuses on extracting long-term sequential patterns from contextual information through feed-forward networks. Each module independently generates denoised representations, either through weak edge removal in GNNs or through frequency domain transformations. UniDR integrates these denoised representations by jointly optimizing a BPR loss, an alignment loss and a uniformity loss. Extensive experiments on five public benchmark datasets demonstrate UniDR’s superiority in recommendation performance and robustness to interaction noise. 
Compared to the strongest state-of-the-art baseline, UniDR achieves significant improvements, with average increases of 10.63% in Hit Rate (HR), 21.47% in Normalized Discounted Cumulative Gain (NDCG) and 23.71% in Mean Reciprocal Rank (MRR).</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 10","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143896624","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Two-stage effective attentional generative adversarial network","authors":"Mingyu Jin, Qinkai Yu, Chong Zhang, Haochen Xue, Shuliang Zhao","doi":"10.1007/s10489-025-06576-1","DOIUrl":"10.1007/s10489-025-06576-1","url":null,"abstract":"<div><p>Although GAN models have succeeded in relevant tests, text-to-image modelling using GANs to synthesize high-quality images is still challenging. Existing multi-stage models face several problems. First, the scale is too large, and the model has a large number of redundant structures. Second, the model often generates duplicate images without progress and cannot update the parameters efficiently. In this paper, we propose a two-stage model to solve the above problems. 1) We remove the redundant structures and use an improved network structure that reduces the model size. 2) Our method employs a model trained in two stages instead of simultaneously, which shortens the training time and ensures that the model does not have vanishing gradients or mode collapse. In addition, we add an attention mechanism to the model to help optimize details. Experimental results show that our model achieves excellent results in terms of generation quality and reduced model size on the CUB (IS 4.83, FID 15.13) and COCO (FID 33.74) datasets.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 10","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143896776","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing keyphrase extraction from long scientific documents using graph embeddings","authors":"Roberto Martínez-Cruz, Debanjan Mahata, Alvaro J. López-López, José Portela","doi":"10.1007/s10489-025-06579-y","DOIUrl":"10.1007/s10489-025-06579-y","url":null,"abstract":"<div><p>This study explores the integration of graph neural network (GNN) representations with pre-trained language models (PLMs) to enhance keyphrase extraction (KPE) from lengthy documents. We demonstrate that incorporating graph embeddings into PLMs yields richer semantic representations, especially for long texts. Our approach constructs a co-occurrence graph of the document, which we then embed using a graph convolutional network (GCN) trained for edge prediction. This process captures non-sequential relationships and long-distance dependencies, both of which are often crucial in lengthy documents. We introduce a novel <i>graph-enhanced</i> sequence tagging architecture that combines PLM-based contextual embeddings with GNN-derived representations. Through evaluations on benchmark datasets, our method outperforms state-of-the-art models, showing notable improvements in F1 scores. Beyond performance on standard benchmarks, this approach also holds promise in domains such as legal, medical, and scientific document processing, where efficient handling of long texts is vital. 
Our findings underscore the potential for GNNs to complement PLMs, helping address both technical and real-world challenges in KPE for long documents.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 10","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143892634","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Group equivariant learning for few-shot image classification","authors":"Meijuan Su, LeiLei Yan, Fanzhang Li","doi":"10.1007/s10489-025-06546-7","DOIUrl":"10.1007/s10489-025-06546-7","url":null,"abstract":"<div><p>Few-shot learning, as an effective approach to solve image classification problems in data-scarce scenarios, has made significant progress in recent years, with numerous methods emerging. These methods typically use convolutional neural networks (CNNs) as feature extractors and classify other data based on the features of a small number of labeled samples. The reason CNNs have become the preferred method for image processing tasks is primarily due to their translational equivariance. However, conventional CNNs lack inherent mechanisms to handle other symmetry transformations (such as rotation and reflection), resulting in reduced classification performance of the model, especially in few-shot scenarios. To address this problem, we leverage the advantages of group convolutions in handling broader symmetric transformations, integrating them into few-shot learning tasks, and accordingly propose a group-equivariant prototypical learning network. This method maps samples into the group space via a group convolution module, enhancing the model’s ability to handle various symmetry transformations present in classification targets within images, thereby improving its feature representation capability. Additionally, we designed a new contrastive loss that can naturally be co-optimized with cross-entropy loss, guiding the model to learn a highly discriminative group feature space. 
The experimental results on the miniImageNet, CIFAR-FS, and CUB-200 datasets show that the GEPL method significantly improves classification performance, thus verifying the effectiveness of our method.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 10","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143892581","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Leveraging subdomain alignment for enhanced anomaly detection in time series","authors":"Bo Chen, Min Fang, HaiXiang Li, GuiZhi Wang","doi":"10.1007/s10489-025-06589-w","DOIUrl":"10.1007/s10489-025-06589-w","url":null,"abstract":"<div><p>Time series anomaly detection focuses on identifying anomalies in continuously collected data at each time step. Developing an effective detection model requires not only accurate anomaly identification but also adaptability to the dynamic changes inherent in time series data. Current research primarily employs deep learning networks with task-specific reconstruction or prediction objectives to learn the underlying patterns of normal data. However, the distribution of complex time series data often shifts subtly over time, leading to evolving normal patterns. These distributional shifts make it difficult for models to establish clear decision boundaries, as they often fail to recognize such changes. To address these challenges, this paper proposes Subdomain Alignment for Enhanced Anomaly Detection in Time Series (SA-EADTS). This method aligns the latent distributions of unknown subdomains using a sensitive distance, enabling anomaly detection on unseen data distributions. Extensive experiments on four real-world datasets demonstrate that SA-EADTS significantly outperforms state-of-the-art baseline methods.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 10","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143888671","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Key class identification: a comprehensive dataset and a new GNN model","authors":"Shizhou Wang, Yuhang Chen, Liangyu Chen","doi":"10.1007/s10489-025-06574-3","DOIUrl":"10.1007/s10489-025-06574-3","url":null,"abstract":"<div><p>Program comprehension is a critical task in software maintenance. As the scale of codebases expands, the required human effort increases exponentially. Key Class Identification (KCI) offers an effective solution to this challenge. Despite this, the absence of standardized benchmarks and the lack of robustness in most existing metric-based approaches across different software systems are major obstacles. In this paper, we first construct a comprehensive dataset to objectively evaluate KCI performance. Inspired by ensemble learning, we introduce a voting method to address key class labeling, representing the primary challenge in dataset construction. Additionally, we propose a novel GNN model that leverages graph transformer to capture information from directed class dependency networks for key class identification. Extensive experiments conducted on 170 software systems in our benchmark demonstrate that our approach achieves high accuracy of up to 93.1%, outperforming existing metric-based methods.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 10","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143888612","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning nested attentional feature fusion network for high performance visual tracking","authors":"Peng Gao, Xin-Yue Zhang, Tao Yu","doi":"10.1007/s10489-025-06588-x","DOIUrl":"10.1007/s10489-025-06588-x","url":null,"abstract":"<div><p>Siamese network-based visual tracking has made significant progress in recent years, with correlation calculations playing a central role in these models. However, the inherently linear and localized nature of correlation often leads to substantial semantic information loss and convergence to local optima, thereby limiting the potential for further performance improvements. To address these challenges, we propose a feature fusion network inspired by the Transformer architecture, incorporating nested attention mechanisms to enhance tracking accuracy and robustness. Unlike standard Transformer-based models, our approach refines correlation accuracy by emphasizing correct matches while attenuating incorrect ones through nested attentional representation learning. This enables more effective feature aggregation and information propagation. Our feature fusion network consists of four interdependent modules: ego-context augmentation, short-term feature augmentation, long-term feature augmentation, and cross-feature augmentation. These modules collaboratively fuse features from target templates and search regions, producing semantically rich feature maps superior to those generated by traditional correlation methods. 
Built on this framework, our proposed model, AiATransT, achieves state-of-the-art performance on five benchmark datasets, validated by extensive experimental evaluations.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 10","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143883575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A transfer-based decision-making method based on expert risk attitude and reliability","authors":"Xuefei Jia, Chao Fu, Wenjun Chang","doi":"10.1007/s10489-025-06548-5","DOIUrl":"10.1007/s10489-025-06548-5","url":null,"abstract":"<div><p>Owing to emerging information technologies, historical data that characterize people’s preferences have gradually accumulated in the process of people making decisions. These accumulated data are beneficial for generating decision recommendations. A small volume of historical data, unfortunately, may not accurately characterize people’s preferences, making it difficult to generate convincing decision recommendations. To address decision-making problems in this context, a transfer-based decision-making method is proposed based on the idea of parameter transfer, given that experts’ risk attitudes and reliabilities are adopted to characterize their preferences. Characterized by the orness degree in the ordered weighted averaging operator, an expert’s risk attitude is identified by minimizing the average distance between overall assessments and their predictions on the historical dataset. An expert’s decision accuracy and internal consistency are defined on the historical dataset and combined to identify the expert’s reliability. With the source domain selected by experts’ reliabilities, a transfer model is constructed, in which experts’ risk attitudes are transferred between source and target domains. 
The effectiveness of the proposed method is validated by its application in the auxiliary diagnosis of breast lesions, its comparison with different methods, and its ablation experiment.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 10","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143888772","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}