{"title":"Block-diagonal graph embedding for unsupervised feature selection","authors":"Kun Jiang, Zhihai Yang, Qindong Sun","doi":"10.1007/s10489-025-06558-3","DOIUrl":"10.1007/s10489-025-06558-3","url":null,"abstract":"<div><p>The aim of unsupervised feature selection (UFS) is to remove irrelevant, redundant and noisy features, which could reduce the time consumption and improve the clustering performance of learning machines. Due to the absence of label information, the major research direction of UFS models lies in how to characterize the manifold structure of high-dimensional data and generate the pseudo labels for data samples properly. With the generated label information, a faithful and compact feature subset could be produced that sufficiently preserves the intrinsic structure. In this paper, we propose a novel subspace clustering guided unsupervised feature selection (BDGFS) model. Specifically, the underlying manifold structure is captured by a subspace clustering method that could adaptively preserve the cluster labels, while the salient features are selected to dominate the projected subspace. The BDGFS model can naturally preserve the multi-subspace distribution via subspace clustering and simultaneously learn the feature weight matrix, which is sufficient to characterize the underlying subspace structure with exact components preserved. We develop an alternating optimization strategy to solve the challenging objective function, and then discuss the convergence of the proposed algorithm. Experimental results on benchmark databases demonstrate that the BDGFS model could outperform the state-of-the-art UFS models. 
The code of the BDGFS model is released at https://github.com/ty-kj/BDGFS.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 7","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143852605","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Understanding and leveraging vocoder fingerprints for synthetic speech attribution","authors":"Jianpeng Ke, Lina Wang","doi":"10.1007/s10489-025-06272-0","DOIUrl":"10.1007/s10489-025-06272-0","url":null,"abstract":"<div><p>With the rapid advancements in generative adversarial networks (GANs), neural vocoders have emerged as critical components for synthesizing intelligible speech. The rise of fake audio poses significant challenges and risks to national security due to malicious abuse. Although countermeasures have been proposed to detect deepfakes, attributing audio to specific vocoder architectures remains a challenging task. Existing approaches that directly input handcrafted features into sophisticated deep neural networks (DNNs) tend to neglect the misguidance of content-relevant features, which leads to poor generalization and efficacy. In this paper, we propose a novel framework that focuses on disentangling the vocoder fingerprint from audio to identify fake audio. To this end, we introduce an audio reconstructor based on the U-Net architecture that minimizes the preservation of the content-relevant features of the original audio. The residual between the raw and reconstructed latent vectors is then calculated to eliminate content-relevant features. The residual is finally fed into a classifier to determine the vocoder’s architecture. The extensive experiments demonstrate the effectiveness of our proposed method in attributing fake audio in various cross-test setups on large-scale datasets. 
Additionally, we apply our approach to binary fake audio detection and observe its remarkable generalizability even with unseen vocoders.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 7","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143852601","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A keyframe weighted dual-channel attention GCN model for human skeleton motion prediction","authors":"Wenwen Zhang, Jianfeng Tu, Siyu Li, Lingfeng Liu","doi":"10.1007/s10489-025-06532-z","DOIUrl":"10.1007/s10489-025-06532-z","url":null,"abstract":"<div><p>Accurate prediction of human skeletal motion sequences is critical for human activity analysis and low-latency motion reconstruction applications. While many studies focus on frame-by-frame prediction model designs, the keyframes in a motion sequence may contain more spatial-temporal information than the other frames do. To address the importance of keyframes, this work introduces a heterogeneous keyframe selection and fusion method to discriminate the importance of different motion frames from historical observations for prediction. Specifically, we propose an adaptive keyframe selection algorithm to iteratively select the keyframes and a nonlinear heterogeneous interpolation method to reconstruct the transitional frames. By merging them with the original motion sequence, the semantics of the original motion are preserved, and the importance of the keyframes is highlighted. A graph convolutional network (GCN) is designed for prediction with dual-channel attention to incorporate motion patterns in longer-term historical records to improve motion feature exploration. 
A comprehensive evaluation of the model is performed on the Human3.6M and AMASS datasets, which shows significant improvement in long-term motion prediction (≥ 320 ms) over the state-of-the-art methods in terms of the 3D mean per joint position error (MPJPE).</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 7","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143852604","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generalization of neural network for manipulator inverse dynamics model learning","authors":"Huang Wenhui, Lin Yunhan, Chen Jie, Liu Mingxin, Min Huasong","doi":"10.1007/s10489-025-06564-5","DOIUrl":"10.1007/s10489-025-06564-5","url":null,"abstract":"<div><p>The inverse dynamics model of manipulators learned from recurrent neural networks demonstrates higher precision than those obtained through analytical modeling methods. Variations in end-effector loads and previously unseen trajectory points can lead to inaccurate torque estimations in dynamic models of manipulators. This paper integrates innovative feature expansion, feature enhancement, and regularization into an end-to-end inverse dynamics model learning framework. The proposed model employs a bidirectional long short-term memory (BiLSTM) network, augmented by a spatial attention mechanism with Convolutional Neural Networks (CNN) and a Max-Pooling method, which enhances the extraction of latent spatial features, and a multi-scale parallel temporal attention mechanism, which captures the dynamic changes of objects in the temporal dimension. A novel motion residual vector is designed to expand features, and a motion residual module is proposed to assist the network in perceiving changes in end-effector loads. To prevent overfitting, a novel spatial attention standard deviation regularization is implemented. The proposed method is compared with five methods, and experimental results across different trajectories and end-effector loads validate its generalization capability. It surpasses state-of-the-art methods, achieving the highest overall accuracy. 
In cross-validation experiments, the validation loss remains stable as the training loss decreases, demonstrating the proposed approach’s strong generalization performance in dynamics model learning.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 7","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143840348","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Temporal subgraph contrastive learning for anomaly detection on dynamic attributed graphs","authors":"Yang Yu, Xin Li, Minglai Shao, Ying Sun, Wenjun Wang","doi":"10.1007/s10489-025-06402-8","DOIUrl":"10.1007/s10489-025-06402-8","url":null,"abstract":"<div><p>A dynamic attributed graph is one in which both features and structures evolve. Some researchers have focused on the study of anomaly detection methods under such complex evolution patterns. However, these methods cannot address the discrepancy problem of coupled evolution of multitemporal features, i.e., how to portray and capture the anomaly patterns under coupled evolution is a key problem that needs to be solved. Therefore, in this paper, we propose the Temporal Subgraph Contrastive Learning (TSCL) method for anomaly detection on dynamic attributed graphs, which learns node representations by sampling and comparing temporal subgraphs and uses the statistical results of multiround comparison scores to predict node anomalies. In particular, the Temporal Features Evolving module and the Temporal Subgraph Sampling module capture the coupled evolutionary patterns of features and structures, and the combination of the Temporal Contrastive Learning module and the Statistical Anomaly Estimator module implements an end-to-end working approach between representation learning and anomaly detection. 
Finally, extensive comparative experiments and analyses on real datasets demonstrate the effectiveness of our proposed TSCL approach for anomaly detection on dynamic attributed graphs.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 7","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143840366","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A fast approach based on divide-and-conquer for instance selection in classification problem","authors":"Hamid Saadatfar, Sayed Iqbal Nawin, Edris Hosseini Gol","doi":"10.1007/s10489-025-06541-y","DOIUrl":"10.1007/s10489-025-06541-y","url":null,"abstract":"<div><p>Instance selection is a data preprocessing method in data mining that aims to reduce the volume of the training dataset. Reducing samples from a large dataset offers benefits such as lower storage requirements, reduced computational costs, increased processing speed, and, in some cases, improved accuracy for learning algorithms. However, reducing samples from large datasets is also a challenging task due to their sheer volume. Recently, numerous instance selection methods for big data have been proposed, often facing challenges such as low accuracy and slow processing speed. In this research, we propose a fast and efficient three-step method based on the divide-and-conquer approach. In the first step, the training set is divided based on the number of classes. Next, representative summaries of each class are extracted. Finally, samples from each class are reduced independently while considering the representatives of other classes. By using a proposed ranking-based method, it is possible to accurately identify less important and noisy samples. For a comprehensive evaluation, we utilized 20 well-known large datasets and three synthetic datasets featuring challenging structures. 
The results demonstrate the superiority of the proposed method over four recent related methods.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 7","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143840347","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep learning solutions for audio event detection in a swine barn using environmental audio and weak labels","authors":"André Moreira Souza, Livia Lissa Kobayashi, Lucas Andrietta Tassoni, Cesar Augusto Pospissil Garbossa, Ricardo Vieira Ventura, Elaine Parros Machado de Sousa","doi":"10.1007/s10489-025-06555-6","DOIUrl":"10.1007/s10489-025-06555-6","url":null,"abstract":"<div><p>The increasing demand for animal protein products has led to the emergence of Precision Livestock Farming (PLF) and the adoption of sensing technologies, big data solutions, and Machine Learning (ML) methods in modern livestock farming. At the same time, the audio signal processing field has undergone notable advancements in recent years, transitioning from traditional techniques to more sophisticated ML approaches, with open challenges in detecting and classifying complex, low-quality, and overlapping sounds in real-world scenarios. In this paper, we evaluate deep learning methods, conceived from computer vision to attention-based approaches, for Audio Event Detection (AED) on a novel audio dataset from a swine farming environment with challenging characteristics, such as weak annotations and high amounts of noise. The primary purpose of our study is to prospect effective AED solutions for the development of tools for auditing livestock farms, which could be used to improve animal welfare. Our results show that, despite inherent limitations in the dataset’s size, class imbalance, and sound quality, Convolutional Neural Network (CNN) and attention-based architectures are respectively effective and promising for detecting complex audio events. 
Further research may explore avenues for optimizing model performance in similar, real-life datasets while simultaneously amplifying annotated events and reducing annotation costs, thereby enhancing the broader applicability of AED methods in diverse audio processing scenarios.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 7","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143840365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DFF-HGNN: Dual-Feature Fusion Heterogeneous Graph Neural Network","authors":"Shengen Xue, Hua Duan, Yufei Zhao, Wei Fan","doi":"10.1007/s10489-025-06480-8","DOIUrl":"10.1007/s10489-025-06480-8","url":null,"abstract":"<div><p>Heterogeneous graph neural networks (HGNNs) have gained significant attention in deep learning due to their superior capability in processing heterogeneous graph data. However, existing HGNNs often fail to explicitly leverage relational information among nodes when utilizing the attribute information of nodes for graph representation learning, thus constraining their performance. To address this limitation, we introduce two approaches for utilizing relational information explicitly: a Relation-based Feature Enhancement Strategy (RFE-Strategy) for non-attributed heterogeneous graphs, and a Dual-Feature Fusion Heterogeneous Graph Neural Network (DFF-HGNN) for attributed heterogeneous graphs. The RFE-Strategy enhances HGNNs performance on non-attributed heterogeneous graphs through a three-step process: relational feature extraction, identity feature encoding, and feature enhancement. Meanwhile, DFF-HGNN integrates both attribute and relational features to effectively capture the heterogeneity and complexity of the graph, employing four components: separate pre-transformation, intra-type feature encoder, inter-type feature encoder, and embedding update encoder. 
Extensive experiments on multiple benchmark datasets demonstrate that the RFE-Strategy significantly improves the performance of HGNNs, while DFF-HGNN outperforms the state-of-the-art models.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 7","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143835736","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rapid deployment of digital twin for life prediction of rolling bearings","authors":"Jun Wang, Lei Xiao, Ximing Liu","doi":"10.1007/s10489-025-06536-9","DOIUrl":"10.1007/s10489-025-06536-9","url":null,"abstract":"<div><p>Finite element modeling (FEM) is widely recognized as a relatively accurate approach for constructing digital twin (DT) models for predicting remaining useful life (RUL). However, FEM suffers from long computation times, high operational complexity, and an inability to meet the real-time requirements of DT. This study proposes a k-nearest neighbor Kriging Radial basis function Digital Twin (KKR-DT) system. Initially, the full working condition results of the roller bearing were calculated using Ansys software. Subsequently, a reduced-order (OR) model was developed following the agent model approach. KNN was used to find neighboring values near the OR points, and Kriging was employed to interpolate at the OR points, obtaining an OR model with a single working condition. Finally, using RBFs, all single-working condition OR models were transformed into full-working condition OR models, thereby establishing a five-dimensional DT model and DT user interface. The stress-life (S–N) degradation curve of the material was used to predict the roller bearing RUL. The proposed stress field diagram addressed the challenge of reverse validation in interpolation models. These components were ultimately integrated as the KKR-DT system. The full working condition average accuracy of KKR-DT was 96.6938%, with maximum and minimum average accuracies of 99.9993% and 99.9978%, respectively. Real-time dynamic operation calculation time for a single instance was achieved within 0.35 s. 
Remote DT testing was conducted using actual spinning frame equipment to demonstrate the accuracy and real-time DT capabilities of the system, providing a solution for the practical application of digital twins in dynamic operation and prediction.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 7","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143840497","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A two-stage knowledge graph completion based on LLMs’ data augmentation and atrous spatial pyramid pooling","authors":"Na Zhou, Yuan Yuan, Lei Chen","doi":"10.1007/s10489-025-06556-5","DOIUrl":"10.1007/s10489-025-06556-5","url":null,"abstract":"<div><p>With the development of information technology, a large amount of unstructured and fragmented data is generated. Knowledge graphs can effectively integrate these fragmented data. Due to the difficulty of domain knowledge mining, knowledge graphs suffer from data sparsity and missing data. In addition, standard convolutional neural networks have limited capability in capturing feature interactions. To address data sparsity and the limitations of standard convolutional models, we propose DA-ARKGC, a two-stage knowledge graph completion model using wheat as a case study. In the first stage, to address the data sparsity problem, the rule mining data augmentation module (DA) based on large language models expands the wheat knowledge graph. In the second stage, the knowledge completion module (ARKGC) of the atrous spatial pyramid pooling with residual is introduced to achieve knowledge completion. The DA-ARKGC model was verified on the constructed wheat knowledge graph (Wheat_KG). Compared with ConvE, its MRR, Hits@1, Hits@3 and Hits@10 increased by 10%, 10.2%, 10.1% and 9.3%, respectively. In order to verify the effectiveness and generalization of the ARKGC module, experiments were conducted on the open-source datasets WN18 and FB15k. 
The results demonstrated that the model achieved optimal or sub-optimal performance compared with other baseline models.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 7","pages":""},"PeriodicalIF":3.4,"publicationDate":"2025-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143835741","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}