Instance-aware context with mutually guided vision-language attention for referring image segmentation
Qiule Sun, Jianxin Zhang, Bingbing Zhang, Peihua Li
Applied Intelligence 55(13). DOI: 10.1007/s10489-025-06851-1. Published 2025-08-30.

Abstract: Referring image segmentation, which integrates both visual and linguistic modalities, represents a forefront challenge in cross-modal visual research. Traditional approaches generally fuse linguistic features with visual data to generate multi-modal representations for mask decoding. However, these methods often mistakenly segment visually prominent entities rather than the specific region indicated by the referring expression, as the visual context tends to overshadow the multi-modal features. To address this, we introduce IMNet, a novel referring image segmentation framework that harnesses the Contrastive Language-Image Pre-training (CLIP) model and incorporates a mutually guided vision-language attention mechanism to identify the referring mask more accurately. Specifically, this mechanism consists of language-guided attention and vision-guided attention, which model bi-directional relationships between visual and linguistic features. Additionally, to accurately segment instances based on referring expressions, we develop an instance-aware context module within the decoder that focuses on learning instance-specific features. This module connects instance prototypes with corresponding features, using linearly weighted prototypes for the final prediction. We evaluate the proposed method on three publicly available datasets, i.e., RefCOCO, RefCOCO+, and G-Ref. Comparisons with previous methods demonstrate that our approach achieves competitive performance.
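The abstract gives no implementation details, but the bi-directional attention it describes can be sketched as two standard multi-head attention layers, one in each direction. The module name, dimensions, and residual connections below are illustrative assumptions, not IMNet's actual design.

```python
import torch
import torch.nn as nn

class MutualGuidedAttention(nn.Module):
    """Sketch of bi-directional vision-language cross-attention (hypothetical layout)."""
    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        self.lang_to_vis = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.vis_to_lang = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, vis: torch.Tensor, lang: torch.Tensor):
        # Language-guided attention: visual tokens query the expression tokens.
        vis_ctx, _ = self.lang_to_vis(query=vis, key=lang, value=lang)
        # Vision-guided attention: expression tokens query the visual tokens.
        lang_ctx, _ = self.vis_to_lang(query=lang, key=vis, value=vis)
        return vis + vis_ctx, lang + lang_ctx

vis = torch.randn(2, 1024, 256)   # e.g. a flattened 32x32 visual feature map
lang = torch.randn(2, 20, 256)    # e.g. 20 word embeddings
v, l = MutualGuidedAttention()(vis, lang)
print(v.shape, l.shape)           # torch.Size([2, 1024, 256]) torch.Size([2, 20, 256])
```
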
Educational knowledge graph based intelligent question answering for automatic control disciplines
Zhiwei Cai, Nuoying Xu, Linqin Cai, Bo Ren, Yu Xiong
Applied Intelligence 55(13). DOI: 10.1007/s10489-025-06847-x. Published 2025-08-30.

Abstract: With the further development of education informatization, Educational Knowledge Graph (EKG) based intelligent Question Answering (KGQA) has attracted significant attention in smart education. However, current educational KGQA faces substantial challenges, such as incomplete questions from students, knowledge dispersed across the EKG, and scarce, imbalanced datasets. In this paper, a novel educational KGQA model is proposed for answering students' questions in automatic control disciplines. First, a topic entity detection algorithm is constructed based on BERT-BiLSTM-CRF and a domain dictionary, and an intention recognition algorithm is built based on BERT and TextCNN; the topic entity is located accurately by formulating entity priority, entity completion rules, and similarity calculation. Then, a custom weighted cross-entropy loss function (CCL) is designed to alleviate the influence of imbalanced training samples on the model's classifier. In addition, the first Chinese dataset for educational KGQA in automatic control disciplines (ACKGQA) is constructed. Finally, extensive experiments evaluate the effectiveness and generalization of the proposed KGQA model on the ACKGQA dataset and five public benchmark datasets. The proposed model obtains a recognition precision of 87.5% and a recall of 86.25% on ACKGQA and exhibits better overall performance on the other five benchmark datasets. Experimental results demonstrate that our educational KGQA model achieves outstanding performance in the face of the imbalanced data inherent in educational knowledge graphs.
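The exact form of the paper's custom weighted cross-entropy loss (CCL) is not given in the abstract; a generic inverse-frequency weighted cross-entropy, which targets the same class-imbalance problem, might look like the following PyTorch sketch. The function name and weighting rule are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

def weighted_ce_loss(logits: torch.Tensor, targets: torch.Tensor,
                     class_counts: torch.Tensor) -> torch.Tensor:
    """Inverse-frequency weighted cross-entropy: rarer classes get larger weights.
    Illustrative only; the paper's CCL weighting scheme is not reproduced here."""
    weights = class_counts.sum() / (len(class_counts) * class_counts.float())
    return F.cross_entropy(logits, targets, weight=weights)

# Toy example: 3 intent classes with sample counts 500, 50, and 10.
logits = torch.randn(8, 3)
targets = torch.randint(0, 3, (8,))
loss = weighted_ce_loss(logits, targets, torch.tensor([500, 50, 10]))
print(loss.item())
```
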
Balanced Loss Function for Long-tailed Semi-supervised Ship Detection
Li-Ying Hao, Jia-Rui Yang, Yunze Zhang
Applied Intelligence 55(13). DOI: 10.1007/s10489-025-06838-y. Published 2025-08-30.

Abstract: Semi-supervised learning (SSL) has significantly reduced the reliance of ship detection networks on labeled images. However, the more realistic and challenging issue of long-tailed distributions in SSL remains largely unexplored. While most existing methods address this issue at the instance level through reweighting or resampling techniques, their performance is significantly limited by their dependence on biased backbone representations. To overcome this limitation, we propose a Balanced Loss function (Bal Loss). Our approach consists of three key components. First, we introduce the BaCon Loss, which computes class-wise feature centers as positive anchors and selects negative anchors through a simple yet effective mechanism. Second, we assume that the normalized features in contrastive learning follow a mixture of von Mises-Fisher (vMF) distributions on the unit hypersphere. This assumption allows us to estimate the distribution parameters using only the first sample moment, which can be computed efficiently online across different batches. Finally, we incorporate a Jitter-Bagging module, adapted from prior literature, to provide precise localization information, thereby refining bounding box predictions. Extensive experiments demonstrate the efficacy of Bal Loss, which achieves state-of-the-art results on ship datasets with an improvement of 3.9 over the baseline. Notably, our method attains an AP^r of 44.1 on the ShipRSImageNet dataset, underscoring its robust detection capabilities.
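As a rough illustration of "class-wise feature centers as positive anchors", the sketch below normalizes features to the unit hypersphere, builds per-class centers, and scores each feature against all centers. It is not the paper's BaCon loss or its vMF parameter estimation; the temperature value and loss shape are assumptions.

```python
import torch
import torch.nn.functional as F

def prototype_contrastive_loss(feats, labels, num_classes, tau=0.1):
    """Prototype-based contrastive term: a sample's own class center is the
    positive anchor, all other centers act as negatives. Illustrative only."""
    feats = F.normalize(feats, dim=1)                  # unit-norm features
    centers = torch.zeros(num_classes, feats.size(1))
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            centers[c] = F.normalize(feats[mask].mean(dim=0), dim=0)
    logits = feats @ centers.t() / tau                 # cosine similarity to every center
    return F.cross_entropy(logits, labels)

feats = torch.randn(16, 128)
labels = torch.randint(0, 4, (16,))
print(prototype_contrastive_loss(feats, labels, num_classes=4).item())
```
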
MDT: A multiscale differencing transformer with sequence feature relationship mining for robust action recognition
Zengzhao Chen, Fumei Ma, Hai Liu, Wenkai Huang, Tingting Liu
Applied Intelligence 55(13). DOI: 10.1007/s10489-025-06861-z. Published 2025-08-30.

Abstract: Skeleton-based action recognition, which analyzes joint coordinates and bone connections to classify human actions, is important for understanding and analyzing human dynamic behaviors. However, actions in complex scenes exhibit a high degree of similarity and variability; dynamic changes in human skeletons and subtle temporal variations in particular pose significant challenges to the accuracy and robustness of action recognition systems. To mitigate these challenges, we propose a novel multiscale differencing transformer (MDT) with sequence feature relationship mining for robust action recognition. MDT effectively mines inter-frame timing information and feature distribution differences across multiple scales, enabling a deeper understanding of the nuances between actions. Specifically, we first propose multiscale differential self-attention to capture action changes across multiple time scales, improving the model's ability to represent both global and local dynamic features of actions. Then, we introduce a sequence feature relationship mining module to address complex data patterns that may span multiple sequences and exhibit both similar and distinct characteristics. By utilizing coarse- and fine-grained sequence information, this module empowers the model to recognize intricate data patterns. On the NTU RGB+D 60 dataset, the proposed MDT model outperforms the recent STAR-Transformer by 1.6% on the Cross-Subject (CS) setting and 1.1% on the Cross-View (CV) setting, demonstrating its consistent effectiveness across different evaluation protocols.
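A minimal sketch of multiscale temporal differencing on a skeleton sequence follows. The tensor layout (batch, frames, joints, channels) and the stride set are assumptions; the differential self-attention the paper builds on top of such differences is omitted.

```python
import torch

def multiscale_temporal_diff(x: torch.Tensor, scales=(1, 2, 4)):
    """x: (batch, frames, joints, channels) skeleton sequence.
    Returns one difference tensor per scale, x[:, t] - x[:, t - s],
    zero-padded at the start so every scale keeps the original length."""
    diffs = []
    for s in scales:
        d = x[:, s:] - x[:, :-s]
        pad = torch.zeros_like(x[:, :s])
        diffs.append(torch.cat([pad, d], dim=1))
    return diffs

seq = torch.randn(2, 64, 25, 3)   # e.g. 64 frames, 25 joints, xyz coordinates
for s, d in zip((1, 2, 4), multiscale_temporal_diff(seq)):
    print(s, d.shape)             # each is (2, 64, 25, 3)
```
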
Multiview unsupervised domain adaptation through consensus augmented masking for subspace alignment
Chenyang Zhu, Weibin Luo, Yunxin Xie, Lipei Fu
Applied Intelligence 55(13). DOI: 10.1007/s10489-025-06834-2. Published 2025-08-30.

Abstract: Unsupervised Domain Adaptation (UDA) focuses on bridging the gap between source and target domain distributions. Existing UDA approaches often struggle to capture the diverse contextual dependencies required to resolve ambiguities in visual feature representations. To overcome these challenges, we propose Consensus Augmented Masking for Subspace Alignment (CAMSA), a framework that leverages multiview representations to enhance contextual diversity and establish a consensus subspace for improved domain alignment. First, multiple models are trained independently with distinct masking augmentations to ensure prediction consistency and extract specialized multiview features, each capturing a unique contextual perspective. These multiview features are unified into a low-rank structure via sparse subspace representation, enabling cross-view consensus and robust domain alignment. The unified representation is further optimized by constructing a consensus affinity matrix, which facilitates learning a projection matrix that embeds the multiview features into a latent subspace. Within this latent space, source domain prototypes and k-means clustering on the target domain are used to estimate conditional probabilities for downstream tasks. Extensive empirical evaluations on standard benchmark datasets highlight the strong performance of CAMSA, which consistently surpasses state-of-the-art UDA methods across a variety of architectures and configurations, underscoring the importance of leveraging diverse contextual views for robust domain alignment.
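The final step the abstract describes, source prototypes plus k-means clustering on the target domain, can be illustrated as below. The projected features are assumed to be given, and the cluster-to-class matching rule is a simplification rather than CAMSA's consensus-affinity procedure.

```python
import numpy as np
from sklearn.cluster import KMeans

def target_pseudo_labels(src_z, src_y, tgt_z, num_classes):
    """Illustrative: class prototypes from projected source features, k-means on
    projected target features, clusters mapped to the nearest prototype."""
    prototypes = np.stack([src_z[src_y == c].mean(axis=0) for c in range(num_classes)])
    km = KMeans(n_clusters=num_classes, n_init=10, random_state=0).fit(tgt_z)
    # Assign each target cluster center to its closest source prototype.
    dists = np.linalg.norm(km.cluster_centers_[:, None] - prototypes[None], axis=2)
    cluster_to_class = dists.argmin(axis=1)
    return cluster_to_class[km.labels_]

src_z = np.random.randn(200, 64)                 # projected source features
src_y = np.random.randint(0, 5, 200)             # source labels
tgt_z = np.random.randn(150, 64)                 # projected target features
print(target_pseudo_labels(src_z, src_y, tgt_z, num_classes=5)[:10])
```
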
Industrial-application-oriented 2D image and 3D object anomaly detection technology: a comprehensive review
Gang Li, Chengrun Jiang, Min Li, Jiachen Li, Delong Han, Mingle Zhou
Applied Intelligence 55(13). DOI: 10.1007/s10489-025-06689-7. Published 2025-08-28.

Abstract: With the rapid development of deep learning, industrial anomaly detection has significantly improved its ability to handle large-scale images and point clouds and has gradually been applied in complex industrial environments. However, current reviews of anomaly detection are often technology-oriented, and a systematic classification oriented to practical industrial scenarios is still needed. Given these considerations, we summarize and categorize the latest anomaly detection technologies from the perspective of specific industrial application scenarios, covering 2D image anomaly detection, 3D object anomaly detection, and datasets. This application-oriented classification can more effectively meet the practical needs of anomaly detection tasks in industrial production. Furthermore, we contribute a comprehensive analysis of the current state and challenges of industrial anomaly detection, offer insights into customizing deep learning for real-world industrial applications, and present an outlook on future research directions.
ExQUAL: an explainable quantum machine learning classifier
Karuna Kadian, Sunita Garhwal, Ajay Kumar
Applied Intelligence 55(13). DOI: 10.1007/s10489-025-06732-7. Published 2025-08-28.

Abstract: Quantum machine learning (QML) holds the potential to solve complex tasks that classical machine learning cannot handle. QML is a promising, rapidly evolving field, which makes a deeper understanding of the intricate black-box nature of QML models all the more necessary. To address this challenge, the incorporation of explainable artificial intelligence becomes imperative. This paper introduces a novel approach, the Explainable Quantum Classifier (ExQUAL), which integrates the Local Interpretable Model-agnostic Explanations (LIME) framework and SHapley Additive exPlanations (SHAP) with the Pegasos Quantum Support Vector Machine (QSVM) model for classification tasks. ExQUAL provides a methodology for applying these frameworks to both binary and multi-class classification tasks and yields both local and global explanations. This approach seeks to enhance transparency and interpretability while advancing the applicability and trustworthiness of quantum machine learning methodologies.
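As a rough sketch of the kind of post-hoc explanation ExQUAL attaches to a classifier, the example below runs LIME on a classical SVC standing in for the Pegasos QSVM; LIME (like SHAP) only needs a predict_proba-style function, so the same call pattern would apply to a quantum kernel classifier. This is not the ExQUAL pipeline itself.

```python
from sklearn.datasets import load_iris
from sklearn.svm import SVC
from lime.lime_tabular import LimeTabularExplainer

# Classical SVC stands in for the Pegasos QSVM; LIME only needs predict_proba.
data = load_iris()
clf = SVC(probability=True).fit(data.data, data.target)

explainer = LimeTabularExplainer(
    data.data,
    feature_names=data.feature_names,
    class_names=list(data.target_names),
    mode="classification",
)
exp = explainer.explain_instance(data.data[0], clf.predict_proba, num_features=4)
print(exp.as_list())   # per-feature contributions for this single prediction
```
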
Knee-cartilage segmentation from MR images using Multi-view Hypergraph Convolutional Neural Networks
Christos Chadoulos, John Theocharis, Andreas Symeonidis, Serafeim Moustakidis
Applied Intelligence 55(13). DOI: 10.1007/s10489-025-06808-4. Published 2025-08-28. Open access PDF: https://link.springer.com/content/pdf/10.1007/s10489-025-06808-4.pdf

Abstract: Leveraging the increased capacity of hypergraphs to model complex data structures, we propose the Multi-view Hyper-Graph Convolutional Network (MVHGCN) to yield automated knee-joint cartilage segmentations from MRIs. The main properties of our approach are as follows: 1) Node features are obtained from multi-view (MV) acquisitions, corresponding to different feature extractors or image modalities. 2) Node embeddings are generated using a distributive MV convolution scheme that combines the various view-specific convolutions; the results are aggregated via an attention-based fusion module that automatically learns the weights of the different views. 3) Our model integrates local- and global-level learning simultaneously: local hypergraph convolutions explore the relationships across the spatially aligned node libraries, while global hypergraph convolutions search for global affinities between nodes located at different positions within the image. 4) We propose two blending schemes to combine local and global convolutions, namely the cross-talk (CT) and the collaborative (COL) blending units. Using these units as building blocks, we construct the MVHGCN model, a deep network with enhanced feature representation and learning capabilities. The suggested segmentation method is evaluated on the publicly available Osteoarthritis Initiative (OAI) cohort. Specifically, we have designed a thorough experimental setup, including parameter sensitivity analysis and comparative results against a series of existing traditional methods, deep CNN models, and graph convolutional networks. The results show that MVHGCN outperforms the competing methods, achieving overall cartilage segmentation scores of DSC = 95.81% and DSC = 96.33% for the CT and COL blending, respectively.
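For readers unfamiliar with hypergraph convolutions, a single plain hypergraph convolution step, the operator such models typically build on, can be sketched as below. The toy sizes are arbitrary, and MVHGCN's multi-view attention fusion and local/global blending are not reproduced.

```python
import torch

def hypergraph_conv(X, H, Theta, edge_w=None):
    """One standard hypergraph convolution step:
       X' = relu(Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2} X Theta)
    X: (N, F) node features, H: (N, E) incidence matrix, Theta: (F, F_out)."""
    N, E = H.shape
    w = edge_w if edge_w is not None else torch.ones(E)
    Dv = torch.diag((H @ w).clamp(min=1e-6).pow(-0.5))       # node degrees
    De = torch.diag(H.sum(dim=0).clamp(min=1e-6).pow(-1.0))  # hyperedge degrees
    W = torch.diag(w)
    return torch.relu(Dv @ H @ W @ De @ H.t() @ Dv @ X @ Theta)

# Toy example: 5 nodes, 3 hyperedges, 8-dim features mapped to 4-dim embeddings.
H = (torch.rand(5, 3) > 0.5).float()
X = torch.randn(5, 8)
Theta = torch.randn(8, 4)
print(hypergraph_conv(X, H, Theta).shape)   # torch.Size([5, 4])
```
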
Correction to: Dynamic preventive maintenance strategy for a heterogeneous multi-unit redundant system: A deep reinforcement learning approach with weighted network estimator
Deming Xu, Yan Wang, Xiang Liu, Hao Ma, Zhicheng Ji
Applied Intelligence 55(13). DOI: 10.1007/s10489-025-06804-8. Published 2025-08-28.
Deep reinforcement learning with graph attention mechanism for vehicle routing problem with time windows
Fan Zhang, Huiling Hu, Yuqian Zhao
Applied Intelligence 55(13). DOI: 10.1007/s10489-025-06829-z. Published 2025-08-27.

Abstract: As the logistics industry expands, the complexity of vehicle routing problems, particularly those with time window constraints, increases with the growing demand for services. The challenge of the vehicle routing problem with time windows (VRPTW) lies in efficiently scheduling a fleet of vehicles to serve a set of customers within specified time frames. This study introduces an attention-based deep reinforcement learning approach to optimize vehicle routing and scheduling, aiming to meet customers' time window requirements while effectively reducing travel distances and costs, thereby enhancing the efficiency of logistics delivery. The method models the problem as a Markov decision process, defines actions, states, and rewards, and uses reinforcement learning during training to extract node features and generate preliminary solutions. By introducing an encoder-decoder structure and a graph attention network, the model can focus on key information and optimize strategy selection. A large neighborhood search algorithm is then used to iteratively improve the initial solution toward the optimal solution. The model is trained and tested on the Solomon dataset, and the experimental results show that it significantly outperforms other methods.
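The time-window bookkeeping that the MDP reward must capture can be illustrated with a small feasibility and cost check for a single route. The data layout and the waiting-when-early rule below are assumptions for illustration, not the paper's environment definition.

```python
import math

def evaluate_route(route, coords, windows, service_time=0.0):
    """Check time-window feasibility of one route (depot = node 0) and return
    (feasible, total_travel_distance). Waiting is allowed when arriving early."""
    t, dist, prev = 0.0, 0.0, 0
    for node in route + [0]:            # return to the depot at the end
        leg = math.dist(coords[prev], coords[node])
        dist += leg
        t += leg                        # travel time = Euclidean distance here
        ready, due = windows[node]
        if t > due:                     # arrived after the window closes
            return False, dist
        t = max(t, ready) + service_time
        prev = node
    return True, dist

coords = {0: (0, 0), 1: (2, 1), 2: (4, 3)}
windows = {0: (0, 100), 1: (0, 10), 2: (5, 12)}
print(evaluate_route([1, 2], coords, windows, service_time=1.0))
```
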