{"title":"Partial multi-label learning with local reconstruction and indirect guidance","authors":"Yanan Wu, Yu Chen, Hui Chen, Yu Wang, Xiaozhao Fang, Guoxu Zhou, Zhouqiang Qiu, Junqiu Fan","doi":"10.1007/s10489-026-07250-w","DOIUrl":"10.1007/s10489-026-07250-w","url":null,"abstract":"<div>\u0000 \u0000 <p>In partial multi-label learning (PML), instances are associated with candidate label sets containing both ground-truth labels and noisy labels. This introduces two key challenges: 1) weakened label-instance associations, which act as incorrect supervision and negatively impact classification performance; 2) although label correlations can effectively enhance classifier robustness, correlations extracted directly from noisy label information are inherently biased and imprecise. This paper proposes PML-LRIG, a novel method based on local reconstruction and indirect guidance to address these challenges. Our approach focuses on restoring local label-instance associations and extracting reliable label correlations to guide the classifier. We integrate instance clustering with category and geometric correlations to recover labels and generate label distributions. To handle complex distributions, we design a framework for high-order label correlation learning in PML. Additionally, we implement a dual-module classifier where one module learns sample-specific features while the other captures high-order label correlations, indirectly enhancing the classifier’s generalization capabilities. 
Extensive experiments show that PML-LRIG outperforms state-of-the-art methods.</p>\u0000 </div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"56 6","pages":""},"PeriodicalIF":3.5,"publicationDate":"2026-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147797069","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DyStaFusion: Dynamic State-Space fusion network for multimodal tourist emotion dynamics prediction in social media","authors":"Enming Zhang, Lin Zhou","doi":"10.1007/s10489-025-06990-5","DOIUrl":"10.1007/s10489-025-06990-5","url":null,"abstract":"<div>\u0000 \u0000 <p>With the rapid spread of social networks, understanding the emotional dynamics of tourists has become a key research topic in affective computing and social network analysis. However, most current approaches rely largely on unimodal text data and neglect rich multimodal information, which limits their forecasting accuracy and temporal modeling capability. To address these limitations, this paper proposes DyStaFusion (Dynamic State-Space and Transformer Fusion for Multimodal Emotion Tracking), a multimodal model for dynamic emotion prediction that comprises a multimodal fusion module, an enhanced Transformer with improved long-term dependency modeling, and a Mamba state-space module. By combining textual, acoustic, and visual features, DyStaFusion effectively detects both short-term fluctuations and long-term trends in emotional states. The enhanced Transformer module models temporal dependencies, while the Mamba state-space module strengthens dynamic sequence representation. Experiments conducted on the MELD and Sentiment140 datasets show that DyStaFusion achieves consistent improvements over several state-of-the-art baselines, with overall gains of 3-5% in emotion classification accuracy, emotion intensity prediction, and dynamic emotion modeling. Ablation studies further confirm the complementary contribution of each module and highlight the importance of multimodal fusion and the overall state-space-Transformer design. These results show that DyStaFusion offers a reliable and generalizable approach for multimodal dynamic emotion prediction, with significant potential for practical applications and future research.</p>\u0000 </div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"56 6","pages":""},"PeriodicalIF":3.5,"publicationDate":"2026-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147797070","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Global-relationship-aware multi-view clustering via random walk with restart","authors":"Yaoying Wang, Kaiwu Zhang, Shiqiang Du, Wenxu Zhang, Yuqing Shi","doi":"10.1007/s10489-026-07253-7","DOIUrl":"10.1007/s10489-026-07253-7","url":null,"abstract":"<div>\u0000 \u0000 <p>Although contrastive learning-based multi-view clustering has advanced recently, it still struggles to capture high-order structural relations and to balance global representations with local neighborhood preservation. To address these problems, we propose a novel multi-view clustering method: Global-Relationship-Aware Multi-View Clustering via Random Walk with Restart (GMCR). Specifically, Random Walk with Restart (RWR) is applied at both the feature and cluster assignment levels to capture global relationships from multiple perspectives while preserving local neighborhood structures. Furthermore, we introduce a structure-guided dual contrastive learning mechanism. It aligns features and cluster assignments using global relational information, thereby enhancing the similarity of structurally correlated samples. In addition, the proposed method serves as a flexible representation learning module that can be integrated into incomplete multi-view clustering tasks. Extensive experiments verify the effectiveness of the presented method on both incomplete and complete datasets.</p>\u0000 </div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"56 6","pages":""},"PeriodicalIF":3.5,"publicationDate":"2026-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147796512","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Visual pattern mining with similarity metrics for model-free trading in the Korean futures market","authors":"Juhyeon Jang, Jaeyun Kim, David Enke","doi":"10.1007/s10489-026-07254-6","DOIUrl":"10.1007/s10489-026-07254-6","url":null,"abstract":"<div>\u0000 \u0000 <p>The fractal market hypothesis highlights multi-scale dynamics in financial time series and provides a theoretical foundation for pattern-based analysis. This study proposes a model-free visual pattern mining framework that transforms high-frequency market data into image representations to support intelligent decision-making. By converting 1-minute KOSPI200 futures data into candlestick chart and Bollinger band images, the method effectively captures structural patterns and volatility dynamics. The framework applies similarity metrics and Intersection over Union (IoU)-based visual comparison to identify historically similar patterns and generate intelligent trading signals without model training or complex parameter tuning. Experimental results demonstrate that combining visual features of candlestick and Bollinger bands achieves a cumulative return of 11.676% and a maximum drawdown of 4.283%, with a payoff ratio of 1.322 and a profit factor of 1.135. 
These findings suggest that visual pattern mining with similarity metrics offers a practical, interpretable, and robust approach to intelligent decision-making in high-frequency market environments.</p>\u0000 </div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"56 6","pages":""},"PeriodicalIF":3.5,"publicationDate":"2026-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147796513","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DGKD: Depth-Guided knowledge distillation network for monocular 3D object detection","authors":"Xinyu Zhang, Qiang Ling","doi":"10.1007/s10489-026-07242-w","DOIUrl":"10.1007/s10489-026-07242-w","url":null,"abstract":"<div>\u0000 \u0000 <p>Monocular 3D object detection is a challenging task in autonomous driving. Its performance is limited by the inherently ill-posed problem of depth estimation. To alleviate this problem, some methods exploit the depth information of LiDAR through knowledge distillation. However, existing methods with knowledge distillation usually apply strict alignment for feature distillation, which can lead to suboptimal performance because attributes of different modalities may not be well aligned. To resolve this feature misalignment issue, we propose DGKD, a novel monocular 3D object detection network with Depth-Guided Knowledge Distillation to fully leverage the depth information from LiDAR. Specifically, for feature distillation, we propose global relation distillation to capture structural information of the feature maps for better knowledge transfer from the teacher network to the student network. For response distillation, we present Wasserstein distance-based logit distillation to perform cross-category comparison, which is a promising alternative to the commonly used Kullback-Leibler divergence (KL-Div) method. 
Extensive experiments on the KITTI dataset demonstrate that our DGKD achieves competitive performance.</p>\u0000 </div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"56 6","pages":""},"PeriodicalIF":3.5,"publicationDate":"2026-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147738881","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Leverage visual attention and multimodal scene graph for visual navigation","authors":"Zengkai Wang, Cong Li, Kang Zhou","doi":"10.1007/s10489-026-07222-0","DOIUrl":"10.1007/s10489-026-07222-0","url":null,"abstract":"<div><p>The visual navigation task seeks an optimal mapping from visual observations to agent actions for a given target object. Improving perception and generalization in unseen environments is a challenging problem. This paper presents an end-to-end navigation framework that leverages multimodal scene graphs (MSG) to enhance the visual representation and policy for robust visual navigation. The MSG integrates scene graph information, previous navigation actions, training trajectories, and target objects, enabling agents to navigate efficiently towards target objects in unseen environments. Our approach distills valuable spatial and semantic clues from the MSG, improving navigation performance by <span>(10.1%)</span> in SPL and <span>(25.4%)</span> in success rate compared to baselines. By utilizing graph attention networks, we demonstrate the effectiveness of the MSG in generalizing across scenes, resulting in faster and more accurate target localization.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"56 6","pages":""},"PeriodicalIF":3.5,"publicationDate":"2026-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147738362","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rethinking urban region representation: adaptive soft-thresholding and attentive graph learning for robust data fusion","authors":"Weiliang Chen, Qianqian Ren, Jinbao Li","doi":"10.1007/s10489-026-07238-6","DOIUrl":"10.1007/s10489-026-07238-6","url":null,"abstract":"<div>\u0000 \u0000 <p>Representing urban regions accurately and comprehensively is essential for various urban planning and analysis tasks. Recently, with urban expansion, modeling long-range spatial dependencies from multiple data sources has played an important role in urban region representation. In this paper, we propose the Attentive Graph Enhanced Region Representation Learning (ATGRL) model, which aims to capture comprehensive dependencies from multiple graphs and learn rich semantic representations of urban regions. Specifically, we propose a graph-enhanced learning module that constructs regional graphs by incorporating mobility flow patterns, point-of-interest (POI) functions, and check-in semantics with noise filtering. Then, we present a multi-graph aggregation module to capture both local and global spatial dependencies between regions by integrating information from multiple graphs. In addition, we design a dual-stage fusion module to facilitate information sharing between different views and efficiently fuse multi-view representations for urban region embedding using an improved linear attention mechanism. 
Finally, extensive experiments on real-world datasets for three downstream tasks demonstrate the superior performance of our model compared to state-of-the-art methods.</p>\u0000 </div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"56 6","pages":""},"PeriodicalIF":3.5,"publicationDate":"2026-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147738737","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Kernel Transposed Projection Envelope Linear Discriminant Analysis Mode","authors":"Yongming Li, Wenqiang Zhao, Fan Li, Jie Ma, Xiaoheng Zhang, Pin Wang, Yinghua Shen","doi":"10.1007/s10489-026-07225-x","DOIUrl":"10.1007/s10489-026-07225-x","url":null,"abstract":"<div>\u0000 \u0000 <p>As a classical dimensionality reduction method, linear discriminant analysis (LDA) is still a research hotspot. Although current research on LDA improvement has achieved significant results, existing LDA algorithms still have limitations. Specifically, they focus on dimensionality reduction of the original samples without considering the correlation information among similar samples. To address this issue, this paper proposes a novel LDA mode, the kernel transposed projection envelope LDA mode (KTPE-LDA-M). First, a transposed projection envelope transformation (TPET) algorithm is designed to extract correlation information among similar samples and embed this information into the newly generated samples, termed envelope samples. Then, to improve the efficiency of extracting nonlinear correlation information, the TPET is kernelized, resulting in the kernel transposed projection envelope transformation (KTPET) algorithm. Finally, the envelope samples generated by the KTPET are input into LDA, enabling dimensionality reduction based on the correlation information among similar samples. The experimental results demonstrate that the proposed mode can noticeably improve LDA’s performance, with an average classification accuracy increase of 4.08%. 
These results indicate that the proposed mode is effective, and it is both necessary and advantageous to consider the correlation information among similar samples during LDA modeling.</p>\u0000 </div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"56 6","pages":""},"PeriodicalIF":3.5,"publicationDate":"2026-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147738738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Group consensus decision method for probabilistic language complex project scheme based on cloud model and Q-learning algorithm","authors":"Jikai Wang, Zhiran Qiu, Yajie Dou, Weijun Ouyang, Kewei Yang, Yuejin Tan","doi":"10.1007/s10489-026-07117-0","DOIUrl":"10.1007/s10489-026-07117-0","url":null,"abstract":"<div>\u0000 \u0000 <p>To enhance the scientific rigor of complex project justification and effectively coordinate project resources and capabilities, this paper proposes a group consensus decision-making method for probabilistic linguistic complex project schemes based on a cloud model and the Q-learning algorithm. First, a probabilistic linguistic term set (PLTS) is used to portray the preference information of decision makers (DMs). To address the excessive subjectivity in selecting the best and worst attributes in the best-worst method (BWM), the Bayesian-BWM method is extended to PLTS to determine the attribute weights. At the same time, the consensus and hesitation degrees of the decision-making subjects are integrated to determine the DM weights. The Q-learning algorithm is introduced to optimize the adjustment strategy of the DMs' ratings, and the evaluation information of the DMs is adjusted according to the reward function so that the evaluations of the DMs reach a consensus. Then, based on the PLTS and cloud model characteristic parameters, the ambiguity and uncertainty of the decision-making information are quantified, and a cloud model of the complex project scheme is constructed. Scheme ranking is performed by a cloud-similarity-based technique for order preference by similarity to ideal solution (TOPSIS). 
Finally, the effectiveness and feasibility of the proposed method are verified through a case study of space engine development project selection, providing a reference for complex project justification work.</p>\u0000 </div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"56 6","pages":""},"PeriodicalIF":3.5,"publicationDate":"2026-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147738492","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluating the adversarial robustness of detection transformers","authors":"Amirhossein Nazeri, Chunheng Zhao, Pierluigi Pisu","doi":"10.1007/s10489-026-07220-2","DOIUrl":"10.1007/s10489-026-07220-2","url":null,"abstract":"<div>\u0000 \u0000 <p>Robust object detection is critical for autonomous driving and mobile robotics, where accurate recognition of vehicles, pedestrians, and obstacles is essential for ensuring safety. Despite the advancements in detection transformers (DETRs), their robustness against adversarial perturbations remains underexplored. This paper presents a comprehensive evaluation of the DETR model and its variants under both white-box and black-box adversarial attack settings, using the MS-COCO and KITTI datasets to cover general-purpose and autonomous driving scenarios. We adapt a suite of popular white-box attacks from image classification, including FGSM, PGD, C&W, and AutoPGD, to object detection and assess DETR’s vulnerability under these threats. In addition, we introduce a Modified C&W attack tailored to the DETR architecture that exploits intermediate decoder losses to induce misclassification with minimal perturbations. Our results demonstrate that DETR models are significantly susceptible to the entire range of attacks, and that intra-network transferability between DETR variants is high, whereas cross-network transfer to Faster R-CNN is more limited. We further validate the resilience of the Modified C&W attack against several input purification techniques and visualize self-attention maps to illustrate how adversarial attacks affect the models’ internal representations. These findings reveal critical vulnerabilities in detection transformers under standard and adaptive attacks, underscoring the need for further research to improve the robustness of transformer-based object detectors in safety-critical applications. 
Code is available at https://github.com/amirhnazerii/Transformer_ObjDet_Robustness/</p>\u0000 </div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"56 6","pages":""},"PeriodicalIF":3.5,"publicationDate":"2026-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10489-026-07220-2.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147738549","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}