Expert Systems with Applications: Latest Articles

Progressive alternating attribute-Structure optimization for multiplex heterogeneous graphs
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications | Pub Date: 2026-05-25 | Epub Date: 2026-02-03 | DOI: 10.1016/j.eswa.2026.131495
Haochang Hao, Jun Huang, Shuzhen Rao
Abstract: Multiplex heterogeneous graphs, characterized by multiple types of nodes and relations, often exhibit incomplete structures and missing attributes in real-world scenarios, posing significant challenges for effective representation learning. Although existing studies have explored structure refinement and attribute completion independently, few have examined their potential complementarity. In this work, we propose an alternating optimization framework for node representation learning in multiplex heterogeneous graphs, with three key innovations: (i) relation-aware dynamic structure learning guided by attribute similarity, (ii) multi-hop completion of missing attributes on the refined graphs, and (iii) a progressive alternating optimization strategy that couples the two modules so they bootstrap and denoise each other over rounds. Extensive experiments on multiple real-world heterogeneous graph datasets demonstrate that our framework achieves superior performance over state-of-the-art baselines, validating the effectiveness and robustness of progressive structure-attribute co-optimization in heterogeneous graph representation learning.
Volume 312, Article 131495
Citations: 0
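A minimal sketch of the alternating idea in the abstract above, not the authors' implementation. It assumes a single relation type, cosine similarity as the attribute similarity, neighbour averaging as the completion step, and a top-k rule for adding edges. All function names and the toy graph are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def complete_attributes(adj, feats, missing):
    """Set each originally-missing node to the mean of its neighbours' features."""
    filled = dict(feats)
    for node in missing:
        known = [feats[m] for m in adj[node] if feats[m] is not None]
        if known:
            filled[node] = [sum(col) / len(known) for col in zip(*known)]
    return filled

def refine_structure(adj, feats, k=1):
    """Link every attributed node to its k most similar attributed nodes."""
    refined = {n: set(nbrs) for n, nbrs in adj.items()}
    nodes = [n for n, f in feats.items() if f is not None]
    for n in nodes:
        ranked = sorted((m for m in nodes if m != n),
                        key=lambda m: cosine(feats[n], feats[m]), reverse=True)
        for m in ranked[:k]:
            refined[n].add(m)
            refined[m].add(n)
    return refined

def alternate(adj, feats, rounds=3, k=1):
    """Alternate attribute completion and structure refinement for a few rounds."""
    missing = {n for n, f in feats.items() if f is None}
    for _ in range(rounds):
        feats = complete_attributes(adj, feats, missing)
        adj = refine_structure(adj, feats, k)
    return adj, feats

# Toy graph (hypothetical): node "d" has no attributes, node "e" has no edges.
adj = {"a": {"b"}, "b": {"a"}, "c": {"d"}, "d": {"c"}, "e": set()}
feats = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": None, "e": [0.1, 0.9]}
new_adj, new_feats = alternate(adj, feats)
```

In this toy run the isolated node "e" gains an edge to the cluster it resembles, and "d" receives attributes from its neighbourhood; each step uses the other step's latest output, which is the coupling the abstract describes.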
A design method for electric vehicle front face styling: based on engineering feasibility optimization of GenAI-generated images
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications | Pub Date: 2026-05-25 | Epub Date: 2026-02-05 | DOI: 10.1016/j.eswa.2026.131522
Huining Pei, Mingzhe Yang, Zhonghang Bai, Man Ding, Wen Li, Yuxin Cao, Yanjun Zhang
Abstract: To address the low engineering feasibility of electric vehicle (EV) front face styling images generated by generative artificial intelligence (GenAI) tools such as Midjourney, this study proposes a design method that integrates curve optimization with a collaborative evaluation system combining simulated and human experts. The method aims to enhance the manufacturability of AI-generated design schemes while efficiently transferring the styling genes of conventional fuel vehicles to EV front face styling design. First, the large language model ChatGPT-5.0 is employed to construct a styling semantic database based on six categories of conventional fuel vehicle front face datasets. Second, Midjourney is used to generate an initial EV front face styling dataset, and a production-ready styling dataset is subsequently constructed to provide engineering feasibility references for EV front face styling design. Third, “AI-generated curves” and “engineering reference curves” are fused at different ratios, and an EV front face styling scheme is generated using a curve blending algorithm optimized for the figure–ground relationship. Finally, an LLM-based collaborative evaluation system integrating simulated experts (via ChatGPT-5.0) and human experts is established to conduct quantitative evaluation and optimization of the schemes in terms of engineering feasibility and styling design metrics. A case study demonstrates that the optimized scheme’s engineering feasibility score improves from 2.3 to 7.1 (out of 10) while maintaining a high level of design creativity (7.5). The LLM-based collaborative evaluation system achieved high inter-rater consistency in both engineering feasibility evaluation (ICC ≥ 0.9) and design creativity evaluation (ICC ≥ 0.85), effectively balancing engineering feasibility and design creativity in GenAI-generated EV front face styling schemes. By constructing an AI-led, human-supervised hybrid design workflow, this method significantly enhances the engineering feasibility and design efficiency of generative AI in product styling design, providing a theoretical reference for achieving a balance between design innovation and engineering feasibility.
Volume 312, Article 131522
Citations: 0
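The "fused at different ratios" step in the abstract above can be illustrated with plain linear interpolation between two curves. This is only a sketch under assumptions: both curves are sampled at the same number of corresponding points, and the paper's figure–ground optimization is not modelled. The function name `blend_curves` is hypothetical.

```python
def blend_curves(ai_curve, ref_curve, ratio):
    """Blend two polylines point by point.

    ratio = 0.0 returns the AI-generated curve, ratio = 1.0 returns the
    engineering reference curve, and values in between mix the two.
    """
    if len(ai_curve) != len(ref_curve):
        raise ValueError("curves must have the same number of sample points")
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must lie in [0, 1]")
    return [
        ((1 - ratio) * ax + ratio * rx, (1 - ratio) * ay + ratio * ry)
        for (ax, ay), (rx, ry) in zip(ai_curve, ref_curve)
    ]

# Hypothetical two-point curves, blended half-and-half.
blended = blend_curves([(0, 0), (2, 4)], [(2, 2), (4, 0)], 0.5)
```

Sweeping `ratio` over several values would give a family of candidate curves that could then be scored for feasibility and creativity.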
CNN-DET: A hybrid deep learning architecture for emotion recognition
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications | Pub Date: 2026-05-25 | Epub Date: 2026-02-06 | DOI: 10.1016/j.eswa.2026.131377
Berrouachedi Abdelkader, Jaziri Rakia, Bernard Gilles
Abstract: Emotion recognition plays a crucial role in various biometric applications, including human-computer interaction, healthcare, and security. This paper presents CNN-DET, a novel hybrid approach that integrates Convolutional Neural Networks (CNNs) with Deep Extra-Trees (DETs) for robust facial emotion recognition. The proposed methodology leverages hierarchical feature extraction through pre-trained CNN models combined with ensemble-based classification using DETs to accurately detect and classify emotions from facial expressions. Comprehensive evaluation on benchmark datasets demonstrates the superior performance of our approach. On the FER-2013 dataset, CNN-DET achieves 98.16% accuracy in 10-fold cross-validation and 85.32% accuracy on the standard test set, with precision of 85.7%, recall of 85.3%, and F1-score of 85.4%. The model maintains strong performance across diverse conditions, achieving 91.2% accuracy on AffectNet and 89.7% accuracy on RAF-DB, confirming its generalization capability. Extensive experiments reveal that our method reduces misclassification between visually similar emotions by 23.4% compared to traditional CNN approaches and shows 15.8% improvement in robustness under varying lighting conditions. The proposed approach not only accurately recognizes emotions but also demonstrates consistent performance across different demographic groups, with less than 3.2% performance variance across age and ethnicity subgroups. These findings highlight the significant potential of deep learning techniques for emotion recognition in biometric applications, providing valuable insights for developing more intelligent and interactive systems. Future research will focus on multimodal data fusion and temporal modeling to further enhance recognition accuracy and real-time performance.
Volume 312, Article 131377
Citations: 0
Hybrid intelligence–driven global path planning for ships in complex maritime environments
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications | Pub Date: 2026-05-25 | Epub Date: 2026-02-05 | DOI: 10.1016/j.eswa.2026.131473
Jiao Liu, Kaige Zhu, Yuanqiang Zhang, Miao Gao, Pengjun Zheng
Abstract: Global ship path planning in complex maritime environments is challenged by dynamic disturbances, vessel-specific constraints, and long-range trajectory dependencies. This study develops an integrated hybrid planning framework that combines deep generative modeling with rule-based optimization. Automatic identification system (AIS) trajectory time series are first transformed into Gramian Angular Field images to enhance spatio-temporal feature extraction. Vessel type and length are encoded as one-hot vectors and introduced as conditional variables, enabling personalized path generation. These inputs are processed by a Multi-Head Attention–based Conditional Wasserstein Generative Adversarial Network with Gradient Penalty (MHA-cWGAN-GP), in which multi-head attention is used to model long-range dependencies, and conditional Generative Adversarial Network (cGAN) training together with a Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP) objective is adopted to improve conditioning behavior and training robustness. The model generates initial navigation paths, which are further refined using an A* search procedure that incorporates wind and current disturbances, as well as constraints such as static obstacles, water depth, and Traffic Separation Scheme (TSS) regulations. The final path is smoothed to ensure feasibility and compliance. In case studies for the Ningbo–Zhoushan Port and Yangtze River Estuary, the hybrid planner reduces the number of search nodes from 45–57 to 29–35 while simultaneously enforcing TSS, water-depth, wind, and current constraints, with only about a 3–4% increase in path length relative to classical A* and Dijkstra algorithms. The results indicate that the proposed framework effectively integrates learning and optimization, offering a practical and intelligent solution for real-world maritime path planning.
Volume 312, Article 131473
Citations: 0
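The Gramian Angular Field step mentioned in the abstract above can be sketched as follows. The abstract does not say which variant is used; this sketch assumes the summation form, where a series is rescaled to [-1, 1], each value is mapped to an angle with arccos, and the image entry (i, j) is cos(phi_i + phi_j). The function name is hypothetical.

```python
import math

def gramian_angular_field(series):
    """Return the summation-type Gramian Angular Field of a 1-D series."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    # Rescale to [-1, 1] so that arccos is defined for every sample.
    scaled = [2 * (x - lo) / (hi - lo) - 1 for x in series]
    # Clamp against floating-point drift before taking the angle.
    phi = [math.acos(max(-1.0, min(1.0, x))) for x in scaled]
    return [[math.cos(a + b) for b in phi] for a in phi]

# A hypothetical three-sample series, e.g. one coordinate of a short track.
gaf = gramian_angular_field([0.0, 1.0, 2.0])
```

The result is a symmetric n-by-n matrix, which is why a length-n trajectory channel can be treated as an image by a convolutional or attention-based model.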
Compressive sensing image restoration with deep prior guided group sparse representation
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications | Pub Date: 2026-05-25 | Epub Date: 2026-02-04 | DOI: 10.1016/j.eswa.2026.131465
Zhulin Ji, Shenghai Liao, Ruyi Han, Shujun Fu
Abstract: Compressive sensing (CS) enables accurate reconstruction of images from significantly fewer measurements than required by the Nyquist-Shannon sampling theorem, relying critically on effective image priors to regularize the ill-posed inverse problem. Conventional patch-based sparse representation methods use fixed dictionaries learned offline with the K-SVD algorithm. However, patch-based sparse representation ignores the relationships among patches, and the learned dictionaries cannot capture global image statistics, which leads to suboptimal reconstruction performance. In this paper, we exploit group sparse representation (GSR) for image compressive sensing reconstruction. By clustering non-local image patches into groups and treating each group as a unit, GSR finds sparse codes for all patches within a group simultaneously, leading to improved reconstruction fidelity and edge preservation. However, GSR relies solely on the undersampled image itself to construct a dictionary that is not learnable, and it becomes increasingly unreliable at low compressive sensing rates, where substantial loss of local image information occurs. To address this limitation, we propose a Deep Prior guided Group Sparse Representation (DPGSR) model for compressive image restoration, in which a deep denoiser captures both local and global image statistics by training on external data. The proposed DPGSR achieves improved global consistency, effectively reducing block artifacts while preserving sharper local details. Extensive experiments on image compressive sensing reconstruction and fast MRI demonstrate that the proposed method outperforms state-of-the-art approaches, particularly in preserving fine details and reducing over-smoothing artifacts.
Volume 312, Article 131465
Citations: 0
Surv-RWKV: Cross-modal receptance weighted key-value interaction with optimal transport feature alignment for survival analysis
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications | Pub Date: 2026-05-25 | Epub Date: 2026-02-04 | DOI: 10.1016/j.eswa.2026.131506
Xiyang Kuang, Bin Yang, Bingo Wing-Kuen Ling, Kok Lay Teo, Xiaozhi Zhang
Abstract: Multimodal learning has played a pivotal role in survival prediction, particularly in integrating pathological images and genomic data to improve predictive performance. Pathological images provide macroscopic histological information about tumor morphology, while genomic data reveal molecular-level genetic characteristics. The integration of these two modalities enables a comprehensive characterization of tumor heterogeneity and disease progression mechanisms. Despite recent advances in multimodal integration that have significantly enhanced prognostic accuracy, challenges remain in effectively analyzing high-dimensional and heterogeneous whole-slide images (WSIs) and omics data. Current Transformer-based sequence modeling approaches suffer from limited computational efficiency when processing long feature sequences and capturing complex cross-modal interactions. To address these challenges, we propose a cross-modal receptance weighted key-value (RWKV)-based framework, termed Surv-RWKV, for survival prediction. This framework integrates RWKV-based sequence modeling with advanced multimodal fusion strategies to enhance both predictive accuracy and model efficiency. Specifically, Surv-RWKV employs parallel RWKV-based encoders to model long-range dependencies in WSI tissue cluster patterns and genomic pathway activation profiles, achieving improved prognostic performance with optimized computational efficiency. Subsequently, a transport-based optimal cross-modal alignment module is introduced to establish semantic correspondences between histopathological and genomic feature spaces. Furthermore, a progressive feature fusion strategy is implemented to enable effective cross-modal interaction. An RWKV-based shallow fusion module is first developed to explore cross-modal dependencies through spatial-channel hybrid operations, thereby enhancing the representational quality of fused features. A cross-RWKV deep interaction module is then designed to further strengthen information synthesis via iterative cross-attention mechanisms, while simultaneously reinforcing intra-modal representation learning and cross-modal knowledge transfer. Surv-RWKV is expected to effectively capture such cross-modal correlations, thereby improving the accuracy and interpretability of survival predictions. Extensive validation across five TCGA cancer cohorts demonstrates that Surv-RWKV achieves state-of-the-art predictive performance with superior computational efficiency.
Volume 312, Article 131506
Citations: 0
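Optimal-transport alignment between two feature sets, as mentioned in the abstract above, is often computed with entropy-regularized Sinkhorn iterations. The abstract does not state which solver the authors use, so the following is a generic sketch under that assumption. The function name, the toy cost matrix, and the regularization strength `eps` are all hypothetical.

```python
import math

def sinkhorn(cost, a, b, eps=0.1, iters=200):
    """Entropy-regularized optimal transport plan between histograms a and b.

    cost[i][j] is the cost of matching source item i to target item j.
    """
    n, m = len(a), len(b)
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]

# Two hypothetical image tokens and two hypothetical genomic tokens:
# matching item i to item i is cheap, cross-matching is expensive.
plan = sinkhorn([[0.0, 1.0], [1.0, 0.0]], [0.5, 0.5], [0.5, 0.5])
```

The returned plan puts almost all mass on the cheap pairings, which is the sense in which a transport plan establishes soft correspondences between two feature spaces.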
HHGDroid: Hybrid heterogeneous graph-based android malware detection via multi-evidence similarity fusion
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications | Pub Date: 2026-05-25 | Epub Date: 2026-02-04 | DOI: 10.1016/j.eswa.2026.131528
Junwei Tang, Xiaomei Tian, Tao Peng, Jianfeng Lu, Haozhao Wang, Ruixuan Li
Abstract: Currently, static analysis is insufficient to deal with Android malware that employs advanced evasion techniques such as code obfuscation and dynamic loading. Therefore, hybrid analysis that combines static structure and dynamic behavior has become the mainstream trend. However, existing hybrid analysis methods often adopt simple feature concatenation or shallow fusion mechanisms, which cannot effectively integrate heterogeneous static and dynamic features or capture the complex correlations between structure and behavior. To address this, we propose a hybrid heterogeneous graph-based Android malware detection method via multi-evidence similarity fusion, named HHGDroid. The function call graph generated by static analysis and the event graph obtained through dynamic analysis are connected through a comprehensive similarity over multiple kinds of evidence, such as semantics, permissions, and time frequency, ultimately forming a hybrid heterogeneous graph with multiple heterogeneous node and edge types. Our constructed hybrid heterogeneous graph is the first one that simultaneously possesses static and dynamic features. Finally, we improve the Reliability-Calibrated Heterogeneous Graph Transformer (RCHGT) to learn the multiple relationships in the hybrid heterogeneous graph; it can automatically distinguish reliable from unreliable edges during the information propagation stage. We conduct experiments on real Android malware applications and achieve an F1-score of 97.87%, outperforming state-of-the-art methods. Additionally, we verify our method on an unknown malware dataset and obtain an F1-score of 81.52%, which is superior to existing methods. HHGDroid is a novel and effective method for detecting Android malware.
Volume 312, Article 131528
Citations: 0
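The abstract above connects static function nodes to dynamic event nodes through a combined similarity over several kinds of evidence. A minimal sketch of one way to do that is a weighted sum followed by a threshold. The weights, the threshold, the use of Jaccard overlap for permissions, and all names below are assumptions for illustration, not values from the paper.

```python
def jaccard(a, b):
    """Jaccard overlap of two permission sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def fused_similarity(semantic, perms_static, perms_dynamic, time_freq,
                     weights=(0.5, 0.3, 0.2)):
    """Weighted sum of semantic, permission, and time-frequency similarity."""
    w_sem, w_perm, w_tf = weights
    return (w_sem * semantic
            + w_perm * jaccard(perms_static, perms_dynamic)
            + w_tf * time_freq)

def link_nodes(candidates, threshold=0.6):
    """Keep (function, event) pairs whose fused similarity reaches the threshold."""
    return [(func, event) for func, event, score in candidates if score >= threshold]

# Hypothetical function and event names with made-up evidence scores.
score = fused_similarity(0.8, {"SEND_SMS", "INTERNET"}, {"INTERNET"}, 0.5)
edges = link_nodes([("sendText", "sms_sent", score), ("onCreate", "sms_sent", 0.2)])
```

Only the first pair passes the threshold here, so only that cross-graph edge would be added to the hybrid graph.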
Federated self-Expanding neural network learning framework for heterogeneous devices
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications | Pub Date: 2026-05-15 | Epub Date: 2026-01-21 | DOI: 10.1016/j.eswa.2026.131199
Rong Xie, Zhong Chen, Weiguo Cao, Haosen Wang
Abstract: Federated learning enables collaborative training without sharing raw data, while addressing growing privacy concerns. Real deployments face wide device heterogeneity that undermines both efficiency and accuracy in multi-sensor information fusion. We present FSENNL, a federated framework with a self-expanding neural network that adapts model capacity to each device. It adjusts capacity dynamically while leaving communication unchanged. A natural extension score combines Fisher information with device profiles to decide when and where to expand. An adaptive regularization term stabilizes newly added units and prevents over-extension. To align structurally diverse models during aggregation, an adaptive pruning compensation step uses Optimal Brain Surgeon with lightweight compensation data to recover accuracy after alignment. Knowledge distillation with an asynchronous fusion protocol mitigates straggler effects from uneven training speeds. Decoupling update frequency through teacher and student roles supports timely aggregation and cross-device knowledge transfer while preserving convergence. Experiments across heterogeneous settings show consistent accuracy with improved resource use, and demonstrate that the method scales to large federations. FSENNL provides a practical solution for multi-sensor information fusion in federated systems, delivering scalable and efficient models under diverse computational constraints.
Volume 311, Article 131199
Citations: 0
Dual-space intervention for mitigating bias in robust visual question answering
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications | Pub Date: 2026-05-15 | Epub Date: 2026-01-27 | DOI: 10.1016/j.eswa.2026.131346
Runmin Wang, Xingdong Song, Zukun Wan, Han Xu, Congzhen Yu, Tianming Ma, Yajun Ding, Shengyou Qian
Abstract: Visual Question Answering (VQA) evaluates the visual-textual reasoning capabilities of intelligent agents. However, existing methods are often susceptible to various biases. In particular, language bias leads models to rely on spurious question-answer correlations as shortcut solutions, while distribution bias caused by dataset imbalance encourages models to overfit head classes and overlook tail classes. To address these long-standing challenges, we propose a Dual-Space Intervention (DSI) approach that tackles these two biases from a unified yet complementary perspective. Our work includes two key innovations: (1) In the input space, we adopt an adaptive question shuffling strategy that alleviates language bias by adjusting perturbation strength according to question bias, so that models develop a deeper understanding of the question context rather than relying on spurious word-answer correlations; (2) In the output space, we propose a novel label rebalancing mechanism that moderates head-class dominance based on long-tailed statistics, improving robustness to distribution bias. This mechanism reduces the disproportionately high variance in head logits relative to tail logits, improving tail-class recognition accuracy. Extensive experiments on four benchmarks (VQA-CP v1, VQA-CP v2, VQA-CE, and SLAKE-CP) demonstrate our method’s superiority, with state-of-the-art performance of 63.14% on VQA-CP v1 and 37.61% on SLAKE-CP. The code will be released at https://github.com/songxdr3/DSI.
Volume 311, Article 131346
Citations: 0
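One common way to moderate head-class dominance from long-tailed statistics is post-hoc logit adjustment, which subtracts a scaled log class prior from each logit. This is a swapped-in, generic technique used here only to illustrate the output-space idea in the abstract above; it is not claimed to be the DSI label rebalancing mechanism. The function name, the temperature `tau`, and the toy counts are hypothetical.

```python
import math

def rebalance_logits(logits, class_counts, tau=1.0):
    """Subtract tau * log(prior) from each logit.

    Frequent (head) answers have a prior close to 1, so they are penalized
    little; rare (tail) answers have a small prior and receive a large boost.
    All counts are assumed to be positive.
    """
    total = sum(class_counts)
    return [z - tau * math.log(c / total) for z, c in zip(logits, class_counts)]

# Hypothetical two-answer case: the head answer appears 900 times, the tail 100.
raw = [2.0, 1.5]
adjusted = rebalance_logits(raw, [900, 100])
```

With these toy numbers the raw logits favour the head answer, while the adjusted logits favour the tail answer, showing how class frequency alone can flip a close decision.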
DCCL: Question-guided dual-channel contrastive learning framework for emotion-cause pair extraction
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications | Pub Date: 2026-05-15 | Epub Date: 2026-01-29 | DOI: 10.1016/j.eswa.2026.131357
Hongyang Wang, Yajun Du, Jia Liu, Xianyong Li, Xiaoliang Chen, Yanli Lee, Qing Qi, Wanjie Zhang
Abstract: The emotion-cause pair extraction (ECPE) task aims to identify emotion clauses and their corresponding cause clauses from document-level text. It has important applications in a wide range of scenarios, including public opinion monitoring and user feedback analysis. Although research has made initial progress on this task, existing methods still face challenges in identifying implicit emotions. Firstly, the lack of explicit semantic guidance leads to insufficient discriminative power, especially when dealing with ambiguous emotional expressions. Secondly, existing methods primarily focus on modeling intra-sentence relationships, which limits their ability to jointly capture cross-sentence temporal dependencies and global semantic information. To address these challenges, we propose a question-guided dual-channel contrastive learning framework, DCCL. Firstly, DCCL employs a question formulation based on machine reading comprehension (MRC) to guide the model in capturing the emotion-cause relationship between clauses. Furthermore, task-specific queries are explicitly injected into the input, making the model more aware of the task objective. Secondly, we design a dual-channel network combining a query-aware clause-level Transformer and a BiLSTM, which enables DCCL to capture the temporal and global contextual relationships between clauses more fully. Thirdly, DCCL incorporates supervised contrastive learning: we use positive and negative samples to apply contrastive learning in each channel, which optimizes the representation space and enhances the model’s ability to recognize ambiguous emotions and boundary conditions. We conducted experiments on three mainstream tasks, namely emotion-cause pair extraction, emotion extraction, and cause extraction, on the ECPE benchmark dataset. The results show that DCCL improves F1 over the best baseline models, such as CD-MRC and SEG, by 1.53% and 4.41%, respectively, on emotion-cause pair extraction; by 0.81% and 4.37%, respectively, on emotion extraction; and by 0.62% and 1.27%, respectively, on cause extraction. Moreover, compared with the large language model baseline LLM-MTLN, DCCL further improves F1 by 2.48%, 4.50%, and 0.63% on these three tasks, respectively.
Volume 311, Article 131357
Citations: 0
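The supervised contrastive learning component mentioned in the abstract above can be illustrated with the standard supervised contrastive loss, in which embeddings sharing a label are pulled together and all others are pushed apart. This is a generic sketch: the temperature, the toy embeddings, and the function name are assumptions, and the paper's exact per-channel formulation may differ.

```python
import math

def sup_con_loss(embeddings, labels, temperature=0.1):
    """Mean supervised contrastive loss over anchors that have a positive."""
    def unit(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    z = [unit(v) for v in embeddings]
    total, anchors = 0.0, 0
    for i in range(len(z)):
        positives = [j for j in range(len(z)) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled cosine similarity to every other sample.
        sims = {j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
                for j in range(len(z)) if j != i}
        log_denom = math.log(sum(math.exp(s) for s in sims.values()))
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0

# Hypothetical clause embeddings forming two tight clusters.
clauses = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
aligned = sup_con_loss(clauses, [0, 0, 1, 1])     # labels match the clusters
misaligned = sup_con_loss(clauses, [0, 1, 0, 1])  # labels cut across them
```

The loss is small when same-label clauses already sit close together and large when they do not, which is the signal that shapes the representation space during training.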