Ying Huang , Xiaojian Cao , Benben Zhou , Wei Li , Shuling Yang , S.M. Shafi , Zhou Yang
{"title":"A deep reinforcement learning-guided multimodal multi-objective evolutionary algorithm with a serial-parallel mechanism","authors":"Ying Huang , Xiaojian Cao , Benben Zhou , Wei Li , Shuling Yang , S.M. Shafi , Zhou Yang","doi":"10.1016/j.eswa.2025.129581","DOIUrl":"10.1016/j.eswa.2025.129581","url":null,"abstract":"<div><div>The core challenge in solving multimodal multi-objective problems (MMOPs) lies in maintaining a synergy between convergence and diversity. However, existing algorithms usually adopt a convergence-first strategy and fail to take both diversity and convergence into account during the evolutionary process. Likewise, these optimization methods tend to gravitate rapidly toward locally optimal regions, which causes a loss of diversity in the local Pareto set (PS). This paper proposes a Deep Reinforcement Learning-guided multimodal multi-objective evolutionary algorithm with a serial-parallel mechanism (DRLMMEA) to investigate the impact of operator selection on the performance of MMEAs, which helps to balance convergence and diversity. DRLMMEA uses a Q-Network to select the operator with the highest reward, enhancing the population’s search ability. An improved sorting method (ISM) based on neighborhood dominance updates the population by sorting individuals according to their convergence quality, thereby enhancing convergence performance in the objective space. Moreover, this study proposes a serial-parallel mechanism: the serial structure enhances diversity in the decision space, while the parallel structure reduces the computational burden of the algorithm. The main contributions are thus a Deep Reinforcement Learning-assisted operator selection mechanism, which enables an effective balance between diversity and convergence, and an improved crowding distance approach that enhances convergence performance. 
DRLMMEA is tested against 6 contemporary approaches on the MMF and IDMP benchmark problems and achieves the best results on 4 principal performance metrics. The proposed DRLMMEA is also applied to a multimodal gearbox parameter optimization problem, where it demonstrates superior performance against 6 algorithms in comparative evaluations. These results indicate that it is effective at solving MMOPs with an imbalance between convergence and diversity.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"298 ","pages":"Article 129581"},"PeriodicalIF":7.5,"publicationDate":"2025-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145107560","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
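The operator-selection idea in the abstract above (pick the operator with the highest estimated reward, then refine that estimate from feedback) can be illustrated with a deliberately simplified tabular sketch. This is not the authors' Q-Network: a plain list stands in for the neural network, and the operator names, reward value, and learning rate `lr` are all hypothetical.

```python
# Hypothetical operator pool; the names are illustrative only.
OPERATORS = ["crossover_a", "crossover_b", "mutation_c"]


def select_operator(q_values):
    """Return the index of the operator with the highest estimated reward."""
    return max(range(len(q_values)), key=lambda i: q_values[i])


def update_q(q_values, op, reward, lr=0.1):
    """Move the estimate for operator `op` toward the observed reward.

    This is a simple incremental (bandit-style) update, used here as a
    stand-in for training a Q-Network.
    """
    q_values[op] += lr * (reward - q_values[op])
    return q_values


q = [0.0, 0.5, 0.2]
chosen = select_operator(q)          # operator with the largest estimate
q = update_q(q, chosen, reward=1.0, lr=0.5)
print(OPERATORS[chosen], q)
```

A table is enough to show the select-then-update loop; a learned network would replace the list when the state of the population also has to inform the choice.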
{"title":"LDATA-Net: Dynamic feature adaptation for efficient feature learning in resource-limited UAV detection","authors":"Shuming Lin, Sang Feng, Junnan Tan","doi":"10.1016/j.eswa.2025.129725","DOIUrl":"10.1016/j.eswa.2025.129725","url":null,"abstract":"<div><div>Unmanned Aerial Vehicle (UAV) image analysis faces the dual challenges of complex background interference and limited onboard computational resources, particularly when processing extreme scale variations across multiple viewpoints. Existing approaches typically enhance detection accuracy by increasing model complexity, but this often leads to parameter proliferation that exceeds the deployment limits of airborne platforms. To address this fundamental contradiction, we propose LDATA-Net (Lightweight Dynamic Aggregation Task-Aligned Network), which pioneers a “Dynamic Feature Adaptation” design paradigm aimed at achieving synergistic optimization between parameter efficiency and detection accuracy. This framework systematically realizes end-to-end dynamic adaptive capabilities through three core components that operate collaboratively across feature extraction, fusion, and detection stages: (1) Dynamic Multi-Branch Depthwise Block (DMBD-Block), whose core innovation is our proposed novel operator DIDWConv, which adaptively adjusts receptive fields according to input features to capture targets of extreme scales and orientations; (2) Lightweight Dynamic Aggregation Network (LDANet), which effectively preserves critical spatial contextual information through hierarchical fusion architecture and dynamic weighting mechanisms; (3) Dynamic Adaptive Head (DA-Head), which effectively mitigates task conflicts through geometric and semantic dynamic feature alignment. 
LDATA-Net achieves 35.4 %, 77.9 %, and 51.2 % AP<sub>50</sub> on the VisDrone2019, DOTA1.0, and AI-TODv2 datasets, respectively, with only 2.8M parameters, establishing a new paradigm for designing memory-efficient yet high-performance detection systems, particularly for resource-constrained heterogeneous computing aviation platforms.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"298 ","pages":"Article 129725"},"PeriodicalIF":7.5,"publicationDate":"2025-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145158443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yimin Ou , Yifan Wang , Ping Jian , Tianhe Zhang , Xing Pei
{"title":"MCAD-EUC: Multi-context adaptive decoding with entropy-based uncertainty calibration for knowledge conflict mitigation","authors":"Yimin Ou , Yifan Wang , Ping Jian , Tianhe Zhang , Xing Pei","doi":"10.1016/j.eswa.2025.129659","DOIUrl":"10.1016/j.eswa.2025.129659","url":null,"abstract":"<div><div>The knowledge sources of large language models (LLMs) encompass both parametric internal knowledge and external contextual information. However, conflicts between these two sources can significantly impair model performance. Existing methods typically assume a priori correctness of either the context or the parametric knowledge, lacking dynamic coordination mechanisms and being limited to single-context scenarios. To address this issue, this work proposes a lightweight and training-free decoding method, <strong>M</strong>ulti-<strong>C</strong>ontext <strong>A</strong>daptive <strong>D</strong>ecoding (<strong>MCAD-EUC</strong>), which dynamically measures the effectiveness of both knowledge sources through <strong>E</strong>ntropy-based <strong>U</strong>ncertainty <strong>C</strong>alibration. Rather than judging whether knowledge is true or false, internal or external, it balances the sources according to their contributions to answering the question correctly. In particular, MCAD-EUC is naturally multi-contextual: it can dynamically amplify the distribution of the golden context while mitigating the influence of noisy contexts, thereby optimizing the final logits for predicting the next token during the decoding process. To comprehensively evaluate model performance in multi-context scenarios, this work constructs MCQA, a multi-context question answering dataset that includes golden context, irrelevant context, and six categories of misleading context (crowd, logic, temporal, authority, emotional, numeric), simulating the diversity of noise in real-world settings. 
Extensive experiments on four LLMs and four MCQA datasets demonstrate that MCAD-EUC achieves an average accuracy improvement of 3.17 % over the best-performing baseline methods. Further sensitivity analysis confirms that the entropy-based adaptive weighting mechanism consistently outperforms all fixed-weight settings. Our dataset and code will be publicly available.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"298 ","pages":"Article 129659"},"PeriodicalIF":7.5,"publicationDate":"2025-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145189674","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
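One way to picture entropy-based weighting across several contexts is to trust a next-token distribution more when its entropy is lower. The sketch below is a generic illustration under that assumption and is not the MCAD-EUC algorithm itself: the function names, the use of a softmax over negative entropies as weights, and the toy logits are all hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(p):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(q * math.log(q) for q in p if q > 0)


def entropy_weighted_mix(logit_sets):
    """Mix per-context next-token distributions.

    Each context's weight is a softmax over its negative entropy, so a
    confident (low-entropy) distribution contributes more (assumed rule).
    """
    dists = [softmax(logits) for logits in logit_sets]
    weights = softmax([-entropy(d) for d in dists])
    vocab = len(dists[0])
    mixed = [sum(w * d[i] for w, d in zip(weights, dists)) for i in range(vocab)]
    return weights, mixed


# Toy example: one confident context and one uninformative context.
confident = [5.0, 0.0, 0.0]
noisy = [0.0, 0.0, 0.0]
weights, mixed = entropy_weighted_mix([confident, noisy])
print(weights, mixed)
```

In the toy example the confident context receives the larger weight, so the mixed distribution still favors its top token despite the uniform noise.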
Yinuo Li , Jin-Kao Hao , Kwong Meng Teo , Liwei Song
{"title":"Heterogeneous cloud resource allocation: a case study on real-time transcoding in live streaming","authors":"Yinuo Li , Jin-Kao Hao , Kwong Meng Teo , Liwei Song","doi":"10.1016/j.eswa.2025.129700","DOIUrl":"10.1016/j.eswa.2025.129700","url":null,"abstract":"<div><div>The explosion in popularity of crowdsourced live streaming (CLS) has led to a huge increase in demand for cloud resources to support real-time video transcoding. CLS transcoding is real-time, geographically distributed and computationally intensive. Therefore, transcoding service providers need to cost-effectively utilize diverse heterogeneous cloud resources, while guaranteeing quality of service standards to ensure a good streaming experience for the viewers. To support the above, we developed a novel proactive-reactive resource allocation framework that optimizes the overall cost of supporting the CLS transcoding service using heterogeneous edge and cloud computing resources. The offline proactive policy evaluator aims to provide a good and adaptable resource usage plan in advance, matching the predicted demand with the heterogeneous resources. The reactive execution module monitors the actual demand online and controls the resource usage to compensate for deviations from the offline prediction. 
Our experiments show that the proposed approach leads to a cost reduction of 42 % compared to the fixed usage ratio strategy based on expert knowledge.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"298 ","pages":"Article 129700"},"PeriodicalIF":7.5,"publicationDate":"2025-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145107725","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yikang Shi , Xin Zhan , Yaqian Li , Zhongqiang Wu , Wenming Zhang , Haibin Li
{"title":"Cycle-CFM: An unsupervised framework for robust multimodal anomaly detection in industrial settings","authors":"Yikang Shi , Xin Zhan , Yaqian Li , Zhongqiang Wu , Wenming Zhang , Haibin Li","doi":"10.1016/j.eswa.2025.129745","DOIUrl":"10.1016/j.eswa.2025.129745","url":null,"abstract":"<div><div>Industrial multimodal anomaly detection is confronted with three pivotal challenges: cross-modal feature drift, noise sensitivity, and modality imbalance. To address these issues, we propose Cycle-Consistent Cross-Modal Feature Mapping (Cycle-CFM), an unsupervised framework that integrates cycle-consistent cross-modal mapping with channel-attention-guided adaptive loss weighting. Cycle-CFM establishes bidirectional feature alignment between RGB and 3D modalities via reversible cycle mappings, yielding consistent representations robust to vibration and depth noise. To further mitigate dynamic interferences such as illumination variations, we introduce a joint optimization strategy that combines cross-consistency and cycle-consistency losses. Experimental results on our self-constructed <strong>SteelDefect-3D-AD</strong> dataset demonstrate that Cycle-CFM achieves an <strong>AUPRO@1 %</strong> of 0.371, outperforming state-of-the-art methods by 17–45 %. It also attains a pixel-level AUROC (P-AUROC) of 0.991 and an image-level AUROC (I-AUROC) of 0.998. On the public <strong>MVTec 3D-AD</strong> benchmark, Cycle-CFM reaches a mean P-AUROC of 0.960 and improves accuracy by 37.5 % for elongated anomalies. 
Running at 11.03 FPS with 469.52 MB of parameters, the model demonstrates both effectiveness and deployability for real-time industrial inspection.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"298 ","pages":"Article 129745"},"PeriodicalIF":7.5,"publicationDate":"2025-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145159225","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing traffic signal control through model-based reinforcement learning and policy reuse","authors":"Yihong Li, Chengwei Zhang, Furui Zhan, Wanting Liu, Kailing Zhou, Longji Zheng","doi":"10.1016/j.eswa.2025.129755","DOIUrl":"10.1016/j.eswa.2025.129755","url":null,"abstract":"<div><div>Multi-agent reinforcement learning (MARL) has shown significant potential in traffic signal control (TSC). However, current MARL-based methods often suffer from insufficient generalization due to the fixed traffic patterns and conditions of the road network used during training. This limitation results in poor adaptability to new traffic scenarios, leading to high retraining costs and complex deployment. To address this challenge, we propose two algorithms: PLight and PRLight. PLight employs a model-based reinforcement learning approach, pretraining control policies and environment models on predefined source-domain traffic scenarios. The environment model predicts state transitions, facilitating the comparison of environmental characteristics. PRLight further enhances adaptability by selecting pre-trained PLight agents based on the similarity between the source and target domains, which accelerates learning in the target domain. We evaluated the algorithms in two transfer settings: (1) adaptability to different traffic scenarios within the same road network, and (2) generalization across different road networks. 
The results show that PRLight significantly reduces the adaptation time compared to learning from scratch in new TSC scenarios, achieving optimal performance using similarities between available and target scenarios.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"298 ","pages":"Article 129755"},"PeriodicalIF":7.5,"publicationDate":"2025-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145118825","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
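The notion of choosing a pre-trained agent by source-target similarity can be sketched by scoring each source environment model on transitions observed in the target scenario and picking the one with the lowest prediction error. This is only an assumed similarity measure for illustration, not necessarily the one PRLight uses; the scalar states, the model callables, and the scenario names are hypothetical.

```python
def prediction_error(model, transitions):
    """Mean squared error of a model's next-state predictions.

    `model(state, action)` returns a predicted next state (a float here);
    `transitions` is a list of (state, action, next_state) tuples.
    """
    return sum((model(s, a) - s_next) ** 2 for s, a, s_next in transitions) / len(transitions)


def select_source(models, transitions):
    """Return the name of the source model that best explains the target data."""
    return min(models, key=lambda name: prediction_error(models[name], transitions))


# Hypothetical source-domain environment models (toy scalar dynamics).
source_models = {
    "light_traffic": lambda s, a: s + a,
    "heavy_traffic": lambda s, a: s + 2 * a,
}

# Transitions sampled in the target scenario (here they follow s + 2a).
target_transitions = [(0.0, 1.0, 2.0), (1.0, 1.0, 3.0), (2.0, 0.5, 3.0)]

print(select_source(source_models, target_transitions))
```

The policy paired with the selected model would then be the starting point for fine-tuning in the target domain.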
Chenglong Li , Xirong Ma , Xiuhao Wang , Fanyu Kong , Yunting Tao , Chunpeng Ge
{"title":"Optimized homomorphic linear computation in privacy-preserving CNN inference","authors":"Chenglong Li , Xirong Ma , Xiuhao Wang , Fanyu Kong , Yunting Tao , Chunpeng Ge","doi":"10.1016/j.eswa.2025.129767","DOIUrl":"10.1016/j.eswa.2025.129767","url":null,"abstract":"<div><div>Machine Learning as a Service (MLaaS) provides robust solutions for deploying deep learning inference in cloud environments. However, it also raises serious privacy concerns regarding user data and proprietary model parameters. Numerous hybrid cryptographic protocols that integrate homomorphic encryption (HE) and garbled circuits (GC) have been proposed to enable secure inference with low latency. In these protocols, the homomorphic evaluation of linear operations remains the primary performance bottleneck and warrants further optimization. In this work, we propose novel optimizations for HE-based linear computations within the hybrid cryptographic framework for secure neural network inference. Specifically, we devise two efficient strategies for homomorphic matrix-vector multiplication and convolution. For matrix-vector multiplication, we introduce a grouped diagonal extraction technique that encodes the weight matrix more compactly and enables configurable ciphertext rotation reuse, while for homomorphic convolution, we present a group-wise combine-and-merge evaluation method. Both methods significantly reduce the number of required ciphertext rotations. Our approach achieves up to a 3.9× speedup in matrix-vector multiplication and a 2.9× improvement in convolution over state-of-the-art (SOTA) solutions. 
The HE-GC hybrid secure convolutional neural network (CNN) inference framework incorporating these enhancements yields speedups of 2.5× on widely used ResNet architectures.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"298 ","pages":"Article 129767"},"PeriodicalIF":7.5,"publicationDate":"2025-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145158466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
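For background on why ciphertext rotations dominate the cost, the widely used baseline diagonal method for matrix-vector multiplication can be simulated on plaintext lists. This is the standard baseline, not the grouped diagonal extraction proposed in the abstract above; `rotate` stands in for a homomorphic slot rotation, and a square matrix is assumed.

```python
def rotate(v, k):
    """Cyclically rotate a vector left by k slots (stand-in for an HE rotation)."""
    k %= len(v)
    return v[k:] + v[:k]


def diagonal(matrix, d):
    """Extract the d-th generalized diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][(i + d) % n] for i in range(n)]


def matvec_diagonal(matrix, v):
    """Compute matrix @ v as a sum of (diagonal * rotated vector) products.

    Only slot-wise multiplication, addition, and rotation are used, which
    are the operations a packed HE scheme supports. One rotation is needed
    per diagonal, which is the cost that rotation-saving methods target.
    """
    n = len(matrix)
    acc = [0] * n
    for d in range(n):
        diag = diagonal(matrix, d)
        rot = rotate(v, d)
        acc = [a + x * y for a, x, y in zip(acc, diag, rot)]
    return acc


M = [[1, 2], [3, 4]]
x = [5, 6]
print(matvec_diagonal(M, x))  # [17, 39]
```

Slot `i` of the accumulator receives `matrix[i][(i + d) % n] * v[(i + d) % n]` for every `d`, which sums to the ordinary dot product of row `i` with `v`.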
{"title":"Multi-objective portfolio optimization for stock return prediction using machine learning","authors":"Meiyu Huang , Shili Dang , Miraj Ahmed Bhuiyan","doi":"10.1016/j.eswa.2025.129672","DOIUrl":"10.1016/j.eswa.2025.129672","url":null,"abstract":"<div><div>This paper presents a novel approach that integrates stock return prediction with the mean–variance (MV) model to enhance the performance of the original model. Firstly, stock returns are predicted using machine learning algorithms, including Robust Linear Regression (OLS-H), Random Forest (RF), and Long Short-Term Memory Networks (LSTM), to select a pre-screened stock pool composed of stocks with high predicted returns. Secondly, a linear weighting method combines the predictions above with the MV model, constructing the Mean-Variance-Forecast Error (MVF) model and determining the investment proportions for the pre-selected stocks. Finally, empirical research is conducted using the components of the CSI 300 Index as sample data. The results indicate that the RF + MVF model outperforms other models and the CSI 300 Index in return and risk metrics. At the same time, a sensitivity analysis of relevant parameters further confirms that considering return uncertainty is beneficial for improving the out-of-sample performance of the MV model.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"298 ","pages":"Article 129672"},"PeriodicalIF":7.5,"publicationDate":"2025-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145109840","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
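The linear-weighting idea mentioned in the abstract above can be illustrated on the classical two-objective mean-variance trade-off. The sketch below scalarizes variance and expected return with a single weight and searches a grid of two-asset portfolios. It does not model the forecast-error term of the MVF model, and the returns, covariances, and the `risk_weight` parameter are hypothetical.

```python
def portfolio_return(w, mu):
    """Expected portfolio return for weights w and expected returns mu."""
    return sum(wi * mi for wi, mi in zip(w, mu))


def portfolio_variance(w, cov):
    """Portfolio variance w^T * cov * w for a covariance matrix given as nested lists."""
    n = len(w)
    return sum(w[i] * cov[i][j] * w[j] for i in range(n) for j in range(n))


def weighted_objective(w, mu, cov, risk_weight):
    """Linear scalarization: penalize variance, reward return (lower is better)."""
    return risk_weight * portfolio_variance(w, cov) - (1 - risk_weight) * portfolio_return(w, mu)


def best_two_asset_weights(mu, cov, risk_weight, steps=100):
    """Grid search over long-only two-asset portfolios whose weights sum to 1."""
    candidates = [(i / steps, 1 - i / steps) for i in range(steps + 1)]
    return min(candidates, key=lambda w: weighted_objective(w, mu, cov, risk_weight))


mu = [0.10, 0.05]                      # hypothetical predicted returns
cov = [[0.04, 0.0], [0.0, 0.01]]       # hypothetical covariance matrix
print(best_two_asset_weights(mu, cov, risk_weight=1.0))   # pure variance minimization
print(best_two_asset_weights(mu, cov, risk_weight=0.0))   # pure return maximization
```

Sweeping `risk_weight` between 0 and 1 traces out different compromises between the two objectives; a real study would use a proper optimizer rather than a grid.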
Yuxiao Zhang , Jin Wang , Yang Zhou , Senyun Jia , Zhi Zheng , Dongliang Zhang , Guodong Lu
{"title":"3D modeling from a single sketch with multifaceted semantic understanding","authors":"Yuxiao Zhang , Jin Wang , Yang Zhou , Senyun Jia , Zhi Zheng , Dongliang Zhang , Guodong Lu","doi":"10.1016/j.eswa.2025.129748","DOIUrl":"10.1016/j.eswa.2025.129748","url":null,"abstract":"<div><div>This paper studies the problem of 3D shape generation from a single sketch. Prior works rely on directly extracted visual features of sketches as guidance for the generation process. However, the sparse visual cues and abstract nature of sketches, which are inherited by the guiding features, lead to semantic ambiguity and geometry incompleteness in the generated shapes, compromising accuracy. To address this, we propose MSU-3D, a diffusion-based framework for sketch-to-3D generation, leveraging <em>Multifaceted Semantic Understanding</em> to explicitly analyze the construction information of sketches from multiple facets before providing fine-grained guidance over 3D shape generation. Specifically, we decompose sketches through three interpretative facets (semantics, depth, and normal), introducing reasoning over three representations to capture 3D features from distinct perspectives: local components, basic 3D geometry, and 3D surface details. Going one step further, we propose a multifaceted perception module that aggregates multifaceted feature representations and leverages local component features as a two-pronged guiding representation to jointly guide the perception of basic shapes and surface details. To ensure fine-grained control, the hierarchical perception strategy adaptively injects perception features of varying granularity at different stages of the 3D generation. 
Extensive experiments and comparisons with state-of-the-art methods on various complex posture datasets validate the effectiveness of our framework in mitigating semantic ambiguity and geometry incompleteness in 3D generation.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"298 ","pages":"Article 129748"},"PeriodicalIF":7.5,"publicationDate":"2025-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145159223","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Homophone-aware offensive language detection via semantic-phonetic collaboration","authors":"Jiahao Hu, Shanliang Pan","doi":"10.1016/j.eswa.2025.129756","DOIUrl":"10.1016/j.eswa.2025.129756","url":null,"abstract":"<div><div>The increasing use of implicit and obfuscated expressions poses significant challenges to offensive language detection in Chinese online platforms. In particular, users often exploit homophone substitutions to bypass keyword-based moderation, making traditional detection systems inadequate. This study addresses the problem of detecting offensive content masked through homophonic substitutions, which retain aggressive intent while altering character representations. Existing methods fall into two main categories: (1) semantic-only models, which struggle with phonetic manipulations due to their reliance on text features alone, and (2) auxiliary-enhanced models, which incorporate phonetic or syntactic signals but lack deep integration between modalities. To overcome these limitations, we propose a lightweight dual-branch model that separately encodes textual semantics and pinyin phonetics under a multi-view learning framework. A Dual-Branch Interactive Training strategy is introduced to enable dynamic cross-modal alignment via contrastive objectives, allowing each modality to mutually refine the other and enhance robustness to adversarial inputs. We conduct experiments on two benchmark datasets, COLD and SWSR, both of which are augmented with varying levels of homophone noise to simulate real-world evasion strategies. The proposed model outperforms all baseline models, achieving an average F1-score improvement of 6.3 % under high-noise conditions, while reducing inference latency and memory usage by more than 60 %, demonstrating both effectiveness and efficiency for real-time deployment. 
We will release the source code for further use by the community at https://github.com/hjhhlc/DBIT.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"298 ","pages":"Article 129756"},"PeriodicalIF":7.5,"publicationDate":"2025-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145159226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}