Expert Systems with Applications: Latest Articles

Investigating spatial-temporal bias of LLMs
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications Pub Date: 2026-06-01 Epub Date: 2026-02-05 DOI: 10.1016/j.eswa.2026.131542
Volume 313, Article 131542
Zijun Li
Abstract: Large Language Models (LLMs) are emerging as powerful knowledge and expert systems with notable capabilities in understanding and performing inference on various intelligent tasks. However, their spatiotemporal cognition biases remain largely underexplored, even though these biases are highly consequential for applications that use LLMs to understand, explain, and forecast such tasks. This paper investigates the presence and patterns of spatiotemporal bias in LLMs. It first constructs two datasets from the perspectives of economic and social forecasting, each paired with model-predicted values for the same spatiotemporal scope across four different LLMs. A novel autocorrelation measurement approach is then introduced, alongside a set of quantification methods, to jointly evaluate correlation in biases across both space and time. The results show notable variation in performance and bias across models and tasks: uncommon and more sensitive tasks exhibit worse performance, and certain LLMs produce regionally clustered errors while others exhibit near-random distributions. Among the prompt modifications tested, incorporating temporal context significantly improves predictive accuracy, particularly for volatile or low-frequency events. Overall, these findings highlight the partial but inconsistent internalization of real-world spatiotemporal patterns in LLMs, and the proposed methods provide tools for quantifying and interpreting spatiotemporal bias, thereby offering guidance for designing fairer and more reliable LLM-based expert systems and applications.
Citations: 0
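Note: the abstract above does not say which autocorrelation measure it introduces. As a rough illustration of what "regionally clustered errors" versus "near-random distributions" means, the sketch below computes global Moran's I, a standard spatial autocorrelation statistic, over per-region prediction errors. The function name, the four-region adjacency matrix, and the error values are all hypothetical and are not taken from the paper.

```python
def morans_i(values, weights):
    """Global Moran's I for one value per region.

    values  -- list of per-region quantities (here: prediction errors)
    weights -- square matrix; weights[i][j] > 0 if regions i and j are neighbours
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # total weight
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)


# Hypothetical layout: four regions in a row, 0-1-2-3, neighbours share a border.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered_errors = [1.0, 1.0, -1.0, -1.0]    # similar errors sit next to each other
alternating_errors = [1.0, -1.0, 1.0, -1.0]  # neighbours always disagree

print(morans_i(clustered_errors, chain))     # positive: spatially clustered
print(morans_i(alternating_errors, chain))   # negative: dispersed
```

Values near zero would indicate errors with no spatial pattern. A joint space-time measure, as the abstract describes, would need weights that also link observations across time steps.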
On ground track design of unmanned fixed-wing drone aided relaying in windy environments
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications Pub Date: 2026-06-01 Epub Date: 2026-02-05 DOI: 10.1016/j.eswa.2026.131539
Volume 313, Article 131539
Xuan Zhu, Xiaodong Ji, Ansheng Yin
Abstract: This article studies unmanned fixed-wing drone (UFD) aided relaying in windy environments, where the UFD serves as a full-duplex amplify-and-forward relay that forwards a desired amount of data for two ground terminals. Based on aerodynamics and the wind triangle, the engine power the UFD requires to fly at a constant airspeed along a circular ground track in a three-dimensional uniform wind is analyzed, yielding a closed-form expression. It is shown that the UFD's engine power depends on its airspeed and bank angle as well as on the wind speed and the corresponding vertical angle. On this basis, an optimization problem for the UFD's ground track design is investigated. Using the block coordinate descent technique, the initial problem is decomposed into two sub-problems, which are addressed by four algorithms (Algorithms 1–4). This leads to an iterative algorithm (Algorithm 5) that optimizes the UFD's airspeed and adjusts its flight parameters (e.g., time, radius, and the angles of pitch, course, crab, heading, and bank) to follow the desired ground track. Computer simulation results verify that the proposed algorithm achieves the best energy-saving performance and generates a small bank angle with minimal variation during flight. This characteristic alleviates the demand for fast bank angle command following when adjusting the UFD's flight parameters in windy environments.
Citations: 0
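Note: the wind triangle mentioned in the abstract relates air velocity, wind velocity, and ground velocity (ground velocity = air velocity + wind velocity). The sketch below is a simplified two-dimensional, horizontal-plane version only; the paper treats a three-dimensional wind and a circular track, which this does not reproduce. The function name and the angle convention (degrees, counterclockwise from the x-axis, wind direction given as the direction the wind blows toward) are assumptions made for illustration.

```python
import math


def wind_triangle(airspeed, wind_speed, wind_dir_deg, course_deg):
    """Return (crab_angle_deg, ground_speed) needed to hold a desired course.

    Angles are in degrees, counterclockwise from the x-axis.
    wind_dir_deg is the direction the wind blows TOWARD (an assumed convention).
    """
    rel = math.radians(wind_dir_deg - course_deg)
    cross = wind_speed * math.sin(rel)   # wind component across the course
    along = wind_speed * math.cos(rel)   # wind component along the course
    if abs(cross) > airspeed:
        raise ValueError("crosswind exceeds airspeed; course cannot be held")
    # The heading is offset from the course so the crosswind is cancelled.
    crab = math.asin(-cross / airspeed)
    ground_speed = airspeed * math.cos(crab) + along
    if ground_speed <= 0:
        raise ValueError("headwind too strong to make progress along the course")
    return math.degrees(crab), ground_speed


# Hypothetical numbers: 20 m/s airspeed, 5 m/s wind blowing toward +x,
# desired course along +y (90 degrees).
crab_deg, gs = wind_triangle(20.0, 5.0, 0.0, 90.0)
print(round(crab_deg, 2), round(gs, 2))
```

On a circular ground track the course angle changes continuously, so the crab angle and ground speed would have to be recomputed at every point along the track.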
Improved visibility for monocular camera-based perception in autonomous driving systems under rain with fog: An efficient vision transformer approach
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications Pub Date: 2026-06-01 Epub Date: 2025-12-04 DOI: 10.1016/j.eswa.2025.130661
Volume 313, Article 130661
Yao-Jiun Huang, Kang Li
Abstract: Adverse weather conditions degrade image quality and reduce the reliability of monocular camera-based perception systems in autonomous driving. Most existing deraining methods focus on removing rain streaks while overlooking fog effects, which further obscure scene details and hinder downstream perception tasks. This paper presents an efficient Vision Transformer (ViT)-based restoration framework that addresses both rain streaks and fog without compromising perception accuracy. The method incorporates a Depth-Guided Spatial Feature Transform (DG-SFT) block, which leverages depth information predicted by a lightweight CNN-based decoder. The DG-SFT is designed based on a mathematical rain model to effectively remove rain and distance-dependent haze. A semantic loss function is introduced to constrain the segmentation output discrepancy between original and restored images to within 4%. Experiments on the RainCityscapes dataset and real-world rainy images demonstrate improvements in PSNR and SSIM over existing ViT- and CNN-based approaches, with an inference latency of 7.86 ms, supporting its deployment in latency-critical autonomous vehicle platforms.
Citations: 0
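Note: PSNR, one of the two image-quality metrics cited in the abstract, has a standard definition, 10 · log10(MAX² / MSE). The sketch below computes it on flat lists of pixel intensities. The function name and the flat-list representation are illustrative choices, and nothing here reproduces the paper's evaluation pipeline.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(reference) != len(restored):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 4-pixel images with an error of one grey level per pixel.
clean = [0, 0, 0, 0]
restored = [1, 1, 1, 1]
print(round(psnr(clean, restored), 2))
```

Higher values mean the restored image is closer to the reference. SSIM, the other metric, compares local structure rather than per-pixel error and is not sketched here.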
Three decades of differential evolution: a bibliometric analysis (1995-2025)
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications Pub Date: 2026-06-01 Epub Date: 2026-02-12 DOI: 10.1016/j.eswa.2026.131451
Volume 313, Article 131451
Pooja Verma, Reshu Chaudhary, Hina Gupta, Rohit Salgotra
Abstract: Since its introduction in 1995, Differential Evolution (DE) has emerged as a foundational algorithm in computational intelligence and metaheuristic optimization. This paper presents a comprehensive bibliometric and thematic review of DE research over three decades (1995–2025), based on 9,900+ publications retrieved from Scopus, Web of Science (WoS), and IEEE Xplore. Using advanced visualization tools, including Sankey diagrams (to trace institution-country-keyword flows), choropleth maps (to reveal global research distribution), citation and co-authorship networks, and heatmaps (to assess cross-domain influence), the review uncovers major contributors, thematic concentrations, and emerging frontiers. The analysis spans publication trajectories, prolific authors and institutions, core research directions, and domain-specific applications in 12 prominent fields such as engineering optimization, artificial intelligence, bioinformatics, energy systems, and control systems. The study emphasizes the evolution of DE, its increasing interdisciplinary integration, and the growing dominance of Asia, particularly China, India, and Iran, as key centers of DE innovation. Through a synthesis of keyword co-occurrence, collaborative clustering, and citation dynamics, this review maps the landscape of DE research and outlines pressing challenges and promising avenues for future inquiry.
Citations: 0
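Note: for readers unfamiliar with the algorithm this review surveys, the classic DE/rand/1/bin scheme can be sketched as below. This is the textbook variant, not something specific to the review. The parameter defaults (population size, F, CR, generation count), the clipping of mutants to the bounds, and the sphere test function are illustrative assumptions.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, f=0.5, cr=0.9,
                           generations=200, seed=0):
    """Minimise `objective` over box `bounds` with DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Mutation: pick three distinct individuals other than i.
            a, b, c = rng.sample([k for k in range(pop_size) if k != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for j in range(dim):
                if rng.random() < cr or j == j_rand:
                    # Binomial crossover takes the mutant component.
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    v = min(max(v, lo), hi)  # clip to the search box
                else:
                    v = pop[i][j]
                trial.append(v)
            # Greedy selection: keep the trial only if it is no worse.
            trial_score = objective(trial)
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def sphere(x):
    """Simple unimodal test function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_x, best_value = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
print(best_x, best_value)
```

The three steps marked in the comments (mutation, crossover, greedy selection) are the parts that most DE variants modify.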
Surveillance camera authentication system based on PRNU
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications Pub Date: 2026-06-01 Epub Date: 2026-02-05 DOI: 10.1016/j.eswa.2026.131548
Volume 313, Article 131548
Jian Li, Lisheng Yan, Bin Ma, Xiaolong Li, Zhenxing Qian
Abstract: In this paper, we propose a camera authentication scheme to enhance access security for front-end devices in surveillance networks. The scheme leverages the Photo-Response Non-Uniformity (PRNU) pattern noise of camera sensors and combines it with traditional encryption techniques to strengthen system security. During registration, the camera captures images to extract the PRNU and generates a compressed device fingerprint, which is stored on the server as a root key. For authentication, the server sends the front-end a challenge sequence randomly generated from the root key, and the front-end captures a new image to generate an approximation of the root key for its response. To prevent attackers from extracting device fingerprints from public images, the scheme incorporates anonymization through a proposed DWT-based PRNU anonymization algorithm. This improves PSNR by 8.08 dB and SSIM by 0.08 on average compared to previous methods. Security analysis and experimental results show high authentication accuracy and security, effectively resisting replay and man-in-the-middle attacks and providing a robust solution for surveillance network devices.
Citations: 0
Object search strategy for service robots with knowledge-based viewpoint selection and hierarchical action decisions
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications Pub Date: 2026-06-01 Epub Date: 2026-02-05 DOI: 10.1016/j.eswa.2026.131538
Volume 313, Article 131538
Yuhao Wang, Guohui Tian
Abstract: Service robots are frequently tasked with searching for target objects relevant to specific operations. However, the dynamic nature of object locations poses significant challenges for precise localization and tracking. To address this, we propose a unified framework for efficient object search and navigation that integrates viewpoint selection, dynamic map construction, and adaptive hierarchical planning. Our method constructs a visual-topological map (VTMap) that fuses prior knowledge, object-room and object-object co-occurrence statistics, and spatial probability distributions modeled via Gaussian Mixture Models (GMM). The robot continuously generates and updates a room-level probability map, enabling systematic selection of optimal viewpoints. This process maximizes the likelihood of target detection while minimizing travel distance through a utility-based strategy. Multimodal sensory observations are represented as graph nodes, with navigation actions encoded as edges, supporting accurate localization and action planning. To complement global planning, we introduce a hierarchical search strategy that unifies long-term exploration objectives with adaptive local exploration informed by imitation learning. The agent dynamically adjusts its search direction by integrating prior experiences with real-time sensory cues. Local exploration is formulated as a partially observable Markov decision process (POMDP), guided by spatial memory and semantic targets. Furthermore, action cost modeling and an auxiliary inflection point prediction task refine the local exploration process, enabling the system to flexibly transition between global and local search strategies. Collectively, these components facilitate robust and efficient object-oriented navigation in complex and dynamic environments.
Citations: 0
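Note: the abstract describes a utility-based strategy that trades detection likelihood against travel distance but does not give the utility function. The sketch below assumes one simple possibility, utility = detection probability minus a weighted Euclidean distance, purely for illustration. The function name, the weight, and the viewpoint data are hypothetical.

```python
import math


def select_viewpoint(robot_xy, viewpoints, distance_weight=0.1):
    """Pick the viewpoint with the highest assumed utility.

    viewpoints -- list of (name, (x, y), detection_probability) tuples
    utility    -- detection_probability - distance_weight * travel distance
                  (an assumed linear form, not the paper's formula)
    """
    def utility(viewpoint):
        _, xy, prob = viewpoint
        return prob - distance_weight * math.dist(robot_xy, xy)

    return max(viewpoints, key=utility)[0]


# Hypothetical candidates for finding a cup, with made-up probabilities.
candidates = [
    ("kitchen_counter", (1.0, 0.0), 0.5),
    ("sofa_table", (8.0, 0.0), 0.7),
    ("desk", (2.0, 0.0), 0.3),
]

print(select_viewpoint((0.0, 0.0), candidates))                       # nearby, fairly likely spot
print(select_viewpoint((0.0, 0.0), candidates, distance_weight=0.0))  # ignores distance
```

In the framework described above, the probabilities would come from the room-level probability map rather than from fixed numbers.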
Inverse identification of unsteady disturbance sources in mine ventilation systems
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications Pub Date: 2026-06-01 Epub Date: 2026-02-09 DOI: 10.1016/j.eswa.2026.131603
Volume 313, Article 131603
Yonghong Liu, Ziming Wang, Yupeng Xie, De Huang
Abstract: Maintaining stable airflow within mine ventilation systems is essential for ensuring safe and continuous underground operations. However, unsteady airflow disturbances induced by the intermittent movement of mine cars or hoisting cages generate transient, low-amplitude perturbations that are dynamically coupled across the ventilation network. These disturbances are superimposed on the steady mechanical ventilation field, producing unsteady signals that conventional steady-state models cannot effectively decouple or localize, leading to discrepancies between monitoring data and actual ventilation conditions. To address this challenge, a mathematical model was developed to characterize unsteady airflow disturbances in underground tunnels, and the dynamic effects of mine car movement on ventilation airflow were systematically analyzed. A network-based algorithm was further designed to solve the unsteady disturbance field, and a simulation platform was constructed to reproduce dynamic airflow behavior, showing minimal deviation from theoretical predictions. Building on this foundation, a hybrid Maximum Information Coefficient-Long Short-Term Memory (MIC-LSTM) neural network model was proposed for the inverse identification of unsteady disturbance sources. The Maximum Information Coefficient (MIC) was used to extract informative features from airflow velocity time-series data, while the LSTM network identified disturbance sources from temporal dependencies. Experimental results demonstrate that when the disturbance threshold is 0.1 and the monitoring coverage ratio is 0.3, all evaluation metrics reach approximately 90%. Validation in an operational mine ventilation system further confirms the model's accuracy, robustness, and generalizability. This study establishes an artificial intelligence-driven framework for intelligent monitoring and control of unsteady disturbances, providing actionable insights toward safer and more efficient mine ventilation management.
Citations: 0
LDM-DTI: A multimodal framework integrating pretrained language models and geometric graph networks for interpretable drug-target interaction prediction
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications Pub Date: 2026-06-01 Epub Date: 2026-02-05 DOI: 10.1016/j.eswa.2026.131485
Volume 313, Article 131485
Yuanyuan Ji, Zhuo Chen, Zhihan Liu, Xiaofeng Man, Junwei Du, Bin Yu
Abstract: Accurate prediction of drug-target interactions (DTIs) is essential for speeding up the discovery of new therapeutics. Although significant progress has been made with deep learning-based approaches, considerable challenges remain in learning informative molecular representations and modeling the intricate nature of drug-target associations. To overcome these limitations, an end-to-end predictive architecture, termed LDM-DTI, is proposed. In this framework, drug and protein sequences are encoded via pretrained large language models. Specifically, ChemBERTa is used to derive high-dimensional semantic and structural features from SMILES strings, while ProtBERT is employed to extract contextual representations from amino acid sequences. To further incorporate spatial molecular information, a three-layer Graph Convolutional Network (GCN) and an Equivariant Graph Neural Network (EGNN) are integrated to capture both 2D topological and 3D geometric characteristics of drug molecules. Protein-level features are refined through dynamic convolutional operations and multi-head self-attention mechanisms. These representations are then fused via a Dynamic Interactive Attention Module (DIAM) to model cross-modal dependencies between drugs and targets. The proposed framework demonstrates superior predictive performance and generalizability across four public benchmark datasets, consistently surpassing ten state-of-the-art baselines. Ablation experiments are conducted to quantify the contributions of individual components, and protein-level attention maps are visualized to enhance interpretability. Overall, LDM-DTI offers a robust and interpretable solution for DTI prediction, with strong potential for accelerating structure-informed drug discovery.
Citations: 0
ESG investment under different supply chain power structures: decisions, impacts, and the triple win
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications Pub Date: 2026-05-25 Epub Date: 2026-02-03 DOI: 10.1016/j.eswa.2026.131518
Volume 312, Article 131518
Man Yu, Erbao Cao, Yu Zhang
Abstract: ESG serves as a key global metric for corporate green and sustainable development. This study investigates a manufacturer's investment decisions on ESG subcategories (E, S, and G) and their impacts on enterprises, consumers, and the environment across different supply chain power structures. The findings show that a dominant manufacturer invests less in pollutant emission reduction and corporate governance, whereas the highest ESG investment occurs under balanced power. Under ESG investment, an enterprise's profit increases with its power, yet consumer surplus peaks under balanced power. The environmental impact depends on potential market demand. Moreover, ESG investment consistently increases the retailer's profit and consumer surplus, and it can also increase the manufacturer's profit when the level of social responsibility commitment is below a threshold. Furthermore, this work identifies the conditions for three different types of Pareto improvement: the win–win scenario for firms and consumers, the win–win scenario for consumers and the environment, and the triple-win scenario for all three parties. This study sheds light on how enterprises make decisions regarding ESG investment to realize economic and environmental benefits.
Citations: 0
RVFormer: Keypoint-based fusion of 4D radar and vision for 3D object detection in autonomous driving
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications Pub Date: 2026-05-25 Epub Date: 2026-02-03 DOI: 10.1016/j.eswa.2026.131497
Volume 312, Article 131497
Xin Bi, Caien Weng, Panpan Tong, Arno Eichberger, Lu Xiong
Abstract: Multi-modal fusion is crucial in autonomous driving perception: it enhances reliability, completeness, and accuracy, extending the performance limits of perception systems. Specifically, large-scale perception through 4D radar and vision fusion has become a key research focus aimed at improving driving safety, enhancing complex scene understanding, and supporting fine-grained local planning and control. However, existing 3D object detection methods typically rely on fixed-voxel representations to maintain detection accuracy. As the perception range increases, these methods incur considerable computational overhead. While transformer-based query methods show strong potential in capturing dependencies over large receptive fields in image-domain tasks, their application in radar-vision fusion is limited by radar point cloud sparsity and cross-modal alignment challenges. To address these limitations, we propose RVFormer, a dual-branch feature-level fusion network that uses a sparse keypoint-based query strategy to integrate features from both modalities, thereby mitigating the impact of large-scale scenes on inference speed. Additionally, we introduce clustered voxel query initialization (CVQI) to accelerate convergence and enhance object localization. By incorporating the radar voxel painter (RVP), radar-image cross-attention (RICA), and gated adaptive fusion (GAF) modules, our framework enables deep and adaptive fusion of radar and visual features, effectively mitigating issues caused by point cloud sparsity and modality inconsistency. Compared to existing radar-vision fusion models, RVFormer demonstrates competitive performance, with an inference speed of approximately 15.2 frames per second. It delivers accuracy comparable to CNN-based approaches, while outperforming baseline methods by at least 4.72% in 3D mean average precision and 5.82% in bird's-eye view mean average precision.
Citations: 0
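Note: the abstract names a gated adaptive fusion (GAF) module without describing its internals. The sketch below shows only the generic idea of gated fusion, a per-feature sigmoid gate that blends two modalities, and should not be read as the RVFormer design. In a real network the gate logits would be produced by learned layers; here they are passed in directly, and all names and numbers are hypothetical.

```python
import math


def gated_fusion(radar_features, image_features, gate_logits):
    """Blend two feature vectors element-wise with a sigmoid gate.

    fused[k] = g[k] * radar[k] + (1 - g[k]) * image[k],  g[k] = sigmoid(logit[k])
    """
    if not (len(radar_features) == len(image_features) == len(gate_logits)):
        raise ValueError("all inputs must have the same length")
    fused = []
    for r, v, z in zip(radar_features, image_features, gate_logits):
        g = 1.0 / (1.0 + math.exp(-z))  # gate in (0, 1)
        fused.append(g * r + (1.0 - g) * v)
    return fused


# Hypothetical 3-dimensional features.
radar = [1.0, 2.0, 3.0]
image = [3.0, 2.0, 1.0]

print(gated_fusion(radar, image, [0.0, 0.0, 0.0]))     # gate 0.5: plain average
print(gated_fusion(radar, image, [20.0, 20.0, 20.0]))  # gate near 1: mostly radar
```

A gate of this kind lets the network lean on the image branch where radar returns are sparse and on the radar branch where the image is unreliable.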