Expert Systems with Applications: Latest Articles

CoCNet: A Chain-of-Clues framework for zero-shot referring expression comprehension
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications, Volume 282, Article 127633. Pub Date: 2025-04-19. DOI: 10.1016/j.eswa.2025.127633
Xuanyu Zhou, Simin Zhang, Zengcan Xue, Xiao Lu, Tianxing Xiao, Lianhua Wu, Lin Liu, Xuan Li
Abstract: Zero-shot learning enables referring expression comprehension (REC) models to adapt to a wide range of visual domains without training. However, ambiguous linguistic expressions often lack a clear subject. Moreover, existing methods do not fully exploit visual context and spatial information, resulting in low accuracy and robustness in complex scenes. To address these problems, we propose a Chain-of-Clues framework (CoCNet) that exploits multiple clues for the zero-shot REC task and resolves inference confusion step by step. First, the subject clue module uses the reasoning ability of large language models (LLMs) to infer the category referred to in the expression, which clarifies the linguistic expression. In the attribute clue module, we propose dual-track scoring, which highlights the proposal by blurring its surroundings and enhances contextual sensitivity by blurring the proposal. Additionally, the spatial clue module uses a series of Gaussian-based soft heuristic rules to model location words and the spatial relationships in the image. Experimental results show that CoCNet exhibits strong generalization capabilities in complex scenes. It significantly outperforms previous state-of-the-art zero-shot methods on RefCOCO, RefCOCO+, RefCOCOg, Flickr-Split-0 and Flickr-Split-1. Our code is released at https://github.com/CoCNetHub/CoCNet-main.
Citations: 0
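Illustrative sketch (not from the paper): the abstract mentions Gaussian-based soft heuristic rules for location words. One way such a rule might look is shown below, scoring candidate boxes for a horizontal location word by how close their normalised centre is to a target position. The function names, the target positions, and the `sigma` value are assumptions made for illustration only; the released CoCNet code may do this quite differently.

```python
import math


def gaussian_score(value, target, sigma=0.25):
    """Soft score in (0, 1]: exactly 1.0 at the target, decaying smoothly."""
    return math.exp(-((value - target) ** 2) / (2 * sigma ** 2))


def score_location_word(boxes, image_width, word):
    """Score boxes (x1, y1, x2, y2) for a horizontal location word.

    Hypothetical rule: 'left' prefers a normalised centre x near 0,
    'middle' near 0.5, and 'right' near 1.
    """
    targets = {"left": 0.0, "middle": 0.5, "right": 1.0}
    target = targets[word]
    scores = []
    for x1, _y1, x2, _y2 in boxes:
        cx = (x1 + x2) / 2 / image_width  # normalised centre in [0, 1]
        scores.append(gaussian_score(cx, target))
    return scores


# Three toy proposals spread across a 640-pixel-wide image.
boxes = [(0, 0, 100, 100), (270, 0, 370, 100), (540, 0, 640, 100)]
scores = score_location_word(boxes, image_width=640, word="left")
best = max(range(len(boxes)), key=scores.__getitem__)
```

Because the score is soft rather than a hard cut-off, a box slightly off the left edge still receives credit, which is presumably the appeal of a Gaussian rule over a binary one.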
End-to-end multi-scale attention convolutional recurrent network for online handwritten Chinese text recognition
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications, Volume 282, Article 127626. Pub Date: 2025-04-19. DOI: 10.1016/j.eswa.2025.127626
Xiwen Qu, Zhihong Wu
Abstract: Convolutional recurrent network (CRN) based models have achieved excellent performance in online handwritten Chinese text recognition (OHCTR). However, existing CRN methods cannot directly process the chronological coordinate sequences of online handwritten Chinese text lines, overlook multi-scale local semantic context, and fail to capture multi-level dependencies between characters. To address these issues and further improve recognition performance, this paper proposes an end-to-end multi-scale attention convolutional recurrent network (EMACRN) for OHCTR. Concretely, an end-to-end multi-scale attention convolutional neural network directly extracts multi-scale local contextual features from the original chronological coordinate sequences. Then, bidirectional long short-term memory (BiLSTM) captures the correlation between the multi-scale local contextual features and obtains the temporal sequence features of the local context features. After the BiLSTM, multi-head attention is employed to weight its outputs. Finally, focal connectionist temporal classification (FCTC) is used to make predictions. Experiments on three public datasets demonstrate that EMACRN obtains a higher accuracy rate (AR) and correct rate (CR) with faster computation and lower storage cost than state-of-the-art OHCTR algorithms.
Citations: 0
Investigations into deep Reinforcement Learning for wind farm set-point optimisation
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications, Volume 282, Article 127627. Pub Date: 2025-04-19. DOI: 10.1016/j.eswa.2025.127627
Helen Sheehan, Daniel Poole, Telmo Silva Filho, Ervin Bossanyi, Lars Landberg
Abstract: Wake steering is a form of wind farm flow control in which upstream turbines are deliberately yawed to misalign with the incoming wind in order to reduce the impact of wakes on downstream turbines. This technique can give a net increase in the power generated by an array of turbines compared to standard greedy control, where each turbine acts for its own benefit by aligning with the incoming wind. However, optimising the set-points of multiple turbines under varying wind conditions can be prohibitively complex for traditional, white-box models. Reinforcement Learning (RL) agents learn optimal long-term behaviours through “trial-and-error”, making them suited to controlling arrays of wind turbines under changing wind conditions for maximum farm power. Related works applying RL to this problem have tended to concentrate on either single wind directions or ranges up to around ±10°. Here, the Deep Deterministic Policy Gradient algorithm has been used to train RL agents to control a nine-turbine array to implement wake steering under multiple wind directions between ±45°. While the agents were trained on steady-state (time-averaged) wind flow data, the performance of the final agent was tested on “quasi-dynamic” wind flow with varying wind direction. Under these conditions, the final agent achieved an average of 7% more power than greedy control per direction. This agent was then used to control the wind farm under a smaller subset of directions, including many not seen during training, gaining on average 17% additional farm power per direction compared to greedy control.
Citations: 0
Optimizing task offloading in IIoT via intelligent resource allocation and profit maximization in fog computing
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications, Volume 282, Article 127810. Pub Date: 2025-04-19. DOI: 10.1016/j.eswa.2025.127810
Chia-Cheng Hu
Abstract: The rapid growth of Internet of Things (IoT) technology has revolutionized industrial and manufacturing sectors, with the Industrial Internet of Things (IIoT) playing a central role in enhancing operational efficiency. However, IIoT applications are challenged by limited computational and power resources, which make Quality of Service (QoS) requirements harder to meet. While cloud computing alleviates some of these challenges, it introduces latency and server overload, leading to delays in task processing. Fog computing offers a promising solution by reducing latency and deploying computationally capable nodes at the network edge.
This paper proposes a novel framework for optimizing task offloading in IIoT environments by focusing on intelligent resource allocation and profit maximization within a fog computing architecture. Unlike traditional methods, our approach integrates a unified cost function that simultaneously addresses task delay and energy consumption, improving efficiency by balancing these conflicting objectives. We present an Integer Linear Programming (ILP) model that minimizes the total offloading cost while adhering to strict power and resource constraints. To handle the NP-hard nature of ILP problems, we introduce a computationally efficient approximation method based on rounding techniques, achieving near-optimal solutions without excessive computational overhead.
A key novelty of our work is the inclusion of profit maximization for IIoT application providers, which is often overlooked in existing solutions. We develop a second ILP model specifically for profit optimization, supported by an efficient solution method. Additionally, we propose a strategic resource expansion algorithm that adapts to insufficient system resources, ensuring the alignment of available resources with application demands. Our simulations demonstrate the practical impact of this approach, showing significant improvements in task processing time and energy efficiency, as well as improved profitability in real-world IIoT applications.
Citations: 0
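Illustrative sketch (not from the paper): the abstract describes a unified cost function that balances task delay against energy consumption under resource constraints. The snippet below shows one simple form such a cost could take (a weighted sum) together with a greedy capacity-aware assignment. The weighted-sum form, the `weight` parameter, the node fields, and the greedy heuristic are all assumptions for illustration; the paper itself uses an ILP model with a rounding-based approximation, which this sketch does not implement. In practice the delay and energy terms would also need to be normalised to comparable scales.

```python
def offloading_cost(delay_s, energy_j, weight=0.5):
    """Weighted sum of delay and energy; weight in [0, 1] favours delay."""
    return weight * delay_s + (1 - weight) * energy_j


def greedy_assign(tasks, nodes, weight=0.5):
    """Assign each task to the cheapest node that still has capacity.

    tasks: list of dicts with 'cycles' (CPU cycles required).
    nodes: list of dicts with 'speed' (cycles/s), 'power' (W) and
           'capacity' (total cycles the node can accept).
    Returns a list of node indices (None if no node can host the task).
    """
    remaining = [node["capacity"] for node in nodes]
    assignment = []
    for task in tasks:
        best, best_cost = None, float("inf")
        for i, node in enumerate(nodes):
            if remaining[i] < task["cycles"]:
                continue  # capacity constraint violated, skip this node
            delay = task["cycles"] / node["speed"]
            energy = node["power"] * delay
            cost = offloading_cost(delay, energy, weight)
            if cost < best_cost:
                best, best_cost = i, cost
        if best is not None:
            remaining[best] -= task["cycles"]
        assignment.append(best)
    return assignment


# Hypothetical fog nodes: node 0 is fast but power-hungry and small,
# node 1 is slow but frugal and large.
nodes = [
    {"speed": 2e9, "power": 10.0, "capacity": 3e9},
    {"speed": 1e9, "power": 2.0, "capacity": 1e10},
]
tasks = [{"cycles": 2e9}, {"cycles": 2e9}]

delay_first = greedy_assign(tasks, nodes, weight=0.9)
energy_first = greedy_assign(tasks, nodes, weight=0.1)
```

With a delay-heavy weight the first task goes to the fast node until its capacity runs out; with an energy-heavy weight both tasks go to the frugal node. This shows how a single weighted cost can trade off the two objectives.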
Deep learning-based financial risk early warning model for listed companies: A multi-dimensional analysis approach
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications, Volume 283, Article 127746. Pub Date: 2025-04-19. DOI: 10.1016/j.eswa.2025.127746
Pengyu Chen, Mingjun Ji
Abstract: This study proposes a novel deep learning-based approach for financial risk early warning in listed companies through a hierarchical attention network that integrates multi-dimensional data sources. Traditional financial risk prediction models often struggle with complex non-linear relationships and fail to effectively combine diverse information types. We develop a comprehensive framework that simultaneously processes financial statements, market trading data, and textual information through specialized neural network components. The model employs a two-level attention mechanism that dynamically weights both individual features and information sources, enabling interpretable risk assessment. Using data from 2,876 Chinese A-share listed companies from 2015 to 2024, our empirical analysis demonstrates that the proposed model achieves superior predictive performance (AUC-ROC: 0.873) compared to traditional statistical approaches (0.742–0.768) and conventional machine learning methods (0.812–0.845). The model provides early warning signals approximately 4.2 months before actual distress events, significantly outperforming benchmark models (2.3–3.7 months). Notably, the model maintains robust performance during market stress periods (accuracy: 0.798) compared to traditional models (accuracy: 0.678). The attention mechanism reveals that the relative importance of different risk indicators varies systematically with market conditions, with financial ratios dominating during stable periods (weight: 0.435) and market signals becoming more crucial during crises (weight: 0.412). These findings contribute to both the theoretical understanding of financial risk dynamics and practical risk management applications, while demonstrating the effectiveness of interpretable deep learning approaches in financial analysis.
Citations: 0
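Illustrative sketch (not from the paper): the abstract describes a two-level attention mechanism that weights individual features and then information sources. A minimal version of that idea, using plain softmax weights, might look like the following. The scores here are fixed toy numbers rather than learned parameters, and the function names and data layout are assumptions for illustration; the paper's network architecture is not reproduced.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def two_level_attention(sources):
    """Pool features within each source, then pool across sources.

    sources: dict mapping a source name to a tuple
             (feature_values, feature_scores, source_score).
    Returns (risk_score, source_weights), where source_weights sums to 1
    and can be inspected to see which source dominated.
    """
    names = list(sources)
    source_weights = softmax([sources[name][2] for name in names])
    pooled = {}
    for name in names:
        values, scores, _ = sources[name]
        feature_weights = softmax(scores)  # level 1: within-source weights
        pooled[name] = sum(w * v for w, v in zip(feature_weights, values))
    # Level 2: weight the pooled source representations.
    risk = sum(w * pooled[name] for w, name in zip(source_weights, names))
    return risk, dict(zip(names, source_weights))


# Hypothetical inputs: two sources with two risk indicators each.
sources = {
    "financial": ([0.2, 0.6], [0.0, 0.0], 0.0),
    "market": ([0.8, 0.4], [0.0, 0.0], 0.0),
}
risk, weights = two_level_attention(sources)
```

The returned source weights are what would make such a model inspectable: comparing them across market regimes is the kind of analysis the abstract reports.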
Hybrid path planning algorithm for robots based on modified golden jackal optimization method and dynamic window method
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications, Volume 282, Article 127808. Pub Date: 2025-04-18. DOI: 10.1016/j.eswa.2025.127808
Yuchao Wang, Kelin Tong, Chunhai Fu, Yuhang Wang, Qiuhua Li, Xingni Wang, Yunzhe He, Lijia Xu
Abstract: Traditional path planning algorithms still face significant challenges in large-scale scenarios with high-density irregular obstacles, such as low search efficiency, limited obstacle avoidance capabilities, and a tendency to get trapped in local optima. To overcome these challenges, a hybrid route planning algorithm combining the Modified Golden Jackal Optimization (MGJO) algorithm and the Improved Dynamic Window Approach (IDWA) is proposed. To avoid getting trapped in local optima and to enhance global search efficiency in global path planning, the MGJO algorithm combines nonlinear energy attenuation, diverse search strategies, and a guiding mechanism inspired by African vultures. To improve obstacle avoidance efficiency and ensure smoother local paths, the IDWA algorithm is redesigned by optimizing the obstacle distance evaluation function. In global path planning, the MGJO algorithm is evaluated against several state-of-the-art optimizers on 23 benchmark functions. In three different environments, the MGJO algorithm improves the average path length over the original algorithm by 10.76%, 16.72% and 25.46%. In local path planning experiments for mobile robots, the IDWA algorithm avoids local optima in small and medium-sized maps. In large maps, it significantly reduces the number of local optimum occurrences, from 6 to 2. The feasibility of the algorithm is validated in real-world experiments.
Citations: 0
Predicting wind turbines faults using Multi-Objective Genetic Programming
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications, Volume 281, Article 127487. Pub Date: 2025-04-18. DOI: 10.1016/j.eswa.2025.127487
Marwa Daaji, Mohamed-Amin Benatia, Ali Ouni, Mohamed Mohsen Gammoudi
Abstract: Wind turbines are a key component of renewable energy, converting wind into electricity with minimal environmental impact. Ensuring their continuous operation is crucial for maximizing energy production and reducing costly downtimes. To extend their operational lifespan, proactive maintenance strategies that predict and address potential faults are essential. While Machine Learning (ML) and Deep Learning (DL) algorithms have demonstrated significant promise in detecting wind turbine faults, they often prioritize maximizing the detection of failures without giving sufficient attention to false alarms. In practice, false alarms are just as problematic as undetected failures, as they reduce efficiency and waste resources. In this paper, we propose a novel optimization approach using Multi-Objective Genetic Programming (MOGP) to predict wind turbine faults. Our approach seeks to identify the best combination of features and their threshold values by optimizing two conflicting objectives: maximizing fault detection while minimizing false alarms. This dual-objective strategy ensures reliable fault prediction while minimizing unnecessary maintenance actions. We assess the effectiveness of our approach using real-world Supervisory Control and Data Acquisition (SCADA) data from a wind turbine in southern Ireland. The results demonstrate the efficiency of our approach in fault identification, achieving a competitive balance between recall (91%) and false positive rate (21%). While ML, specifically Random Forest (RF), shows promising performance with a recall of 91% and a 10% false alarm rate, it remains a black-box model. RF lacks interpretability, making it challenging to extract meaningful insights into the relationships between sensor features and fault occurrences.
Citations: 0
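Illustrative sketch (not from the paper): the abstract frames fault prediction as a trade-off between maximizing fault detection (recall) and minimizing false alarms (false positive rate). The snippet below computes both objectives for a few simple threshold rules on toy data and keeps only the non-dominated ones, which is the selection principle behind multi-objective search. The data, the single-feature threshold rules, and the function names are assumptions for illustration; no genetic programming is performed here.

```python
def recall_and_fpr(y_true, y_pred):
    """Return (recall, false positive rate) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    recall = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return recall, fpr


def dominates(a, b):
    """True if a = (recall, fpr) is no worse than b on both and better on one."""
    return a[0] >= b[0] and a[1] <= b[1] and (a[0] > b[0] or a[1] < b[1])


def pareto_thresholds(values, labels, thresholds):
    """Keep thresholds whose (recall, fpr) is not dominated by another."""
    objectives = {}
    for thr in thresholds:
        preds = [1 if v > thr else 0 for v in values]
        objectives[thr] = recall_and_fpr(labels, preds)
    return [
        thr for thr in thresholds
        if not any(dominates(objectives[other], objectives[thr])
                   for other in thresholds)
    ]


# Toy sensor readings with fault labels (1 = fault).
values = [0.2, 0.4, 0.5, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 1]
front = pareto_thresholds(values, labels, [0.3, 0.45, 0.6, 0.75])
```

On this toy data two thresholds survive: one that catches every fault at the price of some false alarms, and one that raises no false alarms but misses a fault. Choosing between them is a maintenance-policy decision rather than something the optimizer can settle.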
Data-driven algorithm for temperature predictions and corrections from low-resolution thermal images at fire scenes
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications, Volume 282, Article 127771. Pub Date: 2025-04-18. DOI: 10.1016/j.eswa.2025.127771
Yichuan Dong, Jian Jiang, Wei Chen, Jihong Ye
Abstract: Accurate and efficient temperature measurement at fire scenes is crucial for structural safety predictions and fire emergency responses. Compared with thermocouples, thermal images offer spatially resolved and stable measurements. A data-driven algorithmic system for temperature measurement is proposed, utilizing thermal images and comprising a sequence of resolution enhancements, temperature predictions, and error corrections. The system starts by transforming low-resolution images into super-resolution ones through convolutional neural networks (CNN) with hybrid scaling factors and attention fusion post-residual blocks. The temperatures are predicted from the super-resolution thermal images by cascade feedforward neural networks (CFNN) using a two-stage temperature division strategy. The errors of the temperature predictions are corrected by comparing results between thermal images and thermocouples. The effectiveness, influencing factors and optimization strategy of the proposed system are validated through a series of large-scale fire tests. The mean absolute errors of the temperature prediction models are within 20 °C, while over 70% of the error correction results are within ±30 °C. The proposed algorithm provides an effective tool to predict and correct temperature fields, aiming at fast and smart fire emergency decision-making.
Citations: 0
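Illustrative sketch (not from the paper): the abstract reports two kinds of figures, a mean absolute error and the share of results falling within a ± tolerance. Both are straightforward to compute, as sketched below. The temperature values are made-up toy numbers, and the function names are assumptions for illustration; they are not the paper's data or code.

```python
def mean_absolute_error(predicted, reference):
    """Average absolute difference between paired readings."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - r) for p, r in zip(predicted, reference))
    return total / len(predicted)


def fraction_within(predicted, reference, tolerance):
    """Share of readings whose absolute error is at most `tolerance`."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for p, r in zip(predicted, reference)
               if abs(p - r) <= tolerance)
    return hits / len(predicted)


# Toy example: image-based predictions vs. thermocouple readings (°C).
image_based = [100.0, 250.0, 400.0, 610.0]
thermocouple = [110.0, 240.0, 430.0, 650.0]

mae = mean_absolute_error(image_based, thermocouple)
share = fraction_within(image_based, thermocouple, tolerance=30.0)
```

Reporting both numbers is informative because a low mean error can hide a few large outliers, whereas the within-tolerance share shows how often an individual reading can be trusted.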
Fact retrieval from knowledge graphs through semantic and contextual attention
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications, Volume 282, Article 127612. Pub Date: 2025-04-18. DOI: 10.1016/j.eswa.2025.127612
Akhil Chaudhary, Enayat Rajabi, Somayeh Kafaie, Evangelos Milios
Abstract: Knowledge Graphs (KGs), such as DBpedia and ConceptNet, enhance Natural Language Processing (NLP) applications by providing structured information. However, extracting accurate data from KGs is challenging due to issues in entity detection, disambiguation, and relation classification, which often lead to errors and inefficiencies. We introduce Attention2Query (A2Q), an attention-driven approach that directly ranks and selects the most relevant facts, thus minimizing error propagation. A2Q centres on three key contributions: (1) Focused Node Selection, which streamlines graph traversal; (2) Global Attention Alignment, improving retrieval by comparing facts against the query text; and (3) Contextual Re-ranking, enabling on-the-fly adjustments of fact importance based on evolving query context. Experimental results across multiple tasks and datasets show that A2Q substantially outperforms baseline methods, including those in zero-shot settings, achieving higher retrieval accuracy with reduced computational overhead.
Citations: 0
SABA: Scale-adaptive Attention and Boundary Aware Network for real-time semantic segmentation
IF 7.5, Zone 1, Computer Science
Expert Systems with Applications, Volume 282, Article 127680. Pub Date: 2025-04-18. DOI: 10.1016/j.eswa.2025.127680
Huilan Luo, Chunyan Liu, Lik-Kwan Shark
Abstract: Balancing accuracy and speed is crucial for semantic segmentation in autonomous driving. While various mechanisms have been explored to enhance segmentation accuracy in lightweight deep learning networks, adding more mechanisms does not always lead to better performance and often significantly increases processing time. This paper investigates a more effective and efficient integration of three key mechanisms (context, attention, and boundary) to improve real-time semantic segmentation of road scene images. Based on an analysis of recent fully convolutional encoder-decoder networks, we propose a novel Scale-adaptive Attention and Boundary Aware (SABA) segmentation network. SABA enhances context through a new pyramid structure with multi-scale residual learning, refines attention via scale-adaptive spatial relationships, and improves boundary delineation using progressive refinement with a dedicated loss function and learnable weights. Evaluations on the Cityscapes benchmark show that SABA outperforms current real-time semantic segmentation networks, achieving a mean intersection over union (mIoU) of up to 76.7% and improving accuracy for 17 out of 19 object classes. Moreover, it achieves this accuracy at an inference speed of up to 83.4 frames per second, significantly exceeding real-time video frame rates. The code is available at https://github.com/liuchunyan66/SABA.
Citations: 0
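Illustrative sketch (not from the paper): the abstract reports accuracy as mean intersection over union (mIoU). For readers unfamiliar with the metric, the snippet below computes it from a confusion matrix over flattened pixel labels. The toy labels and function names are assumptions for illustration; benchmark toolkits such as the Cityscapes evaluation scripts apply additional conventions (for example ignored labels) that are not modelled here.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """cm[t][p] counts pixels with true class t predicted as class p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou(cm):
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    num_classes = len(cm)
    ious = []
    for c in range(num_classes):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp
        fp = sum(cm[r][c] for r in range(num_classes)) - tp
        denom = tp + fp + fn
        if denom:  # skip classes absent from both labels and predictions
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Six toy pixels over three classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
miou = mean_iou(cm)
```

Because each class contributes equally to the mean regardless of how many pixels it covers, mIoU penalises models that do well only on large classes such as road or sky.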