Applied Soft Computing: Latest Articles

ClipCap++: An efficient image captioning approach via image encoder optimization and LLM fine-tuning
IF 7.2, Zone 1, Computer Science
Applied Soft Computing, vol. 180, Article 113469. Pub Date: 2025-06-11. DOI: 10.1016/j.asoc.2025.113469
Ruiqin Wang, Ye Wu, Zhenzhen Sheng
Abstract: ClipCap (CLIP prefix for image captioning), a leading image captioning model, exhibits limitations in recognizing images within specific domains. This study presents ClipCap++, an enhanced version of ClipCap that integrates key-value pair and residual connection modules. The key-value pair module implements a few-shot learning strategy by incorporating domain-specific knowledge, thereby improving the model's capability to recognize specialized image categories. The residual connection module optimizes the weight distribution between the pre-trained model and the key-value pair module, enhancing the model's transfer learning performance. During the inference phase, the model processes an input image through a multi-stage pipeline: (1) the visual encoder extracts image features to generate a hard visual prompt, (2) the key-value pair module dynamically constructs a domain-specific soft prompt, and (3) these complementary prompts are jointly fed into the large language model to synthesize the final image description. Extensive experiments on in-domain, near-domain, and cross-domain tasks show that ClipCap++ surpasses state-of-the-art models in accuracy, training efficiency, and generalization.
Citations: 0
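The key-value pair and residual connection modules described in the abstract above can be pictured with a small sketch. This is only an illustration under stated assumptions: the abstract does not give the actual formulation, so the cosine-similarity lookup, the softmax weighting, and the names `soft_prompt_from_cache`, `blend_residual`, `temperature`, and `alpha` are all hypothetical and are not taken from the paper.

```python
import numpy as np

def soft_prompt_from_cache(query, keys, values, temperature=0.1):
    """Hypothetical key-value lookup: weight stored values by cosine similarity to the query."""
    q = query / np.linalg.norm(query)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    sims = k @ q                                   # one similarity per stored key
    weights = np.exp((sims - sims.max()) / temperature)
    weights /= weights.sum()                       # softmax over the stored keys
    return weights @ values                        # weighted mix of the stored values

def blend_residual(pretrained_feat, cache_feat, alpha=0.5):
    """Hypothetical residual mix: alpha sets how much of the cache output is added."""
    return pretrained_feat + alpha * cache_feat

# Usage: three few-shot keys with 2-d values; the query matches the first key.
keys = np.eye(3)
values = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
prompt = soft_prompt_from_cache(np.array([1.0, 0.0, 0.0]), keys, values, temperature=0.01)
mixed = blend_residual(np.array([1.0, 2.0]), np.array([2.0, 2.0]), alpha=0.5)
```

With a small temperature the lookup behaves almost like nearest-neighbour retrieval, while `alpha` plays the role of the weight between the pre-trained features and the few-shot cache.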
Heuristic information-rich evolutionary modelling for engine soft sensors of hybrid electric vehicles
IF 7.2, Zone 1, Computer Science
Applied Soft Computing, vol. 180, Article 113468. Pub Date: 2025-06-11. DOI: 10.1016/j.asoc.2025.113468
Ji Li, Xu He, Quan Zhou, Carl Anthony, Bo Wang, Guoxiang Lu, Hongming Xu
Abstract: Under the explosive demand of the electrified powertrain market, modelling schemes with strong robustness, low cost, and fast implementation are urgently required for hybrid vehicle engine development. This paper presents a data-driven holistic solution, the heuristic information-rich warm-start evolutionary modelling framework, which integrates heuristic information-rich feature selection for engine soft sensors (fuel consumption, thermal efficiency, and volumetric efficiency). Five filter methods are developed as heaters, and their selected features are converted to warm up the initialisation process in the evolutionary modelling, alleviating the inefficient exploration and local-optimum problems caused by the pseudo-random initialisation of a single wrapper during the optimisation process. Meanwhile, a new factor, heuristic information richness, is introduced to determine and adjust the proportion of filter particles. It accelerates evolutionary convergence through the guidance of filter information and avoids local optima through free exploration by the particles without filter information, achieving a balance between computational efficiency and global search capability. Validated on the test bench of a BYD 1.5 L naturally aspirated engine specially made for a hybrid powertrain, the Lasso method is the best heater and helps the proposed framework reduce mean squared error by up to 54.9% compared with the cold-start framework. Compared to industry-used modelling frameworks, the proposed one achieves equivalent prediction performance while reducing the database size by up to 85%.
Citations: 0
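The warm-start idea in the abstract above, where a share of the initial particles is seeded from filter-selected features and the rest is left random, can be sketched as follows. This is a minimal illustration, assuming binary feature masks; the function name `warm_start_population` and the parameter `richness` are hypothetical stand-ins, and the paper's actual conversion of filter outputs into particles is not described in the abstract.

```python
import numpy as np

def warm_start_population(filter_masks, n_particles, richness, rng):
    """Build a binary feature-mask population for an evolutionary wrapper.

    filter_masks: (n_filters, n_features) 0/1 array, one row per filter method.
    richness: fraction of particles seeded from filter results (hypothetical name).
    """
    n_features = filter_masks.shape[1]
    n_warm = int(round(richness * n_particles))
    # Cold part: every particle starts as a pseudo-random mask.
    population = rng.integers(0, 2, size=(n_particles, n_features))
    # Warm part: overwrite the first n_warm particles with filter masks, cycling through filters.
    for i in range(n_warm):
        population[i] = filter_masks[i % len(filter_masks)]
    return population, n_warm

# Usage: two filter results over four features, 40% of ten particles warm-started.
rng = np.random.default_rng(0)
masks = np.array([[1, 0, 1, 0], [0, 1, 1, 0]])
population, n_warm = warm_start_population(masks, n_particles=10, richness=0.4, rng=rng)
```

Setting `richness` to 0 recovers a purely random (cold-start) initialisation, and setting it to 1 removes the free-exploration particles entirely.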
Intelligent decision support system for multi-objective 3D container loading using genetic algorithm combined with artificial bee colony
IF 7.2, Zone 1, Computer Science
Applied Soft Computing, vol. 180, Article 113473. Pub Date: 2025-06-10. DOI: 10.1016/j.asoc.2025.113473
Suriya Phongmoo, Komgrit Leksakul, Chaichana Suedumrong, Chakkrapong Kuensaen
Abstract: Efficient container loading is a complex and critical logistics challenge, especially when dealing with strongly heterogeneous boxes in three dimensions. This study proposes an intelligent decision support system that addresses the 3D Single Container Loading Problem (3D-SCLP) using a hybrid meta-heuristic approach combining Genetic Algorithm (GA) and Artificial Bee Colony (ABC). The system introduces rotation constraints as a decision variable and optimizes for two objectives: maximizing profit and minimizing unused space. A mathematical model based on the bottom-left fill (BLF) method was developed to ensure feasible loading with non-overlapping placements and valid rotations. Experimental results on 15 real-world and 225 synthetic test cases demonstrate the superiority of the proposed GA+ABC method over standalone algorithms in both solution quality and robustness. The system achieves the lowest hypervolume metric (119.28), indicating better convergence to Pareto-optimal fronts, and provides practical feasibility for real-world logistics optimization.
Citations: 0
DFSMCG-Net: A Siamese change detection network based on Differential Feature Selection and Multi-Scale Guidance Strategies
IF 7.2, Zone 1, Computer Science
Applied Soft Computing, vol. 180, Article 113372. Pub Date: 2025-06-10. DOI: 10.1016/j.asoc.2025.113372
Hang Xue, Ke Liu, Caiyi Huang, Xianhong Meng
Abstract: Change detection technology effectively identifies surface changes but encounters significant challenges, including class imbalance between foreground and background and interference from pseudo-changes caused by factors such as illumination variations and geometric distortions. We propose a location-sensitive Differential Feature Selection and Multi-Scale Change Feature Guidance Network (DFSMCG-Net) to address these issues. DFSMCG-Net introduces a Differential Feature Selection Module (DFSM) that leverages the spatial location information of bi-temporal features. This module captures spatiotemporal differential features at the exact location along the X-axis and Y-axis and integrates these features through cross-fusion to establish long-range pixel dependencies. The resulting multi-level differential features provide the network with a detailed temporal context for detecting changes. To further enhance the fusion of multi-level differential features and suppress interference from non-differential features, we develop a Multi-Scale Change Feature Guidance Module (MCFGM) based on a multi-head self-attention mechanism. This module assigns each attention head a distinct non-overlapping window, dynamically adjusting window sizes according to the feature map dimensions. This approach facilitates the integration of multi-scale differential features, improving the network’s capacity to represent change-related features. Experimental results demonstrate that the proposed DFSMCG-Net performs significantly better than state-of-the-art methods on benchmark datasets, including LEVIR-CD, CDD, SYSU-CD and S2Looking. The model is particularly effective in mitigating pseudo-change phenomena under conditions of extreme class imbalance.
Citations: 0
A bilayer segmentation-recombination network for accurate segmentation of overlapping C. elegans
IF 7.2, Zone 1, Computer Science
Applied Soft Computing, vol. 180, Article 113459. Pub Date: 2025-06-10. DOI: 10.1016/j.asoc.2025.113459
Mengqian Ding, Jun Liu, Yang Luo, Jinshan Tang
Abstract: Caenorhabditis elegans (C. elegans) is an excellent model organism because of its short lifespan and high degree of homology with human genes, and it has been widely used in a variety of human health and disease models. However, the segmentation of C. elegans remains challenging for two reasons: 1) the activity trajectory of C. elegans is uncontrollable, and multiple nematodes often overlap, resulting in blurred boundaries, which makes it impossible to clearly study the life trajectory of an individual nematode; and 2) in microscope images of overlapping C. elegans, the translucent tissues at the edges obscure each other, leading to inaccurate boundary segmentation. To solve these problems, a Bilayer Segmentation-Recombination Network (BR-Net) for the segmentation of C. elegans instances is proposed. The network consists of three parts: a Coarse Mask Segmentation Module (CMSM), a Bilayer Segmentation Module (BSM), and a Semantic Consistency Recombination Module (SCRM). The CMSM extracts the coarse mask, and a United Attention Module (UAM) is introduced in the CMSM to make it better aware of nematode instances. The BSM segments the aggregated C. elegans into overlapping and non-overlapping regions. These regions are then integrated by the SCRM, where semantic consistency regularization is introduced to segment nematode instances more accurately. Finally, the effectiveness of the method is verified on the C. elegans dataset. The experimental results show that BR-Net is competitive and outperforms other recently proposed segmentation methods in processing C. elegans occlusion images.
Citations: 0
Regularizing Model Predictive Control for pixel-based long-horizon tasks
IF 7.2, Zone 1, Computer Science
Applied Soft Computing, vol. 181, Article 113377. Pub Date: 2025-06-10. DOI: 10.1016/j.asoc.2025.113377
Yao-Hui Li, Feng Zhang, Qiang Hua, Chun-Ru Dong
Abstract: Planning has proven to be an effective strategy for handling complex tasks. However, due to the constraints of computational budget and accumulated model biases, planning for pixel-based long-horizon tasks with limited samples remains a great challenge. To address this issue, this study proposes Regularized Model Predictive Control (RMPC). RMPC performs trajectory optimization using short-term reward estimates and long-term return estimates, which avoids the high burden of long-horizon planning. Additionally, an implicit regularization mechanism is employed to improve the robustness of the generated environment model and the reliability of the value function estimation, which helps to reduce the risk of accumulated model biases. Extensive comparison experiments and ablation studies are performed on benchmark datasets to evaluate the proposed RMPC. Empirical results show that RMPC outperforms previous SOTA algorithms in terms of sample efficiency (20.88% performance improvement) and model stability (56.39% standard deviation reduction) on pixel-based continuous control tasks from the DMControl-100k benchmark. Our code is available at: https://github.com/Arya87/RMPC.
Citations: 0
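The combination of short-term reward estimates with a long-term return estimate mentioned in the abstract above is commonly written as a discounted reward sum over a short horizon plus a bootstrapped value at the end of that horizon. The sketch below assumes that generic form; the names `trajectory_score` and `pick_best` are hypothetical and do not come from the RMPC repository, and the regularization mechanism is not modelled here.

```python
def trajectory_score(rewards, terminal_value, gamma=0.99):
    """Discounted sum of predicted rewards plus a discounted value estimate at the horizon."""
    score = 0.0
    discount = 1.0
    for r in rewards:
        score += discount * r
        discount *= gamma
    # After H rewards, discount equals gamma**H, which weights the long-term estimate.
    return score + discount * terminal_value

def pick_best(candidates, gamma=0.99):
    """candidates: list of (action_sequence, predicted_rewards, terminal_value) tuples."""
    return max(candidates, key=lambda c: trajectory_score(c[1], c[2], gamma))[0]

# Usage: plan "b" collects no short-term reward but ends in a more valuable state.
candidates = [("a", [1.0, 1.0], 0.0), ("b", [0.0, 0.0], 10.0)]
best = pick_best(candidates, gamma=0.5)
```

Because the value term stands in for everything beyond the horizon, the planner only has to roll the model forward a few steps, which is the source of the reduced planning burden the abstract refers to.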
Deep learning: Historical overview from inception to actualization, models, applications and future trends
IF 7.2, Zone 1, Computer Science
Applied Soft Computing, vol. 181, Article 113378. Pub Date: 2025-06-10. DOI: 10.1016/j.asoc.2025.113378
Olufisayo S. Ekundayo, Absalom E. Ezugwu
Abstract: Deep learning stands at the forefront of contemporary machine learning techniques and is well-known for its outstanding predictive accuracy, adaptability to data variability, and remarkable ability to generalize across diverse domains. These attributes have spurred rapid progress and the emergence of novel iterations within the discipline. Yet, this swift evolution often obscures the foundational breakthroughs, with even trailblazing researchers at risk of fading into obscurity despite their seminal contributions. This study aims to provide a historical narrative of deep learning, tracing its origins from the cybernetic era to its current state-of-the-art status. We critically examine the contributions of individual pioneer scholars who have profoundly influenced the development of deep neural networks under the taxonomy of supervised, unsupervised, and reinforcement learning. Furthermore, the study discusses the trending deep neural network architectures, explaining their operational principles, confronting associated challenges, exploring real-world applications, and outlining potential future trajectories that could offer a starting point for aspiring researchers in the field.
Citations: 0
Multilevel probabilistic wind power forecasting using an adaptive Informer network
IF 7.2, Zone 1, Computer Science
Applied Soft Computing, vol. 180, Article 113460. Pub Date: 2025-06-10. DOI: 10.1016/j.asoc.2025.113460
Sen Xie, Yuyang Hua, Shan Lu, Xin Jin
Abstract: Effective and feasible wind power forecasting is critical to the resource allocation and safe control of the power system. Nevertheless, the volatility and randomness of wind speed lead to deviations in actual wind power output. Therefore, a multilevel probabilistic wind power forecasting strategy using an adaptive Informer network is developed. To separate the long-term trend and periodic fluctuation of the raw series, wind power is first decomposed into equal-length sequences of multilevel frequencies through the maximal overlap discrete wavelet transform (MODWT). A piecewise adaptive loss function and an activation function for a large range are then incorporated into a novel Informer network, and the inherent structure and nonlinear features at each frequency are extracted with two encoder layers and one decoder layer. Moreover, ensemble batch prediction intervals (EnbPI) are used to extend the deterministic forecast to probabilistic information. Finally, a historical dataset from an offshore wind power system in Belgium is used to verify the forecasting performance; quantitative analysis shows that the model achieves a mean absolute error of 2.5% and a root mean squared error of 3.8%. The developed strategy handles the volatility and complexity of wind data, providing reliable support for real wind power plants.
Citations: 0
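The step from a deterministic forecast to an interval, which the abstract above attributes to EnbPI, can be illustrated with a much simpler residual-quantile construction. This sketch is not EnbPI itself: it skips the bootstrap ensemble and the leave-one-out residuals, and the function name `residual_interval` and the parameter `alpha` are assumptions made for illustration only.

```python
import numpy as np

def residual_interval(point_forecast, past_residuals, alpha=0.1):
    """Symmetric interval whose half-width is the (1 - alpha) quantile of past absolute errors."""
    half_width = np.quantile(np.abs(past_residuals), 1.0 - alpha)
    return point_forecast - half_width, point_forecast + half_width

# Usage: four past forecast errors and a new point forecast of 10.0.
lower, upper = residual_interval(10.0, [-1.0, 2.0, -3.0, 4.0], alpha=0.5)
```

A smaller `alpha` widens the interval, trading sharpness for coverage.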
Adaptive feature mixing with Vision Transformers for clinical image analysis
IF 7.2, Zone 1, Computer Science
Applied Soft Computing, vol. 181, Article 113259. Pub Date: 2025-06-10. DOI: 10.1016/j.asoc.2025.113259
Susmita Ghosh, Swagatam Das
Abstract: The Vision Transformer (ViT) is an adaptation of the Transformer architecture that shows promise in image classification. However, limited training samples and the complex attributes of clinical images hinder its performance in identifying medical conditions from them. To address this challenge, we propose a modified ViT architecture called ReMixViT by incorporating an efficient MLP-Mixer layer and reordering the residual blocks within the encoder block. This modification improves feature mixing and enhances the model’s generalization ability. Additionally, we design two hybrid architectures, Res-ReMixViT and Res-ReMixViT+, by integrating a Convolutional Neural Network (ResNet50) and ReMixViT encoder blocks, considering feature maps of single and multiple scales, respectively. We evaluated the proposed architectures using six diverse medical imaging datasets with varying modalities and medical conditions. Our comparative study reveals that the ReMixViT and hybrid models outperform the vanilla ViT models and hybrid models with ViT encoder blocks, respectively, based on widely accepted performance measures. Specifically, we observe improvements of 4.62% and 3.08% in F1-score. Moreover, when combined with data augmentation algorithms, the proposed hybrid architectures surpass other state-of-the-art hybrid networks. In addition to performance evaluation, we provide visual explanations through attention maps and the gradient flow of our model. These visual explanations contribute to the interpretability of the Artificial Intelligence (AI) system, assisting medical practitioners in drawing inferences from an explainable AI perspective. Finally, an extended study demonstrates that the proposed modifications can be successfully adapted to other vision transformer architectures, resulting in enhanced performance.
Citations: 0
A scalable monocular 3D detector with Superpixel Feature Pyramid Network
IF 7.2, Zone 1, Computer Science
Applied Soft Computing, vol. 180, Article 113389. Pub Date: 2025-06-10. DOI: 10.1016/j.asoc.2025.113389
Dongliang Ma, Fang Zhao, Ye Li, Xin Qu, Xin Jiang, Hao Wu, Xi Chen, Min Liu
Abstract: Monocular 3D object detection plays a pivotal role in vehicle perception systems. Current methods frequently struggle to effectively extract scene-level semantic information, and the availability of monocular 3D detectors tailored to diverse embedded devices with varying computing power may still be limited. This paper introduces MonoYolo, a scalable detector designed for practicality and efficiency under varying resource constraints. In particular, we design a Superpixel Feature Pyramid Network (SFPN) that automatically groups pixels with similar attributes together. Experimental results on the KITTI and nuScenes datasets show that the large MonoYolo models compare favorably with strong monocular detectors, while the lightweight model maintains real-time detection capabilities. Meanwhile, the proposed SFPN integrates seamlessly into existing image-only 3D detectors, offering a plug-and-play solution for enhanced monocular 3D object detection performance.
Citations: 0