{"title":"RTL-Net: real-time lightweight Urban traffic object detection algorithm","authors":"Zhiqing Cui, Jiahao Yuan, Haibin Xu, Yamei Wei, Zhenglong Ding","doi":"10.1007/s40747-025-01875-z","DOIUrl":"https://doi.org/10.1007/s40747-025-01875-z","url":null,"abstract":"<p>Object detection algorithms for urban traffic using remote sensing images often suffer from high complexity, low real-time performance, and low accuracy. To address these challenges, we propose RTL-Net, an urban traffic object detection network structure based on You Only Look Once (YOLO) v8s. To enhance real-time performance beyond the benchmark, we implemented lightweight designs for the loss function, backbone, neck, and head components. Firstly, a Powerable-IoU (PIoU) loss function was introduced to make the algorithm more suitable for targets of different scales and to reduce false detections. Secondly, the original detection head was replaced with the Lightweight Shared Convolutional Detection (LSCD) head to preserve detection performance while making the algorithm significantly more lightweight. Additionally, this paper introduces the Dilation-wise Residual (DWR) module to facilitate the algorithm’s extraction of detailed features. Furthermore, we optimize the Bidirectional Feature Pyramid Network (Bi-FPN), enabling the fusion of multiple features to improve overall feature integration and performance. The VisDrone2021 dataset was used for experimental training. Experimental results demonstrate that the proposed algorithm achieves a 43.9% reduction in parameters and an 18.9% decrease in computational complexity. Moreover, detection accuracy improves by 2.3%, while a real-time detection speed of 263.2 frames per second is maintained. 
For edge computing object detection, our method outperforms YOLOv8s and leading remote sensing algorithms in both speed and accuracy, achieving state-of-the-art performance among single-stage detectors with comparable parameters.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"45 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2025-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144137184","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A lightweight mechanism for vision-transformer-based object detection","authors":"Yanming Ye, Qiang Sun, Kailong Cheng, Xingfa Shen, Dongjing Wang","doi":"10.1007/s40747-025-01904-x","DOIUrl":"https://doi.org/10.1007/s40747-025-01904-x","url":null,"abstract":"<p>DETR (DEtection TRansformer) is a CV model for object detection that replaces traditional complex methods with a Transformer architecture, and has achieved significant improvement over previous methods, particularly in handling small and medium-sized objects. However, the attention mechanism-based detection framework of DETR exhibits limitations in small and medium-sized object detection. It struggles to extract fine-grained details of small and medium-sized objects from low-resolution features, and its computational complexity increases significantly with the input scale, thereby constraining real-time detection efficiency. To address these limitations, we introduce the Cross Feature Attention (XFA) mechanism and propose XFCOS (XFA-based with FCOS), a novel object detection model built upon it. XFA simplifies the attention mechanism’s computational process and reduces complexity through L2 normalization and two one-dimensional convolutions applied in different directions. This design reduces the computational complexity from quadratic to linear while preserving spatial context awareness. XFCOS enhances the original TSP-FCOS (Transformer-based Set Prediction with FCOS) model by integrating XFA into the transformer encoder, creating a CNN-ViT hybrid architecture that significantly reduces computational costs without sacrificing accuracy. Extensive experiments demonstrate that XFCOS achieves state-of-the-art performance while addressing DETR’s convergence and efficiency limitations. On Pascal VOC 2007, XFCOS attains 54.7 AP and 60.7 AP<span>\\(_{\\textrm{75}}\\)</span>, surpassing DETR by 4.5 AP and 7.9 AP<span>\\(_{\\textrm{75}}\\)</span>, respectively, and establishing new benchmarks among ResNet-50-based detectors. 
The model shows particular strength in small object detection, achieving 24.0 AP<span>\\(_{\\textrm{S}}\\)</span> and 43.9 AP<span>\\(_{\\textrm{M}}\\)</span> on COCO 2017, representing 3.3 AP<span>\\(_{\\textrm{S}}\\)</span> and 3.8 AP<span>\\(_{\\textrm{M}}\\)</span> improvements over DETR. Through computational optimization, XFCOS reduces encoder FLOPs to 13.5G, representing a 17.2% decrease compared to TSP-FCOS’s 16.3G, while cutting activation memory from 285.78M to 264.64M, a reduction of 7.4%. This significantly enhances computational efficiency.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"89 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2025-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144113745","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Contrastive learning of cross-modal information enhancement for multimodal fake news detection","authors":"Weijie Chen, Fei Cai, Yupu Guo, Zhiqiang Pan, Wanyu Chen, Yijia Zhang","doi":"10.1007/s40747-025-01919-4","DOIUrl":"https://doi.org/10.1007/s40747-025-01919-4","url":null,"abstract":"<p>With the rapid development of the Internet, the existence of fake news and its rapid spread have brought many negative effects to society. Consequently, the fake news detection task has become increasingly important over the past few years. Existing methods are predominantly unimodal or rely on multimodal representations built by fusing unimodal features. However, the large number of model parameters and the interference of noisy data increase the risk of overfitting. Thus, we construct an information enhancement and contrastive learning framework by introducing an Improved Low-rank Multimodal Fusion approach for Fake News Detection (ILMF-FND), which aims to reduce noise interference and achieve efficient fusion of multimodal feature vectors with fewer parameters. In detail, an encoder extracts the feature vectors of text and images, which are subsequently refined using the Multi-gate Mixture-of-Experts. The refined features are mapped into the same space for semantic sharing. Then, a cross-modal fusion is performed, yielding an efficient, high-precision fusion of text and image features with fewer parameters. Besides, we design an adaptive mechanism that can adjust the weights of the final components according to the modal fitness before inputting them into the classifier to achieve the best detection results in the current state. We evaluate the performance of ILMF-FND and the competitive baselines on two public datasets, i.e., Twitter and Weibo. 
The results indicate that our ILMF-FND greatly reduces the number of parameters while outperforming the best baseline in terms of accuracy by 0.2% and 1.1% on the Weibo and Twitter datasets, respectively.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"18 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2025-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144113744","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A classifier-assisted evolutionary algorithm with knowledge transfer for expensive multitasking problems","authors":"Min Hu, Zhigang Ren, Zhirui Cao, Yifeng Guo, Haitao Sun, Hongyao Zhou, Yu Guo","doi":"10.1007/s40747-025-01908-7","DOIUrl":"https://doi.org/10.1007/s40747-025-01908-7","url":null,"abstract":"<p>Surrogate-assisted evolutionary algorithms provide an effective means for complex and computationally expensive optimization problems. However, due to the scarcity of training samples, the prediction accuracy of frequently-used regression surrogate models can hardly be guaranteed as the difficulty of the problem increases, resulting in performance degradation of the whole algorithm. Since real-world problems rarely exist in isolation, it is expected to alleviate the above issue by properly exploiting the knowledge shared across different problems. In this context, this study proposes a novel evolutionary multitasking optimization algorithm assisted by a classifier rather than a regression model for expensive multitasking problems, where the accuracy of the classifier is boosted by knowledge transfer. To be specific, a support vector classifier (SVC) is first developed and integrated into a classic evolutionary algorithm, i.e., covariance matrix adaptation evolution strategy (CMA-ES). With a low computational cost, it helps CMA-ES to prescreen parent solutions from the current population. Following that, a knowledge transfer strategy is designed to enrich the training samples for each task-oriented classifier by sharing high-quality solutions among different tasks, where a PCA-based subspace alignment technique is employed. 
Extensive experiments indicate that the SVC-assisted CMA-ES gains significant superiority over general CMA-ES in terms of both robustness and scalability, and the knowledge transfer strategy further helps it earn a competitive edge over some state-of-the-art algorithms on expensive multitasking optimization problems.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"8 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2025-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144088331","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Utilizing weak graph for edge consolidation-based efficient enhancement of network robustness","authors":"Wei Ding, Zhengdan Wang","doi":"10.1007/s40747-025-01922-9","DOIUrl":"https://doi.org/10.1007/s40747-025-01922-9","url":null,"abstract":"<p>Network robustness can be effectively augmented through edge safeguarding, especially when topology modification is not feasible. Although approximation algorithms are used due to the intrinsic hardness of the problem, when the connectivity of the initial graph is adjusted to the desired value, the connectivity of the concealed weak graph is escalated to a maximum level. Consequently, a substantial number of extra safeguarded edges are incorporated. To address this issue, we propose a novel concept called <i>K</i>-cut-segmentation that has never been used in any previous work. We then demonstrate that applying the <i>K</i>-cut-segmentation to the weak graph can bring the connectivity of the weak graph back to the expected <i>K</i>. Thus, by consolidating fewer edges, the connectivity of the original graph can be maintained. The algorithm then extracts the weak graph and discovers a superior solution by constructing a partial minimum cost spanning tree. We compare the proposed algorithm with optimal and approximate algorithms across graphs of varying scales. The outcomes indicate that, for small graphs where the optimal algorithm is applicable, the algorithm achieves 100% consolidation efficacy. Solving speed is increased by up to 5 orders of magnitude, while only incurring an additional cost of approximately 3%. 
In large-scale graphs with one million nodes, under the same computational time, it can cut down on the consolidation cost by nearly 60% compared to existing algorithms, and the consolidation precision remains consistently high across different graph instances.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"32 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2025-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144088329","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Path planning method for maritime dynamic target search based on improved GBNN","authors":"Zhaozhen Jiang, Xuehai Sun, Wenlon Wang, Shuzeng Zhou, Qiang Li, Lianglong Da","doi":"10.1007/s40747-025-01914-9","DOIUrl":"https://doi.org/10.1007/s40747-025-01914-9","url":null,"abstract":"<p>To address the issues of low discovery probability, inefficient search, and antagonistic targets during the process of dynamic target search in the ocean, a dynamic target search path planning method based on the Glasius biologically-inspired neural network (GBNN) in combination with marine environmental information is proposed. Firstly, the motion model of the searcher and the capability model of sonar detection are established, and the dynamic motion characteristics of the target are analyzed. The Beta distribution is employed to characterize the variation of the target velocity, and the distribution probability map of the target position alterations over time is obtained. Then GBNN is presented and the marine environment information is integrated to enhance the calculation approach of the internal connection weights of the network. Moreover, the update rule of the activity value of the neural network is reconfigured. The influence of the peak of the dynamic target distribution probability on the activity value of the neuron is regarded as the external incentive element. According to the turning limitation of the searcher and the activity of GBNN neurons, the search path points are determined smoothly. 
The paper's algorithm, validated through 10,000 Monte Carlo simulations with real maritime data, significantly outperforms traditional search methods in discovery probability and search efficiency.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"55 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2025-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144088334","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint feature representation optimization and anti-occlusion for robust multi-vessel tracking in inland waterways","authors":"Shenjie Zou, Jin Liu, Xiliang Zhang, Zhongdai Wu, Jing Liu, Bing Han","doi":"10.1007/s40747-025-01918-5","DOIUrl":"https://doi.org/10.1007/s40747-025-01918-5","url":null,"abstract":"<p>Multiple vessel tracking plays a vital role in maritime surveillance systems. Previous studies have typically integrated object detection and trajectory association techniques to address this problem, but they still face some significant challenges. On one hand, these methods are susceptible to losing tracked targets due to long-term occlusion by other obstacles or slow-moving vessels in inland waterways. Moreover, traditional models encounter difficulties in accurately capturing the global appearance features of the vessels in images, which leads to a decline in vessel detection performance. To address the issues above, this paper proposes a novel Vessel Status Augmented Track (VSATrack) framework for multi-vessel detection and tracking. Specifically, we present a Motion-Matching Optimization Module (MMOM), which handles long-term occlusion through identity matching between consecutive frames. Besides, a vessel feature enhancement module (VFEM) with several residual convolutional layers and channel reconstruction units (CRU) is designed to effectively capture vessel features in complex inland waterway backgrounds without introducing redundant channel information. Finally, a bidirectional feature pyramid network (BiFPN) is utilized to fuse vessel appearance features from different scales, enhancing the capability to learn cross-scale features of vessels to some extent. 
Experimental results demonstrate that our VSATrack method outperforms the state-of-the-art methods, particularly in reducing the number of vessel ID switches (IDSW).</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"55 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2025-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144088327","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AI-enabled driver assistance: monitoring head and gaze movements for enhanced safety","authors":"Sayyed Mudassar Shah, Gan Zengkang, Zhaoyun Sun, Tariq Hussain, Khalid Zaman, Abdullah Alwabli, Amar Y. Jaffar, Farman Ali","doi":"10.1007/s40747-025-01897-7","DOIUrl":"https://doi.org/10.1007/s40747-025-01897-7","url":null,"abstract":"<p>This paper introduces a real-time head-pose detection and eye-gaze estimation system for Automatic Driver Assistance Technology (ADAT) aimed at enhancing driver safety by accurately collecting and transmitting data on the driver’s head position and eye gaze to mitigate potential risks. Existing methods are constrained by significant limitations, including reduced accuracy under challenging conditions such as varying head orientations and lighting, higher latency in real-time applications (e.g., Faster-RCNN and TPH-YOLOv5), and computational inefficiency, which hinders their deployment in resource-constrained environments. To address these challenges, we propose a novel framework using the Transformer Detection of Gaze Head - YOLOv7 (TDGH-YOLOv7) object detector. The key contributions of this work include the development of a reference image dataset encompassing diverse vertical and horizontal gaze positions alongside the implementation of an optimized detection system that achieves state-of-the-art performance in terms of accuracy and latency. The proposed system achieves superior precision, with a weighted accuracy of 95.02% and Root Mean Square Errors of 2.23 and 1.68 for vertical and horizontal gaze estimation, respectively, validated on the MPII-Gaze and DG-Unicamp datasets. A comprehensive comparative analysis with existing models, such as CNN, SSD, Faster-RCNN, and YOLOv8, underscores the robustness and efficiency of the proposed approach. 
Finally, the implications of these findings are discussed, and potential avenues for future research are outlined.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"133 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2025-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144088328","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GCN and GAT-based interpretable knowledge tracing model","authors":"Yujia Huo, Menghong He, Xue Tan, Kesha Chen","doi":"10.1007/s40747-025-01921-w","DOIUrl":"https://doi.org/10.1007/s40747-025-01921-w","url":null,"abstract":"<p>Knowledge tracing (KT) aims to predict students’ future performance by assessing their level of knowledge mastery from past problem-solving records. However, many existing methods fail to take full advantage of the potential relationship between questions and skills, or fail to effectively utilize students’ historical learning data, which makes it difficult to accurately capture individualized mastery for each question. In addition, redundancy in long sequential information often leads to model overfitting, and existing deep knowledge tracing models have significant limitations in terms of the interpretability of their predictions. To address these issues, we propose GCAKT, an interpretable KT model that focuses on student problem mastery. GCAKT generates personalized problem representations by modeling students’ historical learning information at a fine-grained level, and learns these representations jointly through a problem-skill embedding module and a personalized problem mastery module. To cope with the redundancy generated by long sequences, GCAKT employs an attention-based knowledge evolution module that constructs a final hidden knowledge state by analyzing the attention relationship between the student’s hidden knowledge state and the problem at each point in time. Meanwhile, GCAKT utilizes the attention weights to construct interpretable paths, helping to provide interpretable prediction results. 
Experimental results on three publicly available real-world educational datasets show that GCAKT outperforms traditional methods in terms of both prediction accuracy and interpretability.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"150 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2025-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144088330","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multiobjective integrated scheduling of disassembly and reprocessing operations considering product structures and stochastic processing time via reinforcement learning-based evolutionary algorithms","authors":"Yaping Fu, Fuquan Wang, Zhengyuan Li, Guangdong Tian, Duc Truong Pham, Hao Sun","doi":"10.1007/s40747-025-01907-8","DOIUrl":"https://doi.org/10.1007/s40747-025-01907-8","url":null,"abstract":"<p>Remanufacturing has become a mainstream sustainable manufacturing paradigm for energy conservation and environmental protection. Disassembly and reprocessing operations are two main activities in remanufacturing. This work proposes multiobjective integrated scheduling of disassembly and reprocessing operations considering product structures and random processing time. First, a stochastic programming model is developed to minimize maximum completion time and total tardiness. Second, a reinforcement learning-based multiobjective evolutionary algorithm is devised considering problem-specific knowledge. Three search strategy combinations are formed: crossover and mutation, crossover and key product-based iterated local search, mutation and key product-based iterated local search. At each iteration, a Q-learning method is devised to intelligently choose a combination of premium strategies. A stochastic simulation is incorporated to evaluate the objective values of the searched solutions. Finally, the formulated model and method are compared with an exact solver, CPLEX, and three well-known metaheuristics from the literature on a set of test instances. 
The results confirm the excellent competitiveness of the developed model and algorithm for solving the considered problem.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"18 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2025-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144067262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}