Fernando Martínez-Plumed, Gonzalo Jaimovitch-López, Cèsar Ferri, María José Ramírez-Quintana, José Hernández-Orallo
{"title":"A general supply-inspect cost framework to regulate the reliability-usability trade-offs for few-shot inference","authors":"Fernando Martínez-Plumed, Gonzalo Jaimovitch-López, Cèsar Ferri, María José Ramírez-Quintana, José Hernández-Orallo","doi":"10.1007/s40747-024-01599-6","DOIUrl":"https://doi.org/10.1007/s40747-024-01599-6","url":null,"abstract":"<p>Language models and other recent machine learning paradigms blur the distinction between generative and discriminative tasks, in a continuum that is regulated by the degree of pre- and post-supervision that is required from users, as well as the tolerated level of error. In few-shot inference, we need to find a trade-off between the number and cost of the solved examples that have to be supplied, those that have to be inspected (some of them accurate but others needing correction) and those that are wrong but pass undetected. In this paper, we define a new Supply-Inspect Cost Framework, associated graphical representations and comprehensive metrics that consider all these elements. To optimise few-shot inference under specific operating conditions, we introduce novel algorithms that go beyond the concept of rejection rules in both static and dynamic contexts. 
We illustrate the effectiveness of all these elements for a transformative domain, data wrangling, for which language models can have a huge impact if we are able to properly regulate the reliability-usability trade-off, as we do in this paper.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"12 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142002827","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hybrid attentive prototypical network for few-shot action recognition","authors":"Zanxi Ruan, Yingmei Wei, Yanming Guo, Yuxiang Xie","doi":"10.1007/s40747-024-01571-4","DOIUrl":"https://doi.org/10.1007/s40747-024-01571-4","url":null,"abstract":"<p>Most previous few-shot action recognition works tend to process video temporal and spatial features separately, resulting in insufficient extraction of comprehensive features. In this paper, a novel hybrid attentive prototypical network (HAPN) framework for few-shot action recognition is proposed. Distinguished by its joint processing of temporal and spatial information, the HAPN framework strategically manipulates these dimensions from feature extraction to the attention module, consequently enhancing its ability to perform action recognition tasks. Our framework utilizes the R(2+1)D backbone network, coupling the extraction of integrated temporal and spatial features to ensure a comprehensive understanding of video content. Additionally, our framework introduces the novel Residual Tri-dimensional Attention (ResTriDA) mechanism, specifically designed to augment feature information across the temporal, spatial, and channel dimensions. ResTriDA dynamically enhances crucial aspects of video features by amplifying significant channel-wise features for action distinction, accentuating spatial details vital for capturing the essence of actions within frames, and emphasizing temporal dynamics to capture movement over time. We further propose a prototypical attentive matching module (PAM) built on the concept of metric learning to resolve the overfitting issue common in few-shot tasks. We evaluate our HAPN framework on three classical few-shot action recognition datasets: Kinetics-100, UCF101, and HMDB51. The results indicate that our framework significantly outperformed state-of-the-art methods. 
Notably, the 1-shot task demonstrated an increase of 9.8% in accuracy on UCF101 and improvements of 3.9% on HMDB51 and 12.4% on Kinetics-100. These gains confirm the robustness and effectiveness of our approach in leveraging limited data for precise action recognition.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"7 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142002826","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GVP-RRT: a grid based variable probability Rapidly-exploring Random Tree algorithm for AGV path planning","authors":"Yaozhe Zhou, Yujun Lu, Liye Lv","doi":"10.1007/s40747-024-01576-z","DOIUrl":"https://doi.org/10.1007/s40747-024-01576-z","url":null,"abstract":"<p>In response to the issues of low solution efficiency, poor path planning quality, and limited search completeness in narrow passage environments associated with Rapidly-exploring Random Tree (RRT), this paper proposes a Grid-based Variable Probability Rapidly-exploring Random Tree algorithm (GVP-RRT) for narrow passages. The algorithm introduced in this paper preprocesses the map through gridization to extract features of different path regions. Subsequently, it employs random growth with variable probability density based on the features of path regions using various strategies based on grid, probability, and guidance to enhance the probability of growth in narrow passages, thereby improving the completeness of the algorithm. Finally, the planned route is subjected to path re-optimization based on the triangle inequality principle. 
The simulation results demonstrate that the planning success rate of GVP-RRT in complex narrow channels is increased by 11.5–69.5% compared with other comparative algorithms, the average planning time is reduced by more than 50%, and the GVP-RRT has a shorter average planning path length.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"8 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142002825","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yangyue Feng, Xiaokang Yang, Yong Li, Lijuan Zhang, Yan Lv, Jinfang Jin
{"title":"TSKPD: twin structure key point detection in point cloud","authors":"Yangyue Feng, Xiaokang Yang, Yong Li, Lijuan Zhang, Yan Lv, Jinfang Jin","doi":"10.1007/s40747-024-01593-y","DOIUrl":"https://doi.org/10.1007/s40747-024-01593-y","url":null,"abstract":"<p>Point cloud keypoint detection algorithms such as USIP, which downsample first and then fine-tune the sampled points, cannot effectively detect the defective part of a single-view defect point cloud and therefore fail to output keypoints for that part. This paper proposes a twin structure keypoint detection algorithm, TSKPD, based on the idea of contrastive learning. It uses two single-view defect point clouds to synthesize relatively more complete keypoints for learning, which encourages the network model to learn the features of the complete point cloud. This effectively improves the robustness of point cloud keypoint detection and enables detection on single-view defect point clouds. Test results on the ModelNet40 and ShapeNet datasets show that the coverage rate of TSKPD on the missing part of the single-view defect point cloud is 12.62 higher than that of the best existing algorithm.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"6 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141994530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A hybrid neural combinatorial optimization framework assisted by automated algorithm design","authors":"Liang Ma, Xingxing Hao, Wei Zhou, Qianbao He, Ruibang Zhang, Li Chen","doi":"10.1007/s40747-024-01600-2","DOIUrl":"https://doi.org/10.1007/s40747-024-01600-2","url":null,"abstract":"<p>In recent years, the application of Neural Combinatorial Optimization (NCO) techniques in Combinatorial Optimization (CO) has emerged as a popular and promising research direction. Currently, there are mainly two types of NCO, namely, the Constructive Neural Combinatorial Optimization (CNCO) and the Perturbative Neural Combinatorial Optimization (PNCO). The CNCO generally trains an encoder-decoder model via supervised learning to construct solutions from scratch. It constructs solutions quickly; however, it lacks the ability for sustained optimization due to the one-shot mapping, which limits its potential for application. Instead, the PNCO generally trains neural network models via deep reinforcement learning (DRL) to intelligently select appropriate human-designed heuristics to improve existing solutions. It can achieve high-quality solutions but at the cost of high computational demand. To leverage the strengths of both approaches, we propose to hybridize the CNCO and PNCO in a two-stage framework, in which the CNCO is the first stage and the PNCO is the second. Specifically, in the first stage, we utilize the attention model to generate preliminary solutions for given CO instances. In the second stage, we employ DRL to intelligently select and combine appropriate algorithmic components from an improvement pool, a perturbation pool, and a prediction pool to continuously optimize the obtained solutions. 
Experimental results on synthetic and real Capacitated Vehicle Routing Problems (CVRPs) and Traveling Salesman Problems (TSPs) demonstrate the effectiveness of the proposed hybrid framework with the assistance of automated algorithm design.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"47 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141994529","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Location-routing optimization of UAV collaborative blood delivery vehicle distribution on complex roads","authors":"Zhiyi Meng, Ke Yu, Rui Qiu","doi":"10.1007/s40747-024-01591-0","DOIUrl":"https://doi.org/10.1007/s40747-024-01591-0","url":null,"abstract":"<p>To address the protracted blood transportation time prevalent in contemporary urban settings, we proposed a location-routing optimization problem tailored to the distribution of blood within intricate road networks. This involved a comprehensive assessment that encompassed the judicious selection of sites for both stations and blood centers, coupled with the meticulous planning of delivery routes for unmanned aerial vehicles (UAVs) that orchestrate the transportation of blood. First, a model was formulated to minimize the overall cost, including transportation expenses, costs associated with the site, and other relevant costs related to blood transportation vehicles coordinated by UAVs. Subsequently, a two-stage hybrid heuristic algorithm was designed based on the distinctive characteristics of the problem at hand. Moreover, an enhanced k-means algorithm was employed to generate clustering schemes, utilizing the centroid method to address the challenge of location selection for delivery sites effectively. A genetic algorithm enhanced with adaptive operators was employed to address the challenging large-scale NP-hard problem associated with route planning in intricate urban road networks. 
The results indicated that, compared to the traditional blood delivery model using vehicles, the total blood transportation cost decreased by 12.65% and the overall delivery time was reduced by 37.5% with the adoption of drone-assisted delivery. Finally, case and sensitivity analyses were conducted to investigate the impact of variables including the number of blood transportation vehicles, UAVs, driver wages, and unit costs of blood transportation vehicles on the location-routing problem.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"21 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141980873","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chuanbo Wen, Xianbin Wu, Zidong Wang, Weibo Liu, Junjie Yang
{"title":"A novel local feature fusion architecture for wind turbine pitch fault diagnosis with redundant feature screening","authors":"Chuanbo Wen, Xianbin Wu, Zidong Wang, Weibo Liu, Junjie Yang","doi":"10.1007/s40747-024-01584-z","DOIUrl":"https://doi.org/10.1007/s40747-024-01584-z","url":null,"abstract":"<p>The safe and reliable operation of the pitch system is essential for the stable and efficient operation of a wind turbine (WT). The pitch fault data collected by supervisory control and data acquisition systems (SCADA) often contain a wide variety of variables, leading to redundant features that interfere with the accuracy of final diagnosis results, making it difficult to meet requirements. Also, the problem of extracting only local features while ignoring global information is present in the feature extraction process using the deep Convolutional Neural Network (CNN) model. To address these issues, the global average correlation coefficient is proposed in this article to measure the correlation between multiple variables in SCADA data. By considering the correlation among multiple variables comprehensively, redundant features are effectively eliminated, enhancing the accuracy of fault diagnosis. Furthermore, a new local amplification fusion architecture network (LAFA-Net) based on multi-head attention (MHA) is introduced. An efficient local feature extraction module, designed to enhance the model’s perception of detailed features while maintaining global context information, is first introduced. LAFA-Net integrates the advantages of CNN and MHA, efficiently extracting and fusing valuable features from filtered data for both local and global aspects. Experiments on real pitch fault data demonstrate that the global average correlation coefficient effectively screens out redundant features in the dataset that negatively impact fault diagnosis results, thereby improving diagnosis efficiency and accuracy. 
The LAFA-Net model, capable of accurately diagnosing multiple types of pitch faults, shows a superior classification effect and accuracy compared to several advanced models, along with a faster convergence speed.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"18 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141980954","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xiaoyan Li, Shenghua Xu, Hengxu Jin, Zhuolu Wang, Yu Ma, Xuan He
{"title":"POI recommendation by deep neural matrix factorization integrated attention-aware meta-paths","authors":"Xiaoyan Li, Shenghua Xu, Hengxu Jin, Zhuolu Wang, Yu Ma, Xuan He","doi":"10.1007/s40747-024-01596-9","DOIUrl":"https://doi.org/10.1007/s40747-024-01596-9","url":null,"abstract":"<p>With the continuous accumulation of massive amounts of mobile data, point-of-interest (POI) recommendation has become a vital task for location-based social networks. It is challenging for deep neural networks or matrix factorization (MF) alone to learn user–POI interaction functions effectively. Moreover, the user–POI interaction matrix is sparse, and the heterogeneous characteristics of auxiliary information are underused. Therefore, we propose an innovative POI recommendation method that integrates attention-aware meta-paths based on deep neural matrix factorization (DNMF-AM). First, we develop a multi-relational heterogeneous information network of “user–POI–geographic region–POI category.” Multiple-weighted isomorphic information networks based on meta-paths are employed to obtain node-embedding vectors across different relationships. Attention networks are employed to aggregate node vectors across various relationships and serve as auxiliary information to mitigate the challenges of data sparsity. Subsequently, the internal embedding vectors of the users and POIs are extracted using feature embedding based on the user–POI interaction matrix. Second, these vectors are integrated with the embedding vectors obtained by aggregating the attention networks. Third, deep neural matrix factorization is used to learn linear and nonlinear user–POI interactions to mitigate the implicit feedback problem. This outcome is achieved using generalized matrix factorization and convolution-constrained multi-head self-attention mechanism deep neural networks. 
Extensive experiments conducted on two real-world datasets demonstrate that the DNMF-AM outperforms the optimal baseline NeuMF-CAA by 4.24% and 5.04% in terms of HR@10 and NDCG@10, respectively.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"18 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141980946","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Guanfeng Yu, Lei Zhang, Siyuan Shen, Zhengjun Zhai
{"title":"Real-time vision-inertial landing navigation for fixed-wing aircraft with CFC-CKF","authors":"Guanfeng Yu, Lei Zhang, Siyuan Shen, Zhengjun Zhai","doi":"10.1007/s40747-024-01579-w","DOIUrl":"https://doi.org/10.1007/s40747-024-01579-w","url":null,"abstract":"<p>Vision-inertial navigation offers a promising solution for aircraft to estimate ego-motion accurately in environments devoid of Global Navigation Satellite System (GNSS). However, existing approaches have limited adaptability for fixed-wing aircraft with high maneuverability and insufficient visual features, which leads to low accuracy and poor real-time performance. This paper introduces a novel vision-inertial heterogeneous data fusion methodology, aiming to enhance the navigation accuracy and computational efficiency of fixed-wing aircraft landing navigation. The visual front-end of the system extracts multi-scale infrared runway features and computes a geo-reference runway image as the observation. The infrared runway features are recognized efficiently and robustly by a lightweight end-to-end neural network from blurry infrared images, and the geo-reference runway is generated through projection of the runway’s prior geographical information and prior pose. The fusion back-end of the navigation system is the Covariance Feedback Control based Cubature Kalman Filter (CFC-CKF) framework, which tightly integrates visual observations and inertial measurements for zero-drift pose estimation and curbs the effect of inaccurate kinematic noise statistics. 
Finally, real flight experiments demonstrate that the algorithm can estimate the pose at a frequency of 100 Hz and fulfill the navigation accuracy requirements for high-speed landing of fixed-wing aircraft.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"30 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141980943","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SDGSA: a lightweight shallow dual-group symmetric attention network for micro-expression recognition","authors":"Zhengyang Yu, Xiaojuan Chen, Chang Qu","doi":"10.1007/s40747-024-01594-x","DOIUrl":"https://doi.org/10.1007/s40747-024-01594-x","url":null,"abstract":"<p>Recognizing micro-expressions (MEs) as subtle and transient forms of human emotional expressions is critical for accurately judging human feelings. However, recognizing MEs is challenging due to their transient and low-intensity characteristics. This study develops a lightweight shallow dual-group symmetric attention network (SDGSA) to address the limitations of existing methods in capturing the subtle features of MEs. This network takes the optical flow features as inputs, extracting ME features through a shallow network and performing finer feature segmentation in the channel dimension through a dual-group strategy. The goal is to focus on different types of facial information without disrupting facial symmetry. Moreover, this study implements a spatial symmetry attention module, focusing on extracting facial symmetry features to further emphasize the symmetric information of the left and right sides of the face. Additionally, we introduce the channel blending technique to optimize the information fusion between different channel features. Extensive experiments on SMIC, CASME II, SAMM, and 3DB-combined mainstream ME datasets demonstrate that the proposed SDGSA method outperforms current state-of-the-art methods on the reported metrics. As shown by ablation experimental results, the proposed dual-group symmetric attention module outperforms classical attention modules, such as the convolutional block attention module, squeeze-and-excitation, efficient channel attention, spatial group-wise enhancement, and multi-head self-attention. Importantly, SDGSA maintained excellent performance while having only 0.278 million parameters. 
The code and model are publicly available at https://github.com/YZY980123/SDGSA.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"9 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141980950","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}