{"title":"Bridging the gap: multi-granularity representation learning for text-based vehicle retrieval","authors":"Xue Bo, Junjie Liu, Di Yang, Wentao Ma","doi":"10.1007/s40747-024-01614-w","DOIUrl":"https://doi.org/10.1007/s40747-024-01614-w","url":null,"abstract":"<p>Text-based cross-modal vehicle retrieval has been widely applied in smart city contexts and other scenarios. The objective of this approach is to identify semantically relevant target vehicles in videos using text descriptions, thereby facilitating the analysis of vehicle spatio-temporal trajectories. Current methodologies predominantly employ a two-tower architecture, where single-granularity features from both visual and textual domains are extracted independently. However, due to the intricate semantic relationships between videos and text, aligning the two modalities effectively using single-granularity feature representation poses a challenge. To address this issue, we introduce a <b>M</b>ulti-<b>G</b>ranularity <b>R</b>epresentation <b>L</b>earning model, termed <b>MGRL</b>, tailored for text-based cross-modal vehicle retrieval. Specifically, the model parses information from the two modalities into three hierarchical levels of feature representation: coarse-granularity, medium-granularity, and fine-granularity. Subsequently, a feature adaptive fusion strategy is devised to automatically determine the optimal pooling mechanism. Finally, a multi-granularity contrastive learning approach is implemented to ensure comprehensive semantic coverage, ranging from coarse to fine levels. 
Experimental outcomes on public benchmarks show that our method achieves up to a 14.56% improvement in text-to-vehicle retrieval performance, as measured by the Mean Reciprocal Rank (MRR) metric, when compared against 10 state-of-the-art baselines and 6 ablation studies.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"32 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142601128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GAN-based pseudo random number generation optimized through genetic algorithms","authors":"Xuguang Wu, Yiliang Han, Minqing Zhang, Yu Li, Su Cui","doi":"10.1007/s40747-024-01606-w","DOIUrl":"https://doi.org/10.1007/s40747-024-01606-w","url":null,"abstract":"<p>Pseudo-random number generators (PRNGs) are deterministic algorithms that generate sequences of numbers approximating the properties of random numbers, which are widely utilized in various fields. In this paper, we present a Genetic Algorithm Optimized Generative Adversarial Network (hereinafter referred to as GAGAN), which is designed for the effective training of discrete generative adversarial networks. In situations where non-differentiable activation functions, such as the modulo operation, are employed and traditional gradient-based backpropagation methods are inapplicable, genetic algorithms are utilized to optimize the parameters of the generator network. Based on this framework, we propose a novel recursive PRNG. Given that a PRNG can be constructed from one-way functions and their associated hardcore predicates, our proposed generator consists of two neural networks that simulate these functions and serve as the state transition function and the output function, respectively. The proposed PRNG has been rigorously tested using stringent benchmarks such as the NIST Statistical Test Suite (SP800-22) and the Chinese standard for random number generation (GM/T 0005-2021). Additionally, it has demonstrated outstanding performance in terms of Hamming distance. 
The results indicate that the proposed GAN-based PRNG has achieved a high degree of randomness and is highly sensitive to variations in the input.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"34 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142601134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LCANet: a model for analysis of students real-time sentiment by integrating attention mechanism and joint loss function","authors":"Pengyun Hu, Xianpiao Tang, Liu Yang, Chuijian Kong, Daoxun Xia","doi":"10.1007/s40747-024-01608-8","DOIUrl":"https://doi.org/10.1007/s40747-024-01608-8","url":null,"abstract":"<p>By recognizing students’ facial expressions in actual classroom situations, the students’ emotional states can be quickly uncovered, which can help teachers grasp the students’ learning rate and adjust their teaching strategies and methods, thus improving the quality and effectiveness of classroom teaching. However, most previous facial expression recognition methods have problems such as missing key facial features and imbalanced class distributions in the dataset, resulting in low recognition accuracy. To address these challenges, this paper proposes LCANet, a model founded on a fused attention mechanism and a joint loss function, which allows the recognition of students’ emotions in real classroom scenarios. The model uses ConvNeXt V2 as the backbone network to optimize the global feature extraction capability of the model, and at the same time, it enables the model to pay closer attention to the key regions in facial expressions. We incorporate an improved Channel Spatial Attention (CSA) module as a way to extract more local feature information. Furthermore, to mitigate the class distribution imbalance problem in the facial expression dataset, we introduce a joint loss function. The experimental results show that our LCANet model has good recognition rates on the public emotion datasets FERPlus, RAF-DB, and AffectNet, with accuracies of 91.43%, 90.03%, and 64.43%, respectively, with good robustness and generalizability. 
Additionally, we conducted experiments using the model in real classroom scenarios, detecting and accurately predicting students’ classroom emotions in real time, which provides an important reference for improving teaching in smart teaching scenarios.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"156 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142601122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IEDSFAN: information enhancement and dynamic-static fusion attention network for traffic flow forecasting","authors":"Lianfei Yu, Ziling Wang, Wenxi Yang, Zhijian Qu, Chongguang Ren","doi":"10.1007/s40747-024-01663-1","DOIUrl":"https://doi.org/10.1007/s40747-024-01663-1","url":null,"abstract":"<p>Accurate forecasting of traffic flow in the future period is very important for planning traffic routes and alleviating traffic congestion. However, traffic flow forecasting still faces serious challenges. Most of the existing traffic flow forecasting methods are static graph convolutional networks based on prior knowledge, ignoring the special spatial–temporal dynamics of spatial–temporal data. Using only adaptive dynamic graphs completely discards the objective and real spatial connectivity information in static graphs. To this end, we propose a novel information enhancement and dynamic-static fusion attention network (IEDSFAN). Firstly, the Multi-Graph Fusion Gating mechanism (MGFG) designed in IEDSFAN effectively fuses dynamic and static graphs to dynamically capture the hidden spatial–temporal correlation. Secondly, we construct a novel Gated Multi-head Self-Attention (GMHSA), which maps the input through the MGFG module to capture the complex spatial–temporal interactions in the features. Finally, we generate adaptive parameters to solve the problem that shared parameters cannot learn multiple traffic patterns, and enhance the expression of sequence information through the peak flag module. 
We conducted extensive experiments on five real-world traffic datasets, and the experimental results show that the performance of IEDSFAN is significantly better than all baselines.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"37 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142601125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A multi-task genetic programming approach for online multi-objective container placement in heterogeneous cluster","authors":"Ruochen Liu, Haoyuan Lv, Ping Yang, Rongfang Wang","doi":"10.1007/s40747-024-01605-x","DOIUrl":"https://doi.org/10.1007/s40747-024-01605-x","url":null,"abstract":"<p>Owing to the potential for fast deployment, containerization technology has been widely used in web applications based on microservice architecture. Online container placement aims to improve resource utilization and meet other service quality requirements of cloud data centers. Most current heuristic and hyper-heuristic methods for container placement rely on single allocation rules, which are inefficient in heterogeneous cluster scenarios. Moreover, some container placement tasks often have similar characteristics (e.g., resource request types and physical machine types), but traditional single-task optimization modeling cannot exploit potential common knowledge, resulting in repeated optimization during resource allocation. Therefore, a new multi-task genetic programming method is proposed to solve the online multi-objective container placement problem (MOCP-MTGP). This method considers selecting appropriate allocation rules according to the types of resource requests and cluster status. MOCP-MTGP can automatically generate multiple groups of allocation rules from historical workload patterns and different cluster states, and capture the similarities between all online tasks to guide the transfer of general knowledge during optimization. 
Comprehensive experiments show that the proposed algorithm can improve the resource utilization of clusters, reduce the number of physical machines, and effectively meet resource constraints and high availability requirements.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"7 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142601187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards accurate anomaly detection for cloud system via graph-enhanced contrastive learning","authors":"Zhen Zhang, Zhe Zhu, Chen Xu, Jinyu Zhang, Shaohua Xu","doi":"10.1007/s40747-024-01659-x","DOIUrl":"https://doi.org/10.1007/s40747-024-01659-x","url":null,"abstract":"<p>As a critical technology, anomaly detection ensures the smooth operation of cloud systems while maintaining the market competitiveness of cloud service providers. However, the resource data in real-world cloud systems is predominantly unannotated, leading to insufficient supervised signals for anomaly detection. Moreover, complicated topological associations exist between cloud servers (e.g., computation, storage, and communication). While acquiring resource information, correlating the system topology is challenging. To this end, we propose GCAD for cloud system anomaly detection, which integrates data augmentation, GraphGRU, contrastive learning, and reconstruction. First, GCAD constructs positive and negative sample pairs through masking and Gaussian noise data augmentation. Then, the GraphGRU processes extended temporal graph data, extracting and fusing spatiotemporal features from resource status and system topology. In addition, GCAD introduces linear attention for encoding spatiotemporal representations to capture their global correlation information. The weight parameters of the encoder are optimized using a contrastive learning mechanism. Finally, GCAD utilizes a reconstruction technique to calculate anomaly scores, facilitating the evaluation of the state of the cloud system at each time point. 
Experimental results indicate that GCAD outperforms state-of-the-art compared methods on two real-world datasets that contain topology information.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"2 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142599225","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A nonrevisiting genetic algorithm based on multi-region guided search strategy","authors":"Qijun Wang, Chunxin Sang, Haiping Ma, Chao Wang","doi":"10.1007/s40747-024-01627-5","DOIUrl":"https://doi.org/10.1007/s40747-024-01627-5","url":null,"abstract":"<p>Recently, nonrevisiting genetic algorithms have demonstrated superior capabilities compared with classic genetic algorithms and other single-objective evolutionary algorithms. However, the search efficiency of nonrevisiting genetic algorithms is currently low for some complex optimisation problems. This study proposes a nonrevisiting genetic algorithm with a multi-region guided search to improve the search efficiency. The search history is stored in a binary space partition (BSP) tree, where each searched solution is assigned to a leaf node and corresponds to a search region in the search space. To fully exploit the search history, several optimal solutions in the BSP tree are archived to represent the most promising search regions and estimate the fitness landscape in the search space. In addition to the conventional genetic operations, the offspring can also be generated through a multi-region guided search strategy, where the current solution is first navigated to one of the candidate search regions and is further updated towards the direction of the optimal solution in the search history to speed up convergence. Thus, multi-region guided search can reduce the possibility of getting trapped in local optima when solving problems with complex landscapes. 
The experimental results on different types of test suites reveal the competitiveness of the proposed algorithm in comparison with several state-of-the-art methods.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"29 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142599224","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adversarial imitation learning with deep attention network for swarm systems","authors":"Yapei Wu, Tao Wang, Tong Liu, Zhicheng Zheng, Demin Xu, Xingguang Peng","doi":"10.1007/s40747-024-01662-2","DOIUrl":"https://doi.org/10.1007/s40747-024-01662-2","url":null,"abstract":"<p>Swarm systems consist of a large number of interacting individuals, which exhibit complex behavior despite having simple interaction rules. However, crafting individual motion policies that can manifest desired collective behaviors poses a significant challenge due to the intricate relationship between individual policies and swarm dynamics. This paper addresses this issue by proposing an imitation learning method, which derives individual policies from collective behavior data. The approach leverages an adversarial imitation learning framework, with a deep attention network serving as the individual policy network. Our method successfully imitates three distinct collective behaviors. Utilizing the ease of analysis provided by the deep attention network, we have verified that the individual policies underlying a certain collective behavior are not unique. Additionally, we have analyzed the different individual policies discovered. Lastly, we validate the applicability of the proposed method in designing policies for swarm robots through practical implementation on swarm robots.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"6 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142599228","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatical sampling with heterogeneous corpora for grammatical error correction","authors":"Shichang Zhu, Jianjian Liu, Ying Li, Zhengtao Yu","doi":"10.1007/s40747-024-01653-3","DOIUrl":"https://doi.org/10.1007/s40747-024-01653-3","url":null,"abstract":"<p>Thanks to the strong representation capability of the pre-trained language models, supervised grammatical error correction has achieved promising performance. However, traditional model training depends significantly on the large scale of similar distributed samples. The model performance decreases sharply once the distributions of training and testing data are inconsistent. To address this issue, we propose an automatic sampling approach to effectively select high-quality samples from different corpora and filter out irrelevant or harmful ones. Concretely, we first provide a detailed analysis of error type and sentence length distributions on all datasets. Second, our corpus weighting approach is exploited to yield different weights for each sample automatically based on analysis results, thus emphasizing beneficial samples and ignoring the noisy ones. Finally, we enhance typical Seq2Seq and Seq2Edit grammatical error correction models with pre-trained language models and design a model ensemble algorithm for integrating the advantages of heterogeneous models and weighted samples. Experiments on the benchmark datasets demonstrate that the proper utilization of different corpora is extremely helpful in enhancing the accuracy of grammatical error correction. 
The detailed analysis offers further insight into the effect of different corpus weighting strategies.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"63 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142599227","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unveiling user identity across social media: a novel unsupervised gradient semantic model for accurate and efficient user alignment","authors":"Yongqiang Peng, Xiaoliang Chen, Duoqian Miao, Xiaolin Qin, Xu Gu, Peng Lu","doi":"10.1007/s40747-024-01626-6","DOIUrl":"https://doi.org/10.1007/s40747-024-01626-6","url":null,"abstract":"<p>The field of social network analysis has identified User Alignment (UA) as a crucial area of investigation. The objective of UA is to identify and connect user accounts across diverse social networks, even when there are no explicit interconnections. UA plays a pivotal role in synthesising coherent user profiles and delving into the intricacies of user behaviour across platforms. However, traditional approaches have encountered limitations. Singular embedding techniques have been found to fall short in fully capturing the semantic essence of user profile attributes. Furthermore, classification-based embedding methods lack definitive criteria for categorisation, thereby constraining both the efficacy and applicability of these models. This paper presents a novel unsupervised Gradient Semantic Model for User Alignment (GSMUA) for the purpose of identifying common user identities across social networks. GSMUA categorises user profile information into weak, sub, and strong gradients based on the semantic intensity of attributes. Different gradient semantic levels direct attention to literal features, semantic features, or a combination of both during feature extraction, thereby achieving a full semantic representation of user attributes. In the case of strongly semantic long texts, GSMUA employs Named Entity Recognition (NER) technology to address the inefficient handling of such texts. Furthermore, GSMUA compensates for missing user profile attributes by utilising profile information from user neighbours, thereby reducing the negative impact of missing user profile attributes on model performance. 
Extensive experiments conducted on four pairs of real datasets demonstrate the superiority of our approach. In comparison to the most effective previously developed unsupervised methods, GSMUA demonstrates improvements in hit-precision ranging from 5.32 to 12.17%. When compared to supervised methods, the improvements range from 0.71 to 11.79%.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"13 1","pages":""},"PeriodicalIF":5.8,"publicationDate":"2024-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142599226","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}