{"title":"EU-Net: a segmentation network based on semantic fusion and edge guidance for road crack images","authors":"Jing Gao, Yiting Gui, Wen Ji, Jun Wen, Yueyu Zhou, Xiaoxiao Huang, Qiang Wang, Chenlong Wei, Zhong Huang, Chuanlong Wang, Zhu Zhu","doi":"10.1007/s10489-024-05788-1","DOIUrl":"10.1007/s10489-024-05788-1","url":null,"abstract":"<div><p>An enhanced U-shaped network (EU-Net) based on deep semantic information fusion and edge information guidance is studied to improve the segmentation accuracy of road cracks under hazy conditions. The EU-Net comprises multimode feature fusion, side information fusion and edge extraction modules. The feature and side information fusion modules are applied to fuse deep semantic information with multiscale features. The edge extraction module uses the Canny edge detection algorithm to guide and constrain crack edge information from the neural network. The experimental results show that the method in this work is superior to the most widely used crack segmentation methods. Compared with that of the baseline U-Net, the mIoU of the EU-Net increases by 0.59% and 5.7% on the Crack500 and Masonry datasets, respectively.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 24","pages":"12949 - 12963"},"PeriodicalIF":3.4,"publicationDate":"2024-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600545","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SCSformer: cross-variable transformer framework for multivariate long-term time series forecasting via statistical characteristics space","authors":"Yongfeng Su, Juhui Zhang, Qiuyue Li","doi":"10.1007/s10489-024-05764-9","DOIUrl":"10.1007/s10489-024-05764-9","url":null,"abstract":"<div><p>Deep learning-based models have emerged as promising tools for multivariate long-term time series forecasting. These models are finely structured to perform feature extraction from time series, greatly improving the accuracy of multivariate long-term time series forecasting. However, to the best of our knowledge, few scholars have focused their research on preprocessing time series, such as analyzing their periodic distributions or analyzing their values and volatility at the global level. In fact, properly preprocessing time series can often significantly improve the accuracy of multivariate long-term time series forecasting. In this paper, using the cross-variable transformer as a basis, we introduce a statistical characteristics space fusion module to preprocess the time series. This module takes the mean and standard deviation values of the time series during different periods as part of the model’s inputs and greatly improves the model’s performance. The Statistical Characteristics Space Fusion Module consists of a statistical characteristics space, which represents the mean and standard deviation values of a time series under different periods, and a convolutional neural network, which is used to fuse the original time series with the corresponding mean and standard deviation values. Moreover, to extract the linear dependencies of the time series variables more efficiently, we introduce three different linear projection layers at different nodes of the model, which we call the Multi-level Linear Projection Module. This new methodology, called <b>the SCSformer</b>, includes three innovations. 
First, we propose a Statistical Characteristics Space Fusion Module, which is capable of calculating the statistical characteristics space of the time series and fusing the original time series with a specific element of the statistical characteristics space as inputs of the model. Second, we introduce a Multi-level Linear Projection Module to capture linear dependencies of time series from different stages of the model. Third, we combine the Statistical Characteristics Space Fusion Module, the Multi-level Linear Projection Module, the Reversible Instance Normalization and the Cross-variable Transformer proposed in Client in a certain order to generate the SCSformer. We test this combination on nine real-world time series datasets and achieve optimal results on eight of them. Our code is publicly available at https://github.com/qiuyueli123/SCSformer.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 24","pages":"12922 - 12948"},"PeriodicalIF":3.4,"publicationDate":"2024-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600543","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Siam2C: Siamese visual segmentation and tracking with classification-rank loss and classification-aware","authors":"Bangjun Lei, Qishuai Ding, Weisheng Li, Hao Tian, Lifang Zhou","doi":"10.1007/s10489-024-05840-0","DOIUrl":"10.1007/s10489-024-05840-0","url":null,"abstract":"<div><p>Siamese visual trackers based on segmentation have garnered considerable attention due to their high accuracy. However, these trackers rely solely on simple classification confidence to distinguish between positive and negative samples (foreground or background), lacking more precise discrimination capabilities for objects. Moreover, the backbone network excels at focusing on local information during feature extraction, failing to capture the long-distance contextual semantics crucial for classification. Consequently, these trackers are highly susceptible to interference during actual tracking, leading to erroneous object segmentation and subsequent tracking failures, thereby compromising robustness. For this purpose, we propose a Siamese visual segmentation and tracking network with classification-rank loss and classification-aware (Siam2C). We design a classification-rank loss (CRL) algorithm to enlarge the margin between positive and negative samples, ensuring that positive samples are ranked higher than negative ones. This optimization enhances the network’s ability to learn from positive and negative samples, allowing the tracker to accurately select the object for segmentation and tracking rather than being misled by interfering targets. Additionally, we design a classification-aware attention module (CAM), which employs spatial and channel self-attention mechanisms to capture long-distance dependencies between different positions in the feature map. The module enhances the feature representation capability of the backbone network, providing richer global contextual semantic information for the tracking network’s classification decisions. 
Extensive experiments on the VOT2016, VOT2018, VOT2019, OTB100, UAV123, GOT-10k, DAVIS2016, and DAVIS2017 datasets demonstrate the outstanding performance of Siam2C.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 24","pages":"12898 - 12921"},"PeriodicalIF":3.4,"publicationDate":"2024-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600754","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive Hypersphere Data Description for few-shot one-class classification","authors":"Yuchen Ren, Xiabi Liu, Liyuan Pan, Lijuan Niu","doi":"10.1007/s10489-024-05836-w","DOIUrl":"10.1007/s10489-024-05836-w","url":null,"abstract":"<p>Few-shot one-class classification (FS-OCC) is an important and challenging problem involving the recognition of a class using a limited number of positive training samples. Data description is essential for solving the FS-OCC problem as it delineates a region that separates positive data from other classes in the feature space. This paper introduces an effective FS-OCC model named Adaptive Hypersphere Data Description (AHDD). AHDD utilizes hypersphere-based data description with a learnable radius to determine the appropriate region for positive samples in the feature space. Both the radius and the feature network are learned concurrently using meta-learning. We propose a loss function for AHDD that enables the mutual adaptation of the radius and feature within a single FS-OCC task. AHDD significantly outperforms other state-of-the-art FS-OCC methods across various benchmarks and demonstrates strong performance on test sets with extreme class imbalance rates. Experimental results indicate that AHDD learns a robust feature representation, and the implementation of an adaptive radius can also improve the existing FS-OCC baselines.</p>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 24","pages":"12885 - 12897"},"PeriodicalIF":3.4,"publicationDate":"2024-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600770","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Power allocation method based on modified social network search algorithm","authors":"Hongyuan Gao, Huishuang Li, Yun Lin, Jingya Ma","doi":"10.1007/s10489-024-05804-4","DOIUrl":"10.1007/s10489-024-05804-4","url":null,"abstract":"<div><p>With the increase in communication devices and demands, the problems of high power consumption, tight spectrum resources, and low energy efficiency in the two-layer heterogeneous network are popular topics that need to be solved urgently. To solve these problems in a two-layer heterogeneous network consisting of femtocell base stations randomly distributed in a macrocell base station, which can also be called the Macrocell/Femtocell two-layer heterogeneous network, the hierarchical clustering algorithm is first used to cluster femtocell base stations in accordance with a distance threshold, and the spectrum partitioning mechanism and non-orthogonal multiple access technique are combined to obtain spectrum allocation schemes for different users. Then, the modified social network search algorithm is used to simulate the power allocation problem in the two-layer heterogeneous network with system energy efficiency as the objective function. The proposed algorithm’s superior performance is verified on the test functions by comparison with previous algorithms. The results show that the proposed method can effectively improve spectrum utilization and reduce interference. 
The modified social network search algorithm is more robust and widely applicable regarding energy and computational efficiency.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 24","pages":"12851 - 12884"},"PeriodicalIF":3.4,"publicationDate":"2024-10-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600760","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LegalATLE: an active transfer learning framework for legal triple extraction","authors":"Haiguang Zhang, Yuanyuan Sun, Bo Xu, Hongfei Lin","doi":"10.1007/s10489-024-05842-y","DOIUrl":"10.1007/s10489-024-05842-y","url":null,"abstract":"<p>Recently, the rich content of Chinese legal documents has attracted considerable scholarly attention. Legal Relational Triple Extraction, which is a critical way to enable machines to understand semantic information, presents a significant challenge in Natural Language Processing, as it seeks to discern the connections between pairs of entities within legal case texts. This challenge is compounded by the intricate nature of legal language and the substantial expense associated with human annotation. Despite these challenges, existing models often overlook the incorporation of cross-domain features. To address this, we introduce LegalATLE, an innovative method for legal Relational Triple Extraction that integrates active learning and transfer learning, reducing the model’s reliance on annotated data and enhancing its performance within the target domain. Our model employs active learning to prudently assess and select samples with high information value. Concurrently, it applies domain adaptation techniques to effectively transfer knowledge from the source domain, thereby improving the model’s generalization and accuracy. Additionally, we have manually annotated a new theft-related triple dataset for use as the target domain. Comprehensive experiments demonstrate that LegalATLE outperforms existing efficient models by approximately 1.5%, reaching 92.90% on the target domain. 
Notably, with only 4% and 5% of the full dataset used for training, LegalATLE performs about 10% better than other models, demonstrating its effectiveness in data-scarce scenarios.</p>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 24","pages":"12835 - 12850"},"PeriodicalIF":3.4,"publicationDate":"2024-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600598","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic noise self-recovery ECM clustering algorithm with adaptive spatial constraints for image segmentation","authors":"Rong Lan, Bo Wang, Xiaoying Yu, Feng Zhao, Haowen Mi, Haiyan Yu, Lu Zhang","doi":"10.1007/s10489-024-05813-3","DOIUrl":"10.1007/s10489-024-05813-3","url":null,"abstract":"<div><p>Evidence c-means (ECM) has certain advantages in dealing with uncertainty and imprecision, and it is widely applied to data clustering and image segmentation. However, ECM does not utilize spatial information and is unable to recover noise, resulting in poor performance for noisy image segmentation. To address these problems, we propose a dynamic noise self-recovery ECM clustering algorithm with adaptive spatial constraints for image segmentation. The proposed algorithm has the following novelties. Firstly, the non-local spatial information is modified by initializing the noise probability to obtain more reliable spatial information. Secondly, the adaptive constraint factors are constructed by using the absolute difference between the original image and the modified non-local spatial information, which can reduce the sensitivity of the algorithm to noise. Finally, the self-recovery factors are constructed on the basis of the neighborhood belief degrees, and a dynamic anti-noise distance is proposed to replace the Euclidean distance. The dynamic anti-noise distance is more suitable for noise self-recovery, enabling noise self-recovery during the iterative process. 
Extensive experiments on synthetic, natural, SAR and MR images show that the proposed algorithm has good performance for image segmentation.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 24","pages":"12791 - 12818"},"PeriodicalIF":3.4,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600556","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic graph attention-guided graph clustering with entropy minimization self-supervision","authors":"Ran Zhu, Jian Peng, Wen Huang, Yujun He, Chengyi Tang","doi":"10.1007/s10489-024-05745-y","DOIUrl":"10.1007/s10489-024-05745-y","url":null,"abstract":"<p>Graph clustering is one of the most fundamental tasks in graph learning. Recently, numerous graph clustering models based on dual network (Auto-encoder + Graph Neural Network (GNN)) architectures have emerged and achieved promising results. However, we observe several limitations in the literature: 1) simple graph neural networks that fail to capture the intricate relationships between nodes are used for graph clustering tasks; 2) heterogeneous information is inadequately interacted and merged; and 3) the clustering boundaries are fuzzy in the feature space. To address the aforementioned issues, we propose a novel graph clustering model named <b>D</b>ynamic <b>G</b>raph <b>A</b>ttention-guided <b>G</b>raph <b>C</b>lustering with <b>E</b>ntropy <b>M</b>inimization self-supervision (DGAGC-EM). Specifically, we introduce DGATE, a graph auto-encoder based on dynamic graph attention, to capture the intricate relationships among graph nodes. Additionally, we perform feature enhancement from both global and local perspectives via the proposed Global-Local Feature Enhancement (GLFE) module. Finally, we propose a self-supervised strategy based on entropy minimization theory to guide the network training process to achieve better performance and produce sharper clustering boundaries. Extensive experimental results obtained on four datasets demonstrate that our method is highly competitive with the SOTA methods.</p><p>The figure presents the overall framework of the proposed Dynamic Graph Attention-guided Graph Clustering with Entropy Minimization self-supervision (DGAGC-EM). 
Specifically, the Dynamic Graph Attention Auto-Encoder Module is our proposed graph auto-encoder based on dynamic graph attention, which captures the intricate relationships among graph nodes. The Auto-Encoder Module is a basic autoencoder with simple MLPs to extract embeddings from node attributes. Additionally, the proposed Global-Local Feature Enhancement (GLFE) module performs feature enhancement from both global and local perspectives. Finally, the proposed Self-supervised Module guides the network training process to achieve better performance and produce sharper clustering boundaries.</p>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 24","pages":"12819 - 12834"},"PeriodicalIF":3.4,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600593","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-geometric block diagonal representation subspace clustering with low-rank kernel","authors":"Maoshan Liu, Vasile Palade, Zhonglong Zheng","doi":"10.1007/s10489-024-05833-z","DOIUrl":"10.1007/s10489-024-05833-z","url":null,"abstract":"<div><p>The popular block diagonal representation subspace clustering approach shows high effectiveness in dividing a high-dimensional data space into the corresponding subspaces. However, existing subspace clustering algorithms have some weaknesses in achieving high clustering performance. This paper presents a multi-geometric block diagonal representation subspace clustering with low-rank kernel (MBDR-LRK) method that includes two major improvements. First, as visual data often exists on a Riemannian manifold not captured by Euclidean geometry, we harness the multi-order data complementarity to develop a multi-geometric block diagonal representation (MBDR) subspace clustering. Second, the proposed MBDR-LRK approach ensures the low-rankness in the mapped space by adapting the kernel matrix to a pre-defined one rather than relying on a fixed kernel as in traditional methods. The paper also presents details on the monotonic decrease of the objective function and the boundedness and convergence of the affinity matrix, and the experimental results prove the convergence of the proposed method. 
Based on the MATLAB development environment, the proposed MBDR-LRK algorithm outperforms other related algorithms and obtains an accuracy of 88.70% on the ORL (40 classes), 89.39% on the Extended Yale B (38 classes), 50.22% on the AR (100 classes) and 75.47% on the COIL (50 classes) datasets.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 24","pages":"12764 - 12790"},"PeriodicalIF":3.4,"publicationDate":"2024-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600828","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SmartRAN: Smart Routing Attention Network for multimodal sentiment analysis","authors":"Xueyu Guo, Shengwei Tian, Long Yu, Xiaoyu He","doi":"10.1007/s10489-024-05839-7","DOIUrl":"10.1007/s10489-024-05839-7","url":null,"abstract":"<div><p>Multimodal sentiment analysis has received widespread attention from the research community in recent years; it aims to use information from different modalities to predict sentiment polarity. However, the model architecture of most existing methods is fixed, and data can only flow along an established path, which leads to poor generalization of the model to different types of data. Furthermore, most methods explore only intra- or intermodal interactions and do not combine the two. In this paper, we propose the <b>Smart</b> <b>R</b>outing <b>A</b>ttention <b>N</b>etwork (SmartRAN). SmartRAN can smartly select the data flow path on the basis of the smart routing attention module, effectively avoiding the disadvantages of poor adaptability and generalizability caused by a fixed model architecture. In addition, SmartRAN includes the learning process of both intra- and intermodal information, which can enhance the semantic consistency of comprehensive information and improve the learning ability of the model for complex relationships. 
Extensive experiments on two benchmark datasets, CMU-MOSI and CMU-MOSEI, prove that the proposed SmartRAN has superior performance to state-of-the-art models.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"54 24","pages":"12742 - 12763"},"PeriodicalIF":3.4,"publicationDate":"2024-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142600759","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}