Yangyang Deng, Jianxin Ma, Yajun Li, Min Zhang, Li Wang
{"title":"Ternary symmetric fusion network for camouflaged object detection","authors":"Yangyang Deng, Jianxin Ma, Yajun Li, Min Zhang, Li Wang","doi":"10.1007/s10489-023-04898-6","DOIUrl":"10.1007/s10489-023-04898-6","url":null,"abstract":"<div><p>Camouflaged object detection (COD) aims to locate objects that are “seamlessly” embedded in the surrounding environment. It is a challenging task due to the high intrinsic similarity between objects and their backgrounds, as well as the low boundary contrast between them. To address this problem, this paper proposes a new ternary symmetric fusion network (TSFNet), which detects camouflaged objects by fully fusing features of different levels and scales. Specifically, the proposed network contains two key modules: the location-attention search (LAS) module and the ternary symmetric interaction fusion (TSIF) module. The LAS module makes full use of contextual information to locate potential target objects from a global perspective while enhancing feature representation and guiding feature fusion. The TSIF module consists of three branches: two bilateral branches gather rich contextual information from multi-level features, and a middle branch provides fusion attention coefficients for the other two branches. This strategy effectively fuses information between low- and high-level features and thereby refines edge details. Experimental results show that the method is an effective COD model and outperforms existing models. 
Compared with the existing model SINetV2, TSFNet improves the weighted F-measure by 3.5% and the MAE by 8.1% on the COD10K dataset.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"53 21","pages":"25216 - 25231"},"PeriodicalIF":5.3,"publicationDate":"2023-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71909013","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Kaide Huang, Wentao Dong, Jie Li, Yuanyuan Chen, Jie Zhong, Zhang Yi
{"title":"GFF-Net: Graph-based feature fusion network for diagnosing plus disease in retinopathy of prematurity","authors":"Kaide Huang, Wentao Dong, Jie Li, Yuanyuan Chen, Jie Zhong, Zhang Yi","doi":"10.1007/s10489-023-04766-3","DOIUrl":"10.1007/s10489-023-04766-3","url":null,"abstract":"<div><p>Retinopathy of prematurity (ROP) is a retinal proliferative disorder, and it is the primary cause of childhood blindness. Accurate and convenient automatic diagnostic tools are required to assist ophthalmologists in diagnosing ROP. Existing methods only extract information from fundus images captured from the posterior angle, while images captured from other angles are ignored, which limits their performance. In this paper, we propose a graph-based feature fusion network (GFF-Net) that jointly analyzes multiple images and makes full use of the relevant information between these images to diagnose plus disease in ROP. The convolutional features of different fundus images are connected into a graph, where the edges of the graph model the correlation between these images. A graph-based feature fusion module is proposed to aggregate features from the constructed feature graph and produce the final prediction. We compared the proposed GFF-Net with state-of-the-art methods on a clinical dataset and a low-quality “attack dataset”. The GFF-Net achieved superior performance compared to other methods on both datasets. 
The results show that the proposed GFF-Net could be more effective than existing methods in clinical practice.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"53 21","pages":"25259 - 25281"},"PeriodicalIF":5.3,"publicationDate":"2023-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71909010","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"STRAN: Student expression recognition based on spatio-temporal residual attention network in classroom teaching videos","authors":"Zheng Chen, Meiyu Liang, Zhe Xue, Wanying Yu","doi":"10.1007/s10489-023-04858-0","DOIUrl":"10.1007/s10489-023-04858-0","url":null,"abstract":"<div><p>To assess students’ attentiveness in class objectively and accurately, we can infer students’ emotions from their expressions in class and cognitive feedback from their behaviors in class, and then integrate the two to obtain a comprehensive assessment of classroom status. However, when capturing students’ classroom expressions, the major problem is how to accurately and efficiently extract expression features along the temporal and spatial dimensions of classroom videos. To solve this problem, we propose a classroom expression recognition model based on a spatio-temporal residual attention network (STRAN), which extracts facial expression features through convolution in both the temporal and spatial dimensions under limited resources, while aiming for minimal time consumption and optimal performance. Specifically, STRAN first uses a residual network with three-dimensional convolution to address the network degradation that occurs as the depth of a convolutional neural network increases, which also accelerates the convergence of the whole network at the same number of layers. Second, a spatio-temporal attention mechanism is introduced so that the network can effectively focus on the important video frames and the key areas within the frames. To enhance the comprehensiveness and correctness of the final classroom evaluation results, we use a deep convolutional neural network to capture students’ behaviors while obtaining their classroom expressions. Then, an intelligent classroom state assessment method (Weight_classAssess) combining students’ expressions and behaviors is proposed to evaluate the classroom state. 
Finally, on the basis of the public datasets CK+ and FER2013, we construct two more comprehensive synthetic datasets CK+_Class and FER2013_Class, which are more suitable for the scene of classroom teaching, by adding some collected video sequences of students in class and images of students’ expressions in class. The proposed method is compared with the existing methods, and the results show that STRAN can achieve 93.84% and 80.45% facial expression recognition rates on CK+ and CK+_Class datasets, respectively. The accuracy rate of classroom intelligence assessment of students based on Weight_classAssess also reaches 78.19%, which proves the effectiveness of the proposed method.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"53 21","pages":"25310 - 25329"},"PeriodicalIF":5.3,"publicationDate":"2023-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71909012","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhanced generative adversarial networks for bearing imbalanced fault diagnosis of rotating machinery","authors":"Yandong Hou, Jiulong Ma, Jinjin Wang, Tianzhi Li, Zhengquan Chen","doi":"10.1007/s10489-023-04870-4","DOIUrl":"10.1007/s10489-023-04870-4","url":null,"abstract":"<p>Traditional rolling bearing fault diagnosis approaches require a large amount of fault data in advance, while some specific fault data is difficult to obtain in engineering scenarios. This imbalanced fault data problem seriously affects the accuracy of fault diagnosis. To improve the accuracy under imbalanced data conditions, we propose a novel data augmentation method, Enhanced Generative Adversarial Networks with a Data Selection Module (EGAN-DSM). First, a network enhancement module is designed, which quantifies the antagonism between the generator and discriminator through the loss value and determines whether to iteratively enhance the networks with weak adversarial ability. Second, a Data Selection Module (DSM) is constructed using Hilbert space distance to screen generated data, and the screened data is mixed with the original imbalanced data to reconstruct balanced data sets. Then, a Deep Convolutional Neural Network with Wide First-layer Kernels (WDCNN) is used for fault diagnosis. Finally, the method is verified on data measured on a rotating machine experimental platform. 
The results show that our method has high fault diagnosis accuracy under the condition of imbalanced data.</p>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"53 21","pages":"25201 - 25215"},"PeriodicalIF":5.3,"publicationDate":"2023-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71909014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint learning of graph and latent representation for unsupervised feature selection","authors":"Xijiong Xie, Zhiwen Cao, Feixiang Sun","doi":"10.1007/s10489-023-04893-x","DOIUrl":"10.1007/s10489-023-04893-x","url":null,"abstract":"<div><p>Data samples in real-world applications are not only related to high-dimensional features, but also related to each other. To fully exploit the interconnection between data samples, some recent methods embed latent representation learning into unsupervised feature selection and have proven effective. Despite their superior performance, we observe that existing methods first predefine a similarity graph, and then perform latent representation learning based feature selection with this graph. Since the fixed graph is obtained from the original feature space, which contains noisy features, and the graph construction process is independent of the feature selection task, the predefined graph is unreliable and ultimately hinders the efficiency of feature selection. To solve this problem, we propose joint learning of graph and latent representation for unsupervised feature selection (JGLUFS). Different from previous methods, we integrate adaptive graph construction into a feature selection method based on latent representation learning, which not only reduces the impact of external conditions on the quality of the graph but also strengthens the connection between graph learning and latent representation learning to benefit the feature selection task. The three basic tasks, namely graph learning, latent representation learning and feature selection, cooperate with each other and lead to a better solution. An efficient algorithm with guaranteed convergence is carefully designed to solve the resulting optimization problem. 
Extensive clustering experiments verify the competitiveness of JGLUFS compared to several state-of-the-art algorithms.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"53 21","pages":"25282 - 25295"},"PeriodicalIF":5.3,"publicationDate":"2023-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71909009","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A two-stage progressive shadow removal network","authors":"Zile Xu, Xin Chen","doi":"10.1007/s10489-023-04856-2","DOIUrl":"10.1007/s10489-023-04856-2","url":null,"abstract":"<div><p>Removing image shadows has been a challenging task in computer vision due to their diversity and complexity. Shadow removal techniques have been greatly enhanced by deep learning and shadow image datasets, but state-of-the-art methods generally consider only the information of the shadow and its neighborhood, ignoring the correlation of the features between the shadow and non-shadow regions. As a result, the output image shows poor overall consistency and an unnatural boundary between the original shadow and non-shadow areas. To obtain a consistent and natural shadow removal result, a two-stage progressive shadow removal network is proposed. The first stage uses a multi-exposure fusion network (MEFN) to roughly recover the shadow region features, while the second stage uses a fine-recovery network (FRN) to extract the correlation among the global image contexts, accompanied by a detail feature fusion step. This coarse-to-fine process improves the overall effect of shadow removal in terms of image quality and boundary consistency. 
Extensive experiments on the widely used ISTD, ISTD+ and SRD datasets show that the proposed shadow removal network outperforms most of the state-of-the-art methods.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"53 21","pages":"25296 - 25309"},"PeriodicalIF":5.3,"publicationDate":"2023-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71909008","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unsupervised nested Dirichlet finite mixture model for clustering","authors":"Fares Alkhawaja, Nizar Bouguila","doi":"10.1007/s10489-023-04888-8","DOIUrl":"10.1007/s10489-023-04888-8","url":null,"abstract":"<div><p>The Dirichlet distribution is widely used in the context of mixture models. Despite its flexibility, it still suffers from some limitations, such as its restrictive covariance matrix and the direct proportionality between its mean and variance. In this work, a generalization of the Dirichlet distribution, namely the Nested Dirichlet distribution, is introduced in the context of finite mixture models, providing more flexibility and overcoming the mentioned drawbacks thanks to its hierarchical structure. The model learning is based on the generalized expectation-maximization algorithm, where parameters are initialized with the method of moments and estimated through the iterative Newton-Raphson method. Moreover, the minimum message length criterion is proposed to determine the best number of components that describe the data clusters by the finite mixture model. The Nested Dirichlet distribution is proven to be part of the exponential family, which offers several advantages, such as the calculation of several probabilistic distances in closed form. The performance of the Nested Dirichlet mixture model is compared to the Dirichlet mixture model, the generalized Dirichlet mixture model, and a Convolutional Neural Network as a deep learning baseline. The proposed framework is validated through this comparison on challenging datasets. 
The hierarchical feature of the model is applied to real-world challenging tasks such as hierarchical cluster analysis and hierarchical feature learning, showing a significant improvement in terms of accuracy.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"53 21","pages":"25232 - 25258"},"PeriodicalIF":5.3,"publicationDate":"2023-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71909011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SSKGE: a time-saving knowledge graph embedding framework based on structure enhancement and semantic guidance","authors":"Tao Wang, Bo Shen, Yu Zhong","doi":"10.1007/s10489-023-04896-8","DOIUrl":"10.1007/s10489-023-04896-8","url":null,"abstract":"<div><p>In knowledge graph embedding, an attempt is made to embed the objective facts and relationships expressed in the form of triplets into multidimensional vector space, facilitating various applications, such as link prediction and question answering. Structure embedding models focus on the graph structure while the importance of language semantics in inferring similar entities and relations is ignored. Semantic embedding models use pretrained language models to learn entity and relation embeddings based on text information, but they do not fully exploit graph structures that reflect relation patterns and mapping attributes. Structure and semantic information in knowledge graphs represent different hierarchical properties that are indispensable for comprehensive knowledge representation. In this paper, we propose a general knowledge graph embedding framework named SSKGE, which considers both the graph structure and language semantics and learns these two complementary characteristics to integrate entity and relation representations. To compensate for semantic embedding approaches that ignore the graph structure, we first design a structure loss function to explicitly model the graph structure attributes. Second, we leverage a pretrained language model that has been fine-tuned by the structure loss to guide the structure embedding approaches in enhancing the semantic information they lack and obtaining universal knowledge representations. Specifically, guidance is provided by a distance function that makes the spatial distribution of the two types of graph embeddings have a certain similarity. SSKGE significantly reduces the time cost of using a pretrained language model to complete a knowledge graph. 
Common knowledge graph embedding models such as TransE, DistMult, ComplEx, RotatE, PairRE, and HousE have achieved better results with multiple datasets, including FB15k, FB15k-237, WN18, and WN18RR, using the SSKGE framework. Extensive experiments and analyses have verified the effectiveness and practicality of SSKGE.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"53 21","pages":"25171 - 25183"},"PeriodicalIF":5.3,"publicationDate":"2023-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71909070","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Self-supervised robust Graph Neural Networks against noisy graphs and noisy labels","authors":"Jinliang Yuan, Hualei Yu, Meng Cao, Jianqing Song, Junyuan Xie, Chongjun Wang","doi":"10.1007/s10489-023-04836-6","DOIUrl":"10.1007/s10489-023-04836-6","url":null,"abstract":"<div><p>In this paper, we explore the novel problem of training robust Graph Neural Networks (GNNs) against noisy graphs and noisy labels. To address this problem, we propose a general Self-supervised Robust Graph Neural Network framework that consists of three modules: graph structure learning, sample selection, and self-supervised learning. Specifically, we first employ a graph structure learning approach to obtain an optimal graph structure. Next, using this structure, we use a clustering algorithm to generate pseudo-labels that represent the clusters. We then design a sample selection strategy based on these pseudo-labels to select nodes with clean labels. Additionally, we introduce a self-supervised learning technique where low-level layer parameters are shared with the GNNs to predict pseudo-labels. We jointly train the graph structure learning module, the GNN model, and the self-supervised model. 
Finally, we conduct extensive experiments on four real-world datasets, demonstrating the superiority of our methods compared with state-of-the-art methods for semi-supervised node classification under noisy graphs and noisy labels.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"53 21","pages":"25154 - 25170"},"PeriodicalIF":5.3,"publicationDate":"2023-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71909069","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multiview learning of homogeneous neighborhood of nodes for the node representation of heterogeneous graph","authors":"Dongjie Li, Dong Li, Hao Liu","doi":"10.1007/s10489-023-04907-8","DOIUrl":"10.1007/s10489-023-04907-8","url":null,"abstract":"<div><p>Multiview learning has caught the interest of many graph researchers because it can learn richer information about graphs from different views. Recently, multiview learning, as a novel learning paradigm, has been widely applied to learn node representations of heterogeneous graphs, as in MVSE and HeMI. However, these methods only utilize the local homogeneous neighborhood information of nodes, which degrades the quality of node representations. Heterogeneous graph representation aims to drive the representation of a node to be near the homogeneous neighbors that are similar to it in the heterogeneous graph and far away from heterogeneous neighbors. Besides, in a heterogeneous graph, linked nodes are more likely to be dissimilar, while remote nodes may have some similarities. Therefore, we can look beyond the locality of a node to discover more homogeneous neighbors’ information and improve the quality of node representation. In this work, we propose an unsupervised heterogeneous graph embedding technique that is simple yet efficient, and devise a systematic way to learn node embeddings from the local and global views of the homogeneous neighborhood of nodes by introducing a regularization framework that minimizes the disagreements between the local and global node embeddings under a specific meta-path. Inspired by Personalized PageRank graph diffusion, we use an infinite meta-path-based random walk with restart to obtain global homogeneous neighbors of nodes and construct a meta-path-based diffusion matrix to represent the relation between global homogeneous neighbors and nodes. Finally, we employ mini-batch gradient descent to train our model to reduce computational cost. 
Experimental findings demonstrate that our approach outperforms a wide variety of baselines on different datasets when it comes to node classification and node clustering tasks, with a particularly impressive 7.22% improvement over the best baseline on the ACM dataset.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"53 21","pages":"25184 - 25200"},"PeriodicalIF":5.3,"publicationDate":"2023-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"71909071","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}