International Journal of Machine Learning and Cybernetics: Latest Articles

A multi-strategy hybrid cuckoo search algorithm with specular reflection based on a population linear decreasing strategy
IF 5.6, Zone 3, Computer Science
International Journal of Machine Learning and Cybernetics Pub Date : 2024-07-05 DOI: 10.1007/s13042-024-02273-6
Chengtian Ouyang, Xin Liu, Donglin Zhu, Yangyang Zheng, Changjun Zhou, Chengye Zou
Abstract: The cuckoo search algorithm (CS), inspired by the nest-parasitic breeding behavior of cuckoos, has proved effective as a problem-solving approach in many fields since it was proposed. Nevertheless, CS still suffers from an imbalance between exploration and exploitation and tends to fall into local optima. In this paper, we propose a new hybrid cuckoo search algorithm (LHCS) based on a linearly decreasing population. First, to improve the local search of the algorithm and make it converge quickly, we mix in the solution-updating strategy of the Grey Wolf Optimizer (GWO) and use a linear decreasing rule to adjust the calling ratio of the strategy, balancing global exploration and local exploitation. Second, a specular reflection learning strategy enhances the algorithm's ability to jump out of local optima. Finally, a population linear decreasing strategy improves the convergence ability of the algorithm on different intervals and its ability to adapt population diversity. Experimental results on 29 benchmark functions from the CEC2017 test set show that LHCS is significantly superior to and more stable than the other algorithms when the quality of all solutions is considered together. To further verify the performance of the proposed algorithm, we applied it to engineering problems; the function tests and Wilcoxon test results show that the comprehensive performance of LHCS exceeds that of the other 14 state-of-the-art algorithms. Several engineering optimization problems verify the practicality and effectiveness of LHCS, and applying it to real engineering problems can greatly reduce the design cost.
Citations: 0
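The linear schedules and the GWO-style update named in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' LHCS implementation: the function names, the start and end values of each schedule, and how the pieces are combined are hypothetical. Only the position update follows the standard Grey Wolf Optimizer formulation.

```python
import random


def linear_decrease(start, end, t, t_max):
    """Linearly interpolate from `start` (at t = 0) to `end` (at t = t_max)."""
    frac = min(max(t / t_max, 0.0), 1.0)  # clamp progress to [0, 1]
    return start + (end - start) * frac


def population_size(n_init, n_min, t, t_max):
    """Population size that shrinks linearly over the run (assumed schedule)."""
    return round(linear_decrease(n_init, n_min, t, t_max))


def gwo_update(x, alpha, beta, delta, a, rng):
    """Standard GWO move of solution `x` toward the three best solutions.

    `a` is the GWO control parameter, usually decreased linearly from 2 to 0.
    """
    new_x = []
    for d in range(len(x)):
        candidates = []
        for leader in (alpha, beta, delta):
            A = 2 * a * rng.random() - a
            C = 2 * rng.random()
            candidates.append(leader[d] - A * abs(C * leader[d] - x[d]))
        new_x.append(sum(candidates) / 3)
    return new_x
```

The same `linear_decrease` helper could drive the calling ratio between the CS and GWO updates; which direction that ratio moves is not stated in the abstract, so it is left to the caller here.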
Low-dimensional intrinsic dimension reveals a phase transition in gradient-based learning of deep neural networks
IF 5.6, Zone 3, Computer Science
International Journal of Machine Learning and Cybernetics Pub Date : 2024-07-04 DOI: 10.1007/s13042-024-02244-x
Chengli Tan, Jiangshe Zhang, Junmin Liu, Zixiang Zhao
Abstract: Deep neural networks complete a feature extraction task by propagating the inputs through multiple modules. However, how the representations evolve with the gradient-based optimization remains unknown. Here we leverage the intrinsic dimension of the representations to study the learning dynamics and find that the training process undergoes a phase transition from expansion to compression under disparate training regimes. Surprisingly, this phenomenon is ubiquitous across a wide variety of model architectures, optimizers, and data sets. We demonstrate that the variation in the intrinsic dimension is consistent with the complexity of the learned hypothesis, which can be quantitatively assessed by the critical sample ratio that is rooted in adversarial robustness. Meanwhile, we mathematically show that this phenomenon can be analyzed in terms of the mutable correlation between neurons. Although the evoked activities obey a power-law decaying rule in biological circuits, we identify that the power-law exponent of the representations in deep neural networks predicts adversarial robustness well only at the end of training, not during the training process. Together, these results suggest that deep neural networks are prone to producing robust representations by adaptively eliminating or retaining redundancies. The code is publicly available at https://github.com/cltan023/learning2022.
Citations: 0
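To make "intrinsic dimension of the representations" concrete, here is a minimal sketch of one common estimator, the maximum-likelihood TwoNN estimate, which uses only the ratio of each point's second to first nearest-neighbor distance. The abstract does not say which estimator the authors use, so the choice of TwoNN and the function name `two_nn_dimension` are assumptions made for illustration.

```python
import math
import random


def two_nn_dimension(points):
    """Estimate intrinsic dimension as N / sum(log(r2 / r1)).

    r1 and r2 are each point's distances to its first and second
    nearest neighbors. Brute force, O(N^2 log N): a sketch, not tuned.
    """
    log_ratios = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        r1, r2 = dists[0], dists[1]
        if r1 > 0:  # skip exact duplicates, whose ratio is undefined
            log_ratios.append(math.log(r2 / r1))
    return len(log_ratios) / sum(log_ratios)
```

For example, a few hundred points sampled as `(t, 2 * t, -t)` with `t` drawn from `random.Random(0).random()` live in three coordinates but lie on a line, so the estimate is expected to come out near 1 rather than 3.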
A novel abstractive summarization model based on topic-aware and contrastive learning
IF 5.6, Zone 3, Computer Science
International Journal of Machine Learning and Cybernetics Pub Date : 2024-07-04 DOI: 10.1007/s13042-024-02263-8
Huanling Tang, Ruiquan Li, Wenhao Duan, Quansheng Dou, Mingyu Lu
Abstract: The majority of abstractive summarization models are designed based on the Sequence-to-Sequence (Seq2Seq) architecture. These models are able to capture syntactic and contextual information between words. However, Seq2Seq-based summarization models tend to overlook global semantic information. Moreover, there is an inconsistency between the objective function and the evaluation metrics of these models. To address these limitations, a novel model named ASTCL is proposed in this paper. It integrates a neural topic model into the Seq2Seq framework, aiming to capture the text's global semantic information and guide summary generation. Additionally, it incorporates contrastive learning techniques to mitigate the discrepancy between the objective loss and the evaluation metrics by scoring multiple candidate summaries. On the CNN/DM, XSum, and NYT datasets, the experimental results demonstrate that the ASTCL model outperforms the other generic models on the summarization task.
Citations: 0
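The idea of scoring multiple candidate summaries so that the training loss agrees with the evaluation metric can be illustrated with a pairwise margin ranking loss. The abstract does not give ASTCL's actual loss, so this is only a sketch of one commonly used form; the function name, the `margin` value, and the assumption that candidates arrive sorted from best to worst metric score are all hypothetical.

```python
def candidate_ranking_loss(scores, margin=0.01):
    """Pairwise margin loss over model scores of candidate summaries.

    `scores[i]` is the model's score for the i-th candidate, with candidates
    ordered from best to worst by the evaluation metric (e.g. ROUGE). A pair
    is penalized when the worse candidate is not scored lower by a margin
    that grows with the rank gap.
    """
    loss = 0.0
    for i in range(len(scores)):
        for j in range(i + 1, len(scores)):
            loss += max(0.0, scores[j] - scores[i] + (j - i) * margin)
    return loss
```

When the model already ranks the candidates in metric order with enough separation, the loss is zero; any inversion contributes a positive term.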
Undersampling based on generalized learning vector quantization and natural nearest neighbors for imbalanced data
IF 5.6, Zone 3, Computer Science
International Journal of Machine Learning and Cybernetics Pub Date : 2024-07-03 DOI: 10.1007/s13042-024-02261-w
Long-Hui Wang, Qi Dai, Jia-You Wang, Tony Du, Lifang Chen
Abstract: Imbalanced datasets can adversely affect classifier performance. Conventional undersampling approaches may lead to the loss of essential information, while oversampling techniques could introduce noise. To address this challenge, we propose an undersampling algorithm called GLNDU (Generalized Learning Vector Quantization and Natural Nearest Neighbors-based Undersampling). GLNDU uses Generalized Learning Vector Quantization (GLVQ) to compute the centroids of positive and negative instances. It also uses the concept of Natural Nearest Neighbors to identify majority-class instances in the overlapping region of the centroids of minority-class instances. These majority-class instances are then removed, resulting in a new balanced training dataset that is used to train a base classifier. We conduct extensive experiments on 29 publicly available datasets, evaluating performance using AUC and G_mean values. GLNDU demonstrates significant advantages over established methods such as SVM, CART, and KNN across different types of classifiers. Additionally, the results of the Friedman ranking and Nemenyi post-hoc test further support the experimental findings.
Citations: 0
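The two evaluation measures named in the abstract, G_mean and AUC, can be computed as in the sketch below. It assumes binary labels with 1 for the minority (positive) class and 0 for the majority class, which is the usual convention but is not stated in the abstract; the function names are illustrative.

```python
import math


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity (recall on 1) and specificity (recall on 0)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Unlike plain accuracy, G_mean drops to zero if either class is entirely misclassified, which is why it is popular for imbalanced data.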
A copy-move forgery detection technique using DBSCAN-based keypoint similarity matching
IF 5.6, Zone 3, Computer Science
International Journal of Machine Learning and Cybernetics Pub Date : 2024-07-03 DOI: 10.1007/s13042-024-02268-3
Soumya Mukherjee, Arup Kumar Pal, Soham Maji
Abstract: In an era marked by the contrast between information and disinformation, the ability to differentiate between authentic and manipulated images holds immense importance for both security professionals and the scientific community. Copy-move forgery is widely practiced and has thus emerged as a prevalent form of image manipulation among the different types of forgery. In this counterfeiting process, a region of an image is copied and pasted into different parts of the same image to hide or replicate objects. As copy-move forgery is hard to detect and localize, a swift and efficacious detection scheme based on keypoint detection is introduced. Localizing forged areas becomes especially difficult when the forged image is subjected to different post-processing and geometrical attacks. In this paper, a robust, translation-invariant, and efficient copy-move forgery detection technique is introduced. To achieve this goal, we developed an AKAZE-driven keypoint-based forgery detection technique. AKAZE is applied to the LL sub-band of the SWT-transformed image to extract translation-invariant features, rather than extracting them directly from the original image. We then use the DBSCAN clustering algorithm and a uniform quantizer on each cluster to form group pairs based on their feature descriptor values. To mitigate false positives, keypoint pairs must be separated by a distance greater than a predefined shift vector distance. This process forms a collection of keypoints within each cluster by leveraging their similarities in feature descriptors. Our clustering-based similarity-matching algorithm effectively locates the forged region. To assess the proposed scheme, we deploy it on different datasets with post-processing attacks including blurring, color reduction, contrast adjustment, brightness change, and noise addition. Our method even withstands geometrical manipulations such as rotation, skewing, and different affine transform attacks. Visual outcomes, numerical results, and comparative analysis show that the proposed model accurately detects the forged area with fewer false positives and is more computationally efficient than other methods.
Citations: 0
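The false-positive filter described in the abstract (keep a keypoint pair only if the two keypoints are far enough apart in the image) can be sketched as below. This is a simplified stand-in, not the authors' pipeline: it omits SWT, AKAZE, DBSCAN, and the uniform quantizer, treats descriptors as plain numeric vectors compared with Euclidean distance, and the names `match_keypoints`, `max_desc_dist`, and `min_shift` are hypothetical.

```python
import math


def match_keypoints(keypoints, descriptors, max_desc_dist, min_shift):
    """Return index pairs (i, j) of similar keypoints that are spatially apart.

    keypoints:   list of (x, y) image coordinates
    descriptors: list of numeric feature vectors, one per keypoint
    A pair is kept when the descriptors are close (likely copied content)
    and the keypoints are more than `min_shift` pixels apart, which rejects
    matches between neighboring points of the same untouched region.
    """
    pairs = []
    for i in range(len(keypoints)):
        for j in range(i + 1, len(keypoints)):
            desc_dist = math.dist(descriptors[i], descriptors[j])
            shift = math.dist(keypoints[i], keypoints[j])
            if desc_dist <= max_desc_dist and shift > min_shift:
                pairs.append((i, j))
    return pairs
```

In the scheme the abstract describes, this comparison would run within each DBSCAN cluster rather than over all keypoints, which keeps the number of candidate pairs small.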
Class-structure preserving multi-view correlated discriminant analysis for multiblock data
IF 5.6, Zone 3, Computer Science
International Journal of Machine Learning and Cybernetics Pub Date : 2024-07-02 DOI: 10.1007/s13042-024-02270-9
Sankar Mondal, Pradipta Maji
Abstract: With the rapid development of data acquisition methods, multiple data sources are now becoming available to explain different views of an object. This introduces several new challenges in integrating the high-dimensional, distinct, and heterogeneous views under the multi-view learning (MVL) framework. Multiset canonical correlation analysis (MCCA) is a popular subspace learning technique in MVL, which forms a common latent space by maximizing the pairwise correlation across all the views. However, MCCA does not utilize the class label information of the objects and is unable to handle data non-linearity. Although a few supervised extensions of MCCA exist, they lack productive use of intra-view and inter-view consistency and/or inconsistency information while using the class label. In this regard, a supervised subspace learning method, termed class-structure preserving multi-view correlated discriminant analysis (CSP-MvCDA), is proposed by judiciously integrating the merits of MCCA, linear discriminant analysis (LDA), and a locality preserving norm. The proposed method jointly optimizes the inter-set correlation across all the views and the intra-set discrimination in each view to obtain a common discriminative latent space, where the shared and complementary information across multiple views is exploited. The locality preserving norm with prior class labels helps to preserve the local class-structure of the data, while the LDA maintains its global class-structure. To show the effectiveness of the proposed method, several cancer and benchmark data sets are used. The experimental results establish that the proposed CSP-MvCDA method is superior to several state-of-the-art algorithms in terms of classification performance.
Citations: 0
The concept information of graph granule with application to knowledge graph embedding
IF 5.6, Zone 3, Computer Science
International Journal of Machine Learning and Cybernetics Pub Date : 2024-07-01 DOI: 10.1007/s13042-024-02267-4
Jiaojiao Niu, Degang Chen, Yinglong Ma, Jinhai Li
Abstract: Knowledge graph embedding (KGE) has become one of the most effective methods for the numerical representation of entities and their relations in knowledge graphs. Traditional methods primarily utilise triple facts, structured as (head entity, relation, tail entity), as the basic knowledge units in the learning process and use additional external information to improve the performance of models. Since triples are sometimes less than adequate and external information is not always available, obtaining structured internal knowledge from knowledge graphs (KGs) naturally becomes a feasible method for KGE learning. Motivated by this, this paper employs formal concept analysis (FCA) to mine deterministic concept knowledge in KGs and proposes a novel KGE model that takes the concept information into account. More specifically, triples sharing the same head entity are organised into knowledge structures named graph granules and then transformed into concept lattices, based on which a novel lattice-based KGE model (TransGr) is proposed for knowledge graph completion. TransGr assumes that entities and relations exist in different granules and uses a matrix (obtained by fusing concepts from the concept lattice) to quantitatively depict the graph granule. It then forces entities and relations to meet graph granule constraints when learning vector representations of KGs. Experiments on link prediction and triple classification demonstrate that the proposed TransGr is effective on datasets with relatively complete graph granules.
Citations: 0
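The first step described in the abstract, organising triples that share a head entity into graph granules, can be sketched as a simple grouping. The second function builds a formal context from one granule as a starting point for formal concept analysis; the abstract does not say how that context is encoded, so treating tail entities as objects and relations as attributes is an assumption, as are both function names.

```python
from collections import defaultdict


def build_graph_granules(triples):
    """Group (head, relation, tail) triples by their head entity."""
    granules = defaultdict(list)
    for head, relation, tail in triples:
        granules[head].append((relation, tail))
    return dict(granules)


def formal_context(granule):
    """Map each tail entity (object) to the set of relations (attributes)
    that link it to the granule's head. This encoding is an assumption."""
    context = defaultdict(set)
    for relation, tail in granule:
        context[tail].add(relation)
    return dict(context)
```

A concept lattice would then be derived from each context; that step, and the matrix TransGr fuses from the lattice, are beyond this sketch.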
Sequential attention layer-wise fusion network for multi-view classification
IF 5.6, Zone 3, Computer Science
International Journal of Machine Learning and Cybernetics Pub Date : 2024-07-01 DOI: 10.1007/s13042-024-02260-x
Qing Teng, Xibei Yang, Qiguo Sun, Pingxin Wang, Xun Wang, Taihua Xu
Abstract: Graph convolutional networks have shown excellent performance in multi-view classification. Currently, to output a fused node embedding representation in multi-view scenarios, existing studies tend to ensure the consistency of embedded node information among multiple views. However, they focus on immediate-neighbor information rather than multi-order node information, which can capture complex relationships and structures to enhance feature propagation. Furthermore, the embedded node information in each convolutional layer is not fully utilized, because consistency is frequently achieved only by the final convolutional layer. To tackle these limitations, we develop a new end-to-end multi-view learning architecture: Sequential attention Layer-wise Fusion Network for multi-view classification (SLFNet). Motivated by the fact that, for each view, multi-order node information is hidden in the multiple layer-wise node embedding representations, a set of sequential attentions can be calculated over those layers, which provides a novel fusion strategy from a multi-order perspective. The contributions of our architecture are: (1) capturing multi-order node information instead of using only the immediate neighbors, thereby obtaining more accurate node embedding representations; (2) designing a sequential attention module that allows adaptive learning of the node embedding representation for each layer, thereby attentively fusing these layer-wise node embedding representations. Our experiments, focusing on semi-supervised node classification tasks, highlight the superiority of SLFNet over state-of-the-art approaches. Results with deeper convolutional layers further confirm its effectiveness in addressing the over-smoothing problem.
Citations: 0
Federated learning-guided intrusion detection and neural key exchange for safeguarding patient data on the internet of medical things
IF 5.6, Zone 3, Computer Science
International Journal of Machine Learning and Cybernetics Pub Date : 2024-07-01 DOI: 10.1007/s13042-024-02269-2
Chongzhou Zhong, Arindam Sarkar, Sarbajit Manna, Mohammad Zubair Khan, Abdulfattah Noorwali, Ashish Das, Koyel Chakraborty
Abstract: To improve the security of the Internet of Medical Things (IoMT) in healthcare, this paper offers a Federated Learning (FL)-guided Intrusion Detection System (IDS) and an Artificial Neural Network (ANN)-based key exchange mechanism inside a blockchain framework. IDSs are essential for spotting network anomalies and taking preventative action to guarantee the secure and dependable functioning of IoMT systems. The suggested method integrates FL-IDS with a blockchain-based, ANN-based key exchange mechanism, providing several important benefits: (1) FL-based IDS creates a shared ledger that aggregates nearby weights and transmits historical weights that have been averaged, lowering computing effort, eliminating poisoning attacks, and improving data visibility and integrity throughout the shared database. (2) The system uses edge-based detection techniques to protect the cloud in the case of a security breach, enabling quicker threat recognition with less computational and processing resource usage. FL's effectiveness with fewer data samples plays a part in this benefit. (3) The bidirectional alignment of ANNs ensures a strong security framework and facilitates the production of keys inside the IoMT network on the blockchain. (4) Mutual learning approaches synchronize ANNs, making it easier for IoMT devices to distribute synchronized keys. (5) XGBoost and ANN models were tested on BoT-IoT datasets to gauge how successful the suggested method is. The findings show that ANN demonstrates greater performance and dependability when dealing with the heterogeneous data available in IoMT, such as ICU (Intensive Care Unit) data in the medical profession, compared to the alternative approaches studied. Overall, this method demonstrates increased security and performance, making it an appealing option for protecting IoMT systems, especially in demanding medical settings like ICUs.
Citations: 0
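The weight aggregation mentioned in benefit (1) of the abstract is, in most federated learning systems, a sample-weighted average of client model weights (federated averaging). The sketch below shows only that averaging step; it is an assumption that the paper's aggregation takes this form, and the blockchain ledger, poisoning defenses, and key exchange are not modeled. The function name and the flat-list weight representation are illustrative.

```python
def federated_average(client_weights, client_sizes):
    """Average client weight vectors, weighting each by its sample count.

    client_weights: list of equal-length lists of floats, one per client
    client_sizes:   number of local training samples held by each client
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[k] * n for w, n in zip(client_weights, client_sizes)) / total
        for k in range(n_params)
    ]
```

Only the weight vectors leave each device, which is what lets the raw patient data stay local.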
HOGFormer: high-order graph convolution transformer for 3D human pose estimation
IF 5.6, Zone 3, Computer Science
International Journal of Machine Learning and Cybernetics Pub Date : 2024-06-29 DOI: 10.1007/s13042-024-02262-9
Yuhong Xie, Chaoqun Hong, Weiwei Zhuang, Lijuan Liu, Jie Li
Abstract: The combination of graph convolution networks (GCN) and Transformers has shown promising results in 3D human pose estimation (HPE) tasks when lifting 2D poses to 3D. However, recent approaches to 3D HPE still face difficulties such as depth ambiguity and occlusion. To address these issues, we propose a novel 3D HPE architecture, termed High-Order Graph Convolution Transformer (HOGFormer). HOGFormer consists of three core components: the Chebyshev Graph Convolution (CGConv) module, the Graph-based Dynamic Adjacency Matrix Transformer (GDAMFormer) module, and the High-Order Graph Convolution (HOGConv) module. In more detail, the CGConv module further increases the estimation accuracy by approximating the graph convolution with Chebyshev polynomials. The GDAMFormer module efficiently addresses issues like self-occlusion and depth blur by using a dynamic adjacency matrix to represent the dynamic relationships among joints. The HOGConv module effectively extracts local features by capturing the local physical dependencies of skeleton connections. With the integration of these modules, the proposed architecture can effectively capture global and local information. We evaluate our architecture quantitatively and qualitatively on the popular benchmark dataset Human3.6M. Our experiments demonstrate that HOGFormer achieves state-of-the-art performance.
Citations: 0
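Approximating a graph convolution with Chebyshev polynomials, as the CGConv module is described as doing, usually means computing a weighted sum of T_k(L) x terms via the recurrence T_k = 2 L T_{k-1} - T_{k-2}. The sketch below shows that recurrence for a single scalar feature per node. It is the generic Chebyshev filter, not the paper's CGConv module: the function names are hypothetical, and `scaled_laplacian` is assumed to be already rescaled so its eigenvalues lie in [-1, 1].

```python
def mat_vec(matrix, vec):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def chebyshev_graph_conv(scaled_laplacian, x, theta):
    """Return sum_k theta[k] * T_k(L) x using the Chebyshev recurrence.

    x holds one scalar feature per node; theta holds the filter
    coefficients, so len(theta) - 1 is the polynomial order.
    """
    t_prev = list(x)                       # T_0(L) x = x
    out = [theta[0] * v for v in t_prev]
    if len(theta) == 1:
        return out
    t_curr = mat_vec(scaled_laplacian, x)  # T_1(L) x = L x
    out = [o + theta[1] * v for o, v in zip(out, t_curr)]
    for k in range(2, len(theta)):
        lt = mat_vec(scaled_laplacian, t_curr)
        t_next = [2 * a - b for a, b in zip(lt, t_prev)]  # T_k = 2 L T_{k-1} - T_{k-2}
        out = [o + theta[k] * v for o, v in zip(out, t_next)]
        t_prev, t_curr = t_curr, t_next
    return out
```

An order-K filter mixes information from joints up to K hops apart on the skeleton graph without ever forming powers of the Laplacian explicitly.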