A keyframe weighted dual-channel attention GCN model for human skeleton motion prediction
Authors: Wenwen Zhang, Jianfeng Tu, Siyu Li, Lingfeng Liu
Journal: Applied Intelligence, volume 55, issue 7 (published 2025-04-21)
DOI: 10.1007/s10489-025-06532-z
URL: https://link.springer.com/article/10.1007/s10489-025-06532-z
Accurate prediction of human skeletal motion sequences is critical for human activity analysis and low-latency motion reconstruction applications. While many studies focus on frame-by-frame prediction model designs, the keyframes in a motion sequence may contain more spatial-temporal information than the other frames do. To address the importance of keyframes, this work introduces a heterogeneous keyframe selection and fusion method that discriminates the importance of different motion frames in the historical observations used for prediction. Specifically, we propose an adaptive keyframe selection algorithm to iteratively select the keyframes and a nonlinear heterogeneous interpolation method to reconstruct the transitional frames. By merging the selected keyframes and reconstructed transitional frames with the original motion sequence, the semantics of the original motion are preserved and the importance of the keyframes is highlighted. A graph convolutional network (GCN) with dual-channel attention is designed for prediction; the attention incorporates motion patterns from longer-term historical records to improve motion feature exploration. A comprehensive evaluation on the Human3.6M and AMASS datasets shows significant improvement over state-of-the-art methods in long-term motion prediction (\(\ge\) 320 ms) in terms of the 3D mean per joint position error (MPJPE).
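The abstract reports results as MPJPE without spelling out the metric. The following is a minimal sketch, assuming the commonly used definition: the Euclidean distance between each predicted 3D joint position and its ground-truth position, averaged over all joints and frames. The function name `mpjpe` and the nested-list layout `[frames][joints][3]` are illustrative assumptions, not the authors' evaluation code, and benchmarks may instead report the error at individual prediction horizons.

```python
import math


def mpjpe(pred, target):
    """Mean per joint position error between two pose sequences.

    pred, target: nested sequences shaped [frames][joints][3],
    expressed in the same unit (for example millimetres).
    Layout and name are assumptions made for illustration.
    """
    if len(pred) != len(target):
        raise ValueError("sequences must have the same number of frames")

    total = 0.0
    count = 0
    for frame_pred, frame_true in zip(pred, target):
        if len(frame_pred) != len(frame_true):
            raise ValueError("frames must have the same number of joints")
        for joint_pred, joint_true in zip(frame_pred, frame_true):
            # Euclidean distance between predicted and true joint position.
            total += math.dist(joint_pred, joint_true)
            count += 1

    if count == 0:
        raise ValueError("sequences must contain at least one joint")
    return total / count


# One frame with two joints: the first joint is off by 5 units,
# the second joint is predicted exactly.
target = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]
pred = [[(3.0, 4.0, 0.0), (1.0, 0.0, 0.0)]]
error = mpjpe(pred, target)
```

Here the per-joint distances are 5 and 0, so `error` is their mean, 2.5. A lower MPJPE means the predicted skeleton lies closer to the ground truth.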
About the journal:
With a focus on research in artificial intelligence and neural networks, this journal addresses issues involving solutions of real-life manufacturing, defense, management, government and industrial problems which are too complex to be solved through conventional approaches and require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance.
The journal presents new and original research and technological developments, addressing real and complex issues applicable to difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.