Conditional Diffusion Model for Skeleton-Based Gesture Recognition With Severe Occlusions
Authors: Jinting Liu; Minggang Gan; Yao Du; Keyi Guan; Jia Guo
IEEE Signal Processing Letters, vol. 32, pp. 1970-1974. Published 2025-04-23. DOI: 10.1109/LSP.2025.3563445
Impact factor: 3.2. JCR: Q2 (Engineering, Electrical & Electronic).
https://ieeexplore.ieee.org/document/10974577/
Occlusion remains a major challenge in skeleton-based gesture recognition: performance degrades significantly when key joints are occluded or disturbed. To tackle this issue, we propose DiffTrans, a practical conditional diffusion model for recognition under occlusion, which enables skeleton-based gesture recognition under heavy occlusion by generating more likely samples. This study frames the hand skeleton occlusion problem as a conditional denoising problem, where unoccluded data serve as observations and occluded data as repair targets. We employ a conditional diffusion model to impute the missing skeleton data and the transformer-based DSTANet model to learn the skeleton feature representations. Experimental results show that DiffTrans outperforms existing methods under various occlusion modes and maintains high performance even at high missing rates.
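The conditional denoising framing in the abstract can be illustrated with a minimal sketch. It is not the authors' DiffTrans implementation: it assumes a standard DDPM formulation with a linear noise schedule, a flat list of joint coordinates, and a boolean mask marking observed entries. All function names, including the `predict_noise` callable that stands in for a trained network, are hypothetical.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (a common DDPM default, assumed here)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def alpha_bars(betas):
    """Cumulative products of (1 - beta_t)."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


def noise_targets(coords, observed, a_bar, rng):
    """Forward step q(x_t | x_0) on occluded entries only.

    Observed entries stay clean so they can serve as the condition.
    Returns the partly noised coordinates and the noise that was added.
    """
    noisy, eps = [], []
    for c, o in zip(coords, observed):
        if o:
            noisy.append(c)
            eps.append(0.0)
        else:
            e = rng.gauss(0.0, 1.0)
            noisy.append(math.sqrt(a_bar) * c + math.sqrt(1.0 - a_bar) * e)
            eps.append(e)
    return noisy, eps


def masked_mse(pred_eps, true_eps, observed):
    """Noise-prediction loss averaged over occluded entries only."""
    errs = [(p - e) ** 2
            for p, e, o in zip(pred_eps, true_eps, observed) if not o]
    return sum(errs) / len(errs) if errs else 0.0


def impute(coords, observed, betas, predict_noise, rng):
    """DDPM-style ancestral sampling restricted to occluded entries.

    predict_noise(x, observed, t) is a hypothetical stand-in for a
    trained conditional denoiser; it returns one value per coordinate.
    """
    a_bars = alpha_bars(betas)
    # Occluded entries start from Gaussian noise; observed ones are kept.
    x = [c if o else rng.gauss(0.0, 1.0) for c, o in zip(coords, observed)]
    for t in reversed(range(len(betas))):
        eps_hat = predict_noise(x, observed, t)
        coef = betas[t] / math.sqrt(1.0 - a_bars[t])
        for i, o in enumerate(observed):
            if o:
                continue
            mean = (x[i] - coef * eps_hat[i]) / math.sqrt(1.0 - betas[t])
            noise = rng.gauss(0.0, 1.0) if t > 0 else 0.0
            x[i] = mean + math.sqrt(betas[t]) * noise
    return x
```

In this sketch the imputed coordinates would then be passed to a separate recognition model, which the abstract identifies as the transformer-based DSTANet. How the paper feeds the condition into its denoiser is not stated in the abstract, so the mask-based interface above is only one plausible choice.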
About the journal:
The IEEE Signal Processing Letters is a monthly, archival publication designed to provide rapid dissemination of original, cutting-edge ideas and timely, significant contributions in signal, image, speech, language and audio processing. Papers published in the Letters can be presented within one year of their appearance in signal processing conferences such as ICASSP, GlobalSIP and ICIP, and also in several workshops organized by the Signal Processing Society.