3D human pose estimation and action recognition using fisheye cameras: A survey and benchmark
Yahui Zhang, Shaodi You, Sezer Karaoglu, Theo Gevers
Pattern Recognition, volume 162, Article 111334. Published 2025-01-03.
DOI: 10.1016/j.patcog.2024.111334
URL: https://www.sciencedirect.com/science/article/pii/S0031320324010859
Impact factor: 7.5 (JCR Q1, Computer Science, Artificial Intelligence)
Citations: 0
Abstract
3D human pose estimation based on visual information aims to predict the 3D poses of humans in images or videos. Human action recognition aims to classify the actions people perform. Both topics are widely studied in computer vision.
Existing methods mainly address 3D human pose estimation and human action recognition using images or videos recorded by perspective cameras. In contrast to perspective cameras, fisheye cameras use wide-angle lenses that capture a wider field of view (FOV). Fisheye cameras are used in many applications, such as surveillance and autonomous driving.
This paper surveys monocular 3D human pose estimation and action recognition. A new benchmark dataset, recorded with a fisheye camera, is proposed to quantitatively compare and analyze existing methods.
About the journal:
The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.