{"title":"Hand Detection using Deformable Part Models on an Egocentric Perspective","authors":"Sergio R. Cruz, Antoni B. Chan","doi":"10.1109/DICTA.2018.8615781","DOIUrl":null,"url":null,"abstract":"The egocentric perspective is a recent perspective brought by new devices like the GoPro and Google Glass, which are becoming more available to the public. The hands are the most consistent objects in the egocentric perspective and they can represent more information about people and their activities, but the nature of the perspective and the ever changing shape of the hands makes them difficult to detect. Previous work has focused on indoor environments or controlled data since it brings simpler ways to approach it, but in this work we use data with changing background and variable illumination, which makes it more challenging. We use a Deformable Part Model based approach to generate hand proposals since it can handle the many gestures the hand can adopt and rivals other techniques on locating the hands while reducing the number of proposals. We also use the location where the hands appear and size in the image to reduce the number of detections. Finally, a CNN classifier is applied to remove the final false positives to generate the hand detections.","PeriodicalId":130057,"journal":{"name":"2018 Digital Image Computing: Techniques and Applications (DICTA)","volume":"115 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 Digital Image Computing: Techniques and Applications (DICTA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DICTA.2018.8615781","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 7
Abstract
The egocentric perspective is a recent viewpoint brought about by new devices such as the GoPro and Google Glass, which are becoming increasingly available to the public. Hands are the most consistent objects in the egocentric perspective and can convey rich information about people and their activities, but the nature of the perspective and the ever-changing shape of the hands make them difficult to detect. Previous work has focused on indoor environments or controlled data, since these simplify the problem; in this work we use data with changing backgrounds and variable illumination, which makes the task more challenging. We use a Deformable Part Model (DPM) based approach to generate hand proposals, since it can handle the many gestures a hand can adopt and rivals other techniques at locating hands while reducing the number of proposals. We also use the location and size at which hands typically appear in the image to further reduce the number of detections. Finally, a CNN classifier is applied to remove the remaining false positives and produce the final hand detections.
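The three-stage pipeline the abstract describes (DPM proposals, a location/size prior, then CNN verification) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: `dpm_proposals` and `cnn_hand_score` are hypothetical callables standing in for the trained DPM detector and CNN classifier, and the prior thresholds are assumed values rather than those used in the paper.

```python
import numpy as np


def filter_by_prior(boxes, img_w, img_h,
                    min_rel_area=0.005, max_rel_area=0.5,
                    min_rel_cy=0.3):
    """Keep proposals whose size and position are plausible for hands in an
    egocentric view (hands tend to appear toward the lower part of the frame
    and occupy a limited fraction of the image).  Thresholds here are
    illustrative assumptions, not the paper's values."""
    kept = []
    for (x1, y1, x2, y2, score) in boxes:
        w, h = x2 - x1, y2 - y1
        rel_area = (w * h) / float(img_w * img_h)
        rel_cy = ((y1 + y2) / 2.0) / float(img_h)
        if min_rel_area <= rel_area <= max_rel_area and rel_cy >= min_rel_cy:
            kept.append((x1, y1, x2, y2, score))
    return kept


def detect_hands(image, dpm_proposals, cnn_hand_score, cnn_threshold=0.5):
    """Pipeline sketch: DPM proposals -> location/size prior -> CNN verification.

    `dpm_proposals(image)` is assumed to return a list of (x1, y1, x2, y2, score)
    boxes; `cnn_hand_score(crop)` is assumed to return a hand probability in [0, 1].
    Both are hypothetical stand-ins for the trained models."""
    img_h, img_w = image.shape[:2]
    proposals = dpm_proposals(image)                      # candidate hand boxes from the DPM
    proposals = filter_by_prior(proposals, img_w, img_h)  # discard implausible locations/sizes
    detections = []
    for (x1, y1, x2, y2, score) in proposals:
        crop = image[int(y1):int(y2), int(x1):int(x2)]
        if crop.size and cnn_hand_score(crop) >= cnn_threshold:  # drop remaining false positives
            detections.append((x1, y1, x2, y2, score))
    return detections
```

Passing the detector and classifier in as callables keeps the sketch independent of any particular DPM or CNN library; in practice each stage would be a trained model applied to the frame or to the cropped proposal.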