Sujitha Martin, Eshed Ohn-Bar, Ashish Tawari, M. Trivedi
**Understanding head and hand activities and coordination in naturalistic driving videos**
DOI: 10.1109/IVS.2014.6856610
Published in: 2014 IEEE Intelligent Vehicles Symposium Proceedings
Publication date: 2014-06-08
Citations: 40
Abstract
In this work, we propose a vision-based analysis framework for recognizing in-vehicle activities such as interactions with the steering wheel, the instrument cluster, and the gear. The framework leverages two views for activity analysis: a camera looking at the driver's hands and another looking at the driver's head. The proposed techniques allow researchers to extract `mid-level' information from video, i.e., information that captures some semantic understanding of the scene but may still require an expert to distinguish difficult cases or to leverage the cues for drive analysis. In contrast, `low-level' video is large in quantity and cannot be used unless processed entirely by an expert. By minimizing manual labor, this work helps researchers benefit more fully from the accessibility of the data and enables larger-scale studies.