3D priors for scene learning from a single view
D. Rother, K. A. Patwardhan, I. Aganj, G. Sapiro
2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2008-06-23
https://doi.org/10.1109/CVPRW.2008.4563034

This work presents a framework for scene learning from a single still video camera. In particular, the camera transformation and the direction of the shadows are learned using information extracted from pedestrians walking in the scene. The proposed approach poses scene learning as a likelihood maximization problem, efficiently solved via factorization and dynamic programming, and amenable to an online implementation. We introduce a 3D prior to model the pedestrian's appearance from any viewpoint, and learn it using a standard off-the-shelf consumer video camera and the Radon transform. This 3D prior, or "appearance model", is used to quantify the agreement between the tentative parameters and the actual video observations, taking into account not only the pixels occupied by the pedestrian, but also those occupied by his shadows and/or reflections. The presentation of the framework is complemented with an example of a casual video scene showing the importance of the learned 3D pedestrian prior and the accuracy of the proposed approach.
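
The abstract names the Radon transform but does not say how it is applied to learn the 3D prior. For reference only: the transform maps an image to its line integrals at a given angle. A minimal pure-Python sketch of one such projection is given here, assuming a nearest-bin discretization. The function name `radon_projection` and the binning scheme are illustrative assumptions, not taken from the paper.

```python
import math


def radon_projection(image, theta_deg):
    """Sum pixel values along parallel lines at angle theta_deg.

    Illustrative discretization (an assumption, not the paper's method):
    each pixel is dropped into the nearest integer offset bin along the
    projection axis, measured from the image centre.
    """
    h = len(image)
    w = len(image[0])
    cx = (w - 1) / 2.0
    cy = (h - 1) / 2.0
    theta = math.radians(theta_deg)
    c, s = math.cos(theta), math.sin(theta)

    # Enough bins on each side of the centre to cover the image diagonal.
    half = int(math.ceil(math.hypot(w, h) / 2.0))
    bins = [0.0] * (2 * half + 1)

    for y in range(h):
        for x in range(w):
            v = image[y][x]
            if v:
                # Signed distance of the pixel from the centre, measured
                # along the direction (cos theta, sin theta).
                offset = (x - cx) * c + (y - cy) * s
                bins[int(math.floor(offset + 0.5)) + half] += v
    return bins


# A 3x3 toy silhouette: a vertical bar in the middle column.
bar = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]

print(radon_projection(bar, 0))   # all mass lands in the centre bin
print(radon_projection(bar, 90))  # mass spreads over three adjacent bins
```

At 0 degrees the bar collapses into a single bin, while at 90 degrees it spreads over three, so the projections taken together encode the shape's extent along each direction. Every projection preserves the total pixel mass, which makes a convenient sanity check.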