Zeyd Boukhers, Kimiaki Shirahama, Frédéric Li, M. Grzegorzek
Published in: 2015 13th International Workshop on Content-Based Multimedia Indexing (CBMI), 2015-06-10. DOI: 10.1109/CBMI.2015.7153632 (https://doi.org/10.1109/CBMI.2015.7153632)
Object detection and depth estimation for 3D trajectory extraction
Detecting an event defined by the interaction of objects in a video requires capturing their spatio-temporal relations. A video, however, shows only the projection of the original 3D space onto a 2D image plane. This paper introduces a method that extracts 3D trajectories of objects from 2D videos, where each trajectory represents the transition of an object's position in 3D space. We extract such trajectories by combining object detection with depth estimation, which estimates depth information in 2D videos. The major problem is inconsistency between the object detection and depth estimation results. For example, significantly different depths may be estimated within the region of a single object, and an object region that is appropriately shaped by the estimated depths may be missed. To overcome this, we first initialise the 3D position of an object by selecting the frame with the highest consistency between the object detection and depth estimation results. We then track the object in 3D space using a particle filter, in which the object's 3D position is modelled as a hidden state that generates its 2D visual appearance. Experimental results demonstrate the effectiveness of our method.
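
The two steps named in the abstract (initialisation from the most consistent frame, then particle-filter tracking of a hidden 3D position) can be illustrated with a minimal sketch. It is not the authors' implementation. Everything specific here is an assumption made for illustration: the pinhole intrinsics `FOCAL`, `CX`, `CY`; the consistency score (lowest variance of depths inside the detected region); the random-walk motion model; and the Gaussian likelihood over an observation `(u, v, depth)`, which stands in for the paper's 2D visual appearance model.

```python
import math
import random
import statistics

# Hypothetical pinhole intrinsics in pixels; not taken from the paper.
FOCAL, CX, CY = 500.0, 320.0, 240.0


def project(point):
    """Project a 3D camera-space point (x, y, z) to a pixel (u, v)."""
    x, y, z = point
    return (FOCAL * x / z + CX, FOCAL * y / z + CY)


def back_project(u, v, depth):
    """Lift a pixel and its estimated depth to a 3D point."""
    return ((u - CX) * depth / FOCAL, (v - CY) * depth / FOCAL, depth)


def select_init_frame(depths_per_frame):
    """Return the index of the frame whose depths inside the detected
    region agree best (assumed score: lowest population variance)."""
    return min(range(len(depths_per_frame)),
               key=lambda i: statistics.pvariance(depths_per_frame[i]))


def particle_filter_step(particles, observation, rng,
                         motion_std=0.1, pixel_std=5.0, depth_std=0.1):
    """One predict / weight / resample cycle.

    particles: list of (x, y, z) hidden states.
    observation: (u, v, depth) from the detector and depth estimator.
    """
    obs_u, obs_v, obs_depth = observation
    # Predict: random-walk motion model; keep depth in front of the camera.
    moved = [(x + rng.gauss(0.0, motion_std),
              y + rng.gauss(0.0, motion_std),
              max(0.1, z + rng.gauss(0.0, motion_std)))
             for x, y, z in particles]
    # Weight: Gaussian likelihood of the observation given each particle.
    weights = []
    for p in moved:
        u, v = project(p)
        err = (((u - obs_u) ** 2 + (v - obs_v) ** 2) / pixel_std ** 2
               + (p[2] - obs_depth) ** 2 / depth_std ** 2)
        weights.append(math.exp(-0.5 * err))
    if sum(weights) <= 0.0:  # every particle implausible: fall back to uniform
        weights = [1.0] * len(moved)
    # Resample in proportion to the weights, then average for the estimate.
    resampled = rng.choices(moved, weights=weights, k=len(moved))
    estimate = tuple(sum(c) / len(resampled) for c in zip(*resampled))
    return resampled, estimate


def track(init_observation, observations, n_particles=500, seed=0):
    """Track one object; returns the estimated 3D position per frame."""
    rng = random.Random(seed)
    start = back_project(*init_observation)
    particles = [tuple(c + rng.gauss(0.0, 0.1) for c in start)
                 for _ in range(n_particles)]
    trajectory = []
    for obs in observations:
        particles, estimate = particle_filter_step(particles, obs, rng)
        trajectory.append(estimate)
    return trajectory
```

In this sketch, `select_init_frame` would choose the frame used to build `init_observation`, and `track` would then be run over the per-frame observations. A random-walk motion model lags behind steadily moving objects, so a real tracker would likely add velocity to the hidden state.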