Dana36: A Multi-camera Image Dataset for Object Identification in Surveillance Scenarios

2012 IEEE Ninth International Conference on Advanced Video and Signal-Based Surveillance. Pub Date: 2012-09-18. DOI: 10.1109/AVSS.2012.33

J. Pers, Vildana Sulic Kenk, Rok Mandeljc, M. Kristan, S. Kovacic


Citations: 20

Abstract

We present a novel dataset for evaluating object matching and recognition methods in surveillance scenarios. The dataset consists of more than 23,000 images depicting 15 persons and nine vehicles. Ground truth data (the identity of each person or vehicle) is provided, along with the coordinates of the bounding box in the full camera image. The dataset was acquired from 36 stationary camera views using a variety of surveillance cameras, with resolutions ranging from standard VGA to three megapixels. Twenty-seven cameras observed the persons and vehicles in an outdoor environment, while the remaining nine observed the same persons indoors. The activity of the persons was planned in advance: they drive the cars to the parking lot, exit the cars, and walk around the building, through the main entrance, and up the stairs toward the first floor of the building. The intended use of the dataset is performance evaluation of computer vision methods that aim to (re)identify people and objects from many different viewpoints, in different environments, and under variable conditions. Because of the variety of camera locations, vantage points, and resolutions, the dataset provides a means to adjust the difficulty of the identification task in a controlled and documented manner. An interface for easy use of the dataset within Matlab is provided as well, and the data is complemented by baseline results obtained with a basic color histogram-based descriptor. While the cropped images of persons and vehicles are the primary data in our dataset, we also provide full-frame images and a set of tracklets for each object as a courtesy to the dataset users.
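The abstract mentions a baseline built on a basic color histogram-based descriptor but gives no details of it. The following is only a minimal sketch of what such a baseline could look like, not the authors' implementation: the bin count, the joint RGB binning, the Bhattacharyya distance, the nearest-neighbor matching, and the identity labels (`person_01`, `vehicle_03`) are all assumptions made for illustration, and the dataset's own interface is in Matlab rather than Python.

```python
from math import sqrt


def color_histogram(pixels, bins=8):
    """Joint RGB histogram of a cropped image, normalized to sum to 1.

    `pixels` is an iterable of (r, g, b) tuples with values in 0..255.
    The bin count is an illustrative assumption, not taken from the paper.
    """
    hist = [0.0] * (bins ** 3)
    count = 0
    for r, g, b in pixels:
        # Map each channel value to a bin index in 0..bins-1.
        ri, gi, bi = r * bins // 256, g * bins // 256, b * bins // 256
        hist[ri * bins * bins + gi * bins + bi] += 1.0
        count += 1
    if count == 0:
        raise ValueError("cannot build a histogram from an empty crop")
    return [h / count for h in hist]


def bhattacharyya_distance(h1, h2):
    """Distance in [0, 1] between two normalized histograms (0 = identical)."""
    coefficient = sum(sqrt(a * b) for a, b in zip(h1, h2))
    return sqrt(max(0.0, 1.0 - coefficient))


def identify(query_hist, gallery):
    """Return the gallery identity whose histogram is closest to the query.

    `gallery` maps an identity label to a reference histogram.
    """
    return min(
        gallery,
        key=lambda identity: bhattacharyya_distance(query_hist, gallery[identity]),
    )


# Synthetic stand-ins for cropped images; the labels are hypothetical.
gallery = {
    "person_01": color_histogram([(200, 10, 10)] * 50),
    "vehicle_03": color_histogram([(10, 10, 200)] * 50),
}
query = color_histogram([(210, 20, 5)] * 30)
print(identify(query, gallery))
```

A descriptor of this kind ignores spatial layout entirely, which is one reason it serves as a low baseline: two objects with similar overall colors seen from different cameras are easily confused.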
