Large-scale Video Panoptic Segmentation in the Wild: A Benchmark

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Pub Date: 2022-06-01. DOI: 10.1109/CVPR52688.2022.02036

Jiaxu Miao, Xiaohan Wang, Yu Wu, Wei Li, Xu Zhang, Yunchao Wei, Yi Yang

Citations: 37

Abstract

In this paper, we present a new large-scale dataset for the video panoptic segmentation task, which aims to assign semantic classes and track identities to all pixels in a video. As the ground truth for this task is difficult to annotate, previous datasets for video panoptic segmentation are limited either in scale or in the number of scenes. In contrast, our large-scale VIdeo Panoptic Segmentation in the Wild (VIPSeg) dataset provides 3,536 videos and 84,750 frames with pixel-level panoptic annotations, covering a wide range of real-world scenarios and categories. To the best of our knowledge, VIPSeg is the first attempt to tackle the challenging video panoptic segmentation task in the wild by considering diverse scenarios. Based on VIPSeg, we evaluate existing video panoptic segmentation approaches and propose an efficient and effective clip-based baseline method to analyze the dataset. Our dataset is available at https://github.com/VIPSeg-Dataset/VIPSeg-Dataset/.
