Tiny-VPS: Tiny Video Panoptic Segmentation Standing on the Shoulder of Giant-VPS

IF 2.7 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC
Qingfeng Liu;Mostafa El-Khamy;Kee-Bong Song
IEEE Open Journal of Signal Processing, vol. 6, pp. 803-814, published 2025-06-20
DOI: 10.1109/OJSP.2025.3581840
https://ieeexplore.ieee.org/document/11045393/

Abstract

Video Panoptic Segmentation (VPS) is the most challenging video segmentation task: it requires accurately labeling every pixel in each frame, identifying the individual instances, and tracking them across frames. In this paper, we explore state-of-the-art solutions for VPS in both the giant-model regime, for offline or server processing, and the tiny-model regime, for online or edge computing. We designed Giant-VPS, which took first place in the 2024 Pixel Level Video Understanding in the Wild (PVUW) challenge. Giant-VPS builds on MinVIS and deploys the DINOv2-giant vision foundation model with a carefully designed ViT (Vision Transformer) adapter. For mobile and edge devices, we designed the Tiny-VPS model and show that our novel ViT-adapter distillation from the Giant-VPS model can further improve the accuracy of Tiny-VPS. Tiny-VPS is the first model in the sub-20 GFLOPS regime to achieve competitive accuracy on VPS and VSS (Video Semantic Segmentation) benchmarks.
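
The abstract does not spell out the distillation objective, so the following is only a minimal sketch of one common form of feature distillation, not the authors' method. It assumes the student's adapter features are linearly projected to the teacher's channel width and compared with a mean squared error. The function names, the projection matrix, and the toy shapes are all hypothetical and chosen for illustration.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]


def project(features: Matrix, weight: Matrix) -> Matrix:
    """Linearly map each student token (one row) to the teacher's channel width.

    `weight` has shape [student_dim][teacher_dim]. In training it would be a
    learned parameter; here it is passed in explicitly (an assumption).
    """
    teacher_dim = len(weight[0])
    return [
        [sum(tok[i] * weight[i][j] for i in range(len(tok))) for j in range(teacher_dim)]
        for tok in features
    ]


def feature_distillation_loss(student: Matrix, teacher: Matrix, weight: Matrix) -> float:
    """Mean squared error between projected student features and teacher features."""
    projected = project(student, weight)
    if len(projected) != len(teacher):
        raise ValueError("student and teacher must have the same number of tokens")
    total = 0.0
    count = 0
    for p_tok, t_tok in zip(projected, teacher):
        for p, t in zip(p_tok, t_tok):
            total += (p - t) ** 2
            count += 1
    return total / count


# Toy example: 2 tokens, student width 2, teacher width 3 (hypothetical sizes).
student_feats = [[1.0, 0.0], [0.0, 1.0]]
proj_weight = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
teacher_feats = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
loss = feature_distillation_loss(student_feats, teacher_feats, proj_weight)
```

In a real training loop a term like this would typically be added to the segmentation loss with a weighting factor, so the small model is pulled toward the large model's intermediate features while still fitting the labels.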