太阳和天空下的阴影检测视频实例。

Zhenghao Xing;Tianyu Wang;Xiaowei Hu;Haoran Wu;Chi-Wing Fu;Pheng-Ann Heng
{"title":"太阳和天空下的阴影检测视频实例。","authors":"Zhenghao Xing;Tianyu Wang;Xiaowei Hu;Haoran Wu;Chi-Wing Fu;Pheng-Ann Heng","doi":"10.1109/TIP.2024.3468877","DOIUrl":null,"url":null,"abstract":"Instance shadow detection, crucial for applications such as photo editing and light direction estimation, has undergone significant advancements in predicting shadow instances, object instances, and their associations. The extension of this task to videos presents challenges in annotating diverse video data and addressing complexities arising from occlusion and temporary disappearances within associations. In response to these challenges, we introduce ViShadow, a semi-supervised video instance shadow detection framework that leverages both labeled image data and unlabeled video data for training. ViShadow features a two-stage training pipeline: the first stage, utilizing labeled image data, identifies shadow and object instances through contrastive learning for cross-frame pairing. The second stage employs unlabeled videos, incorporating an associated cycle consistency loss to enhance tracking ability. A retrieval mechanism is introduced to manage temporary disappearances, ensuring tracking continuity. The SOBA-VID dataset, comprising unlabeled training videos and labeled testing videos, along with the SOAP-VID metric, is introduced for the quantitative evaluation of VISD solutions. The effectiveness of ViShadow is further demonstrated through various video-level applications such as video inpainting, instance cloning, shadow editing, and text-instructed shadow-object manipulation.","PeriodicalId":94032,"journal":{"name":"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society","volume":"33 ","pages":"5715-5726"},"PeriodicalIF":0.0000,"publicationDate":"2024-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Video Instance Shadow Detection Under the Sun and Sky\",\"authors\":\"Zhenghao Xing;Tianyu Wang;Xiaowei Hu;Haoran Wu;Chi-Wing Fu;Pheng-Ann Heng\",\"doi\":\"10.1109/TIP.2024.3468877\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Instance shadow detection, crucial for applications such as photo editing and light direction estimation, has undergone significant advancements in predicting shadow instances, object instances, and their associations. The extension of this task to videos presents challenges in annotating diverse video data and addressing complexities arising from occlusion and temporary disappearances within associations. In response to these challenges, we introduce ViShadow, a semi-supervised video instance shadow detection framework that leverages both labeled image data and unlabeled video data for training. ViShadow features a two-stage training pipeline: the first stage, utilizing labeled image data, identifies shadow and object instances through contrastive learning for cross-frame pairing. The second stage employs unlabeled videos, incorporating an associated cycle consistency loss to enhance tracking ability. A retrieval mechanism is introduced to manage temporary disappearances, ensuring tracking continuity. The SOBA-VID dataset, comprising unlabeled training videos and labeled testing videos, along with the SOAP-VID metric, is introduced for the quantitative evaluation of VISD solutions. The effectiveness of ViShadow is further demonstrated through various video-level applications such as video inpainting, instance cloning, shadow editing, and text-instructed shadow-object manipulation.\",\"PeriodicalId\":94032,\"journal\":{\"name\":\"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society\",\"volume\":\"33 \",\"pages\":\"5715-5726\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-10-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10704578/\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on image processing : a publication of the IEEE Signal Processing Society","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10704578/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

阴影实例检测对照片编辑和光照方向估计等应用至关重要,在预测阴影实例、物体实例及其关联方面取得了重大进展。将这一任务扩展到视频中,在注释不同的视频数据以及解决遮挡和关联中的暂时消失所带来的复杂性方面提出了挑战。为了应对这些挑战,我们推出了 ViShadow,这是一种半监督式视频实例阴影检测框架,可同时利用已标注图像数据和未标注视频数据进行训练。ViShadow 采用两阶段训练管道:第一阶段利用标记图像数据,通过跨帧配对的对比学习来识别阴影和物体实例。第二阶段利用未标记的视频,结合相关的周期一致性损失来增强跟踪能力。此外,还引入了一种检索机制来管理临时消失,确保跟踪的连续性。SOBA-VID 数据集包括未标记的训练视频和已标记的测试视频以及 SOAP-VID 指标,用于对 VISD 解决方案进行定量评估。通过各种视频级应用,如视频内画、实例克隆、阴影编辑和文本指示阴影对象操作,进一步证明了 ViShadow 的有效性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Video Instance Shadow Detection Under the Sun and Sky
Instance shadow detection, crucial for applications such as photo editing and light direction estimation, has undergone significant advancements in predicting shadow instances, object instances, and their associations. The extension of this task to videos presents challenges in annotating diverse video data and addressing complexities arising from occlusion and temporary disappearances within associations. In response to these challenges, we introduce ViShadow, a semi-supervised video instance shadow detection framework that leverages both labeled image data and unlabeled video data for training. ViShadow features a two-stage training pipeline: the first stage, utilizing labeled image data, identifies shadow and object instances through contrastive learning for cross-frame pairing. The second stage employs unlabeled videos, incorporating an associated cycle consistency loss to enhance tracking ability. A retrieval mechanism is introduced to manage temporary disappearances, ensuring tracking continuity. The SOBA-VID dataset, comprising unlabeled training videos and labeled testing videos, along with the SOAP-VID metric, is introduced for the quantitative evaluation of VISD solutions. The effectiveness of ViShadow is further demonstrated through various video-level applications such as video inpainting, instance cloning, shadow editing, and text-instructed shadow-object manipulation.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信