{"title":"基于时间扩张的多层次关注对抗学习在无监督视频域自适应中的应用","authors":"Peipeng Chen, Yuan Gao, A. J. Ma","doi":"10.1109/WACV51458.2022.00085","DOIUrl":null,"url":null,"abstract":"Most existing works on unsupervised video domain adaptation attempt to mitigate the distribution gap across domains in frame and video levels. Such two-level distribution alignment approach may suffer from the problems of insufficient alignment for complex video data and misalignment along the temporal dimension. To address these issues, we develop a novel framework of Multi-level Attentive Adversarial Learning with Temporal Dilation (MA2L- TD). Given frame-level features as input, multi-level temporal features are generated and multiple domain discriminators are individually trained by adversarial learning for them. For better distribution alignment, level-wise attention weights are calculated by the degree of domain confusion in each level. To mitigate the negative effect of misalignment, features are aggregated with the attention mechanism determined by individual domain discriminators. Moreover, temporal dilation is designed for sequential non-repeatability to balance the computational efficiency and the possible number of levels. Extensive experimental results show that our proposed method outperforms the state of the art on four benchmark datasets.1","PeriodicalId":297092,"journal":{"name":"2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":"{\"title\":\"Multi-level Attentive Adversarial Learning with Temporal Dilation for Unsupervised Video Domain Adaptation\",\"authors\":\"Peipeng Chen, Yuan Gao, A. J. 
Ma\",\"doi\":\"10.1109/WACV51458.2022.00085\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Most existing works on unsupervised video domain adaptation attempt to mitigate the distribution gap across domains in frame and video levels. Such two-level distribution alignment approach may suffer from the problems of insufficient alignment for complex video data and misalignment along the temporal dimension. To address these issues, we develop a novel framework of Multi-level Attentive Adversarial Learning with Temporal Dilation (MA2L- TD). Given frame-level features as input, multi-level temporal features are generated and multiple domain discriminators are individually trained by adversarial learning for them. For better distribution alignment, level-wise attention weights are calculated by the degree of domain confusion in each level. To mitigate the negative effect of misalignment, features are aggregated with the attention mechanism determined by individual domain discriminators. Moreover, temporal dilation is designed for sequential non-repeatability to balance the computational efficiency and the possible number of levels. 
Extensive experimental results show that our proposed method outperforms the state of the art on four benchmark datasets.1\",\"PeriodicalId\":297092,\"journal\":{\"name\":\"2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)\",\"volume\":\"39 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"9\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/WACV51458.2022.00085\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/WACV51458.2022.00085","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Multi-level Attentive Adversarial Learning with Temporal Dilation for Unsupervised Video Domain Adaptation
Most existing work on unsupervised video domain adaptation attempts to mitigate the distribution gap between domains at the frame and video levels. Such a two-level alignment approach may suffer from insufficient alignment for complex video data and from misalignment along the temporal dimension. To address these issues, we develop a novel framework, Multi-level Attentive Adversarial Learning with Temporal Dilation (MA2L-TD). Given frame-level features as input, multi-level temporal features are generated, and a separate domain discriminator is adversarially trained for each level. For better distribution alignment, level-wise attention weights are computed from the degree of domain confusion at each level. To mitigate the negative effect of misalignment, features are aggregated through this attention mechanism, with weights determined by the individual domain discriminators. Moreover, temporal dilation is designed for sequential non-repeatability, balancing computational efficiency against the possible number of levels. Extensive experiments show that our proposed method outperforms the state of the art on four benchmark datasets.
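The abstract's attention mechanism can be read as: score each temporal level by how confused its domain discriminator is, then aggregate level features with those scores. The sketch below is a minimal, hypothetical illustration of that reading, not the paper's implementation; the use of binary entropy as the confusion measure and softmax normalization are assumptions, and the function names are invented for illustration.

```python
import numpy as np

def domain_confusion(p, eps=1e-8):
    # Binary entropy of a discriminator's predicted probability that a
    # sample is from the source domain; maximal (ln 2) at p = 0.5,
    # i.e. when the discriminator is fully confused (assumed measure).
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))

def level_attention(domain_probs):
    # Softmax over per-level confusion scores -> attention weights.
    # Levels whose discriminator is more confused (i.e. better aligned)
    # receive higher weight, which plausibly matches "attention weights
    # are computed from the degree of domain confusion at each level".
    conf = domain_confusion(domain_probs)
    e = np.exp(conf - conf.max())  # shift for numerical stability
    return e / e.sum()

def aggregate(level_feats, domain_probs):
    # Attention-weighted sum of per-level features (levels x dim).
    w = level_attention(domain_probs)
    return (w[:, None] * np.asarray(level_feats, dtype=float)).sum(axis=0)
```

For example, with three levels whose discriminators output probabilities 0.5, 0.9, and 0.99, the first (most confused, hence best-aligned) level dominates the aggregation.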