2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW): Latest Publications

Lifelong Learning of Task-Parameter Relationships for Knowledge Transfer
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) Pub Date: 2023-06-01 DOI: 10.1109/CVPRW59228.2023.00251
S. Srivastava, Mohammad Yaqub, K. Nandakumar
The ability to acquire new skills and knowledge continually is one of the defining qualities of the human brain, and it is critically missing in most modern machine vision systems. In this work, we focus on knowledge transfer in the lifelong learning setting. We propose a lifelong learner that models the similarities between the optimal weight spaces of tasks and exploits them to enable knowledge transfer across tasks in a continual learning setting. To characterize these "task-parameter relationships", we propose a metric called the adaptation rate integral (ARI), which measures the expected rate of adaptation over a finite number of steps for a (task, parameter) pair. These task-parameter relationships are learned by an auxiliary network trained on guided explorations of the parameter space. The learned auxiliary network is then used to heuristically select the best parameter sets on seen tasks, which are consolidated using a hypernetwork. Given a new (unseen) task, knowledge transfer occurs through the selection of the most suitable parameter set from the hypernetwork, which can then be rapidly finetuned. We show that the proposed approach improves knowledge transfer between tasks across standard benchmarks without any increase in overall model capacity, while naturally mitigating catastrophic forgetting.
Citations: 0
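The abstract does not give a formula for ARI, but its description (the expected rate of adaptation over a finite number of steps for a (task, parameter) pair) suggests a finetune-and-integrate score. The following is a minimal sketch under that reading; the optimizer, step count, and normalization are illustrative assumptions, not the paper's definition.

```python
# A minimal sketch of an ARI-style score, assuming ARI is (roughly) the area
# under the normalized loss-improvement curve over k finetuning steps.
import copy
import torch

def adaptation_rate_integral(model, loss_fn, task_batch, k_steps=5, lr=1e-2):
    """Score how quickly `model` adapts to a task (higher = faster).

    `task_batch` is an (inputs, targets) pair drawn from the task.
    All hyperparameters here are illustrative, not from the paper.
    """
    model = copy.deepcopy(model)              # leave the original weights intact
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    x, y = task_batch
    with torch.no_grad():
        losses = [loss_fn(model(x), y).item()]
    for _ in range(k_steps):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
        with torch.no_grad():
            losses.append(loss_fn(model(x), y).item())
    l0 = losses[0] + 1e-12
    gains = [(l0 - l) / l0 for l in losses]   # relative improvement per step
    # discrete (trapezoidal) integral of the adaptation curve
    return sum((gains[i] + gains[i + 1]) / 2 for i in range(k_steps))
```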
Learning unbiased classifiers from biased data with meta-learning
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) Pub Date: 2023-06-01 DOI: 10.1109/CVPRW59228.2023.00005
R. Ragonesi, Pietro Morerio, Vittorio Murino
It is well known that large deep architectures are powerful models when adequately trained, but may exhibit undesirable behavior, producing confidently incorrect predictions even when evaluated on slightly different test examples. Test data characterized by distribution shifts (from the training distribution), outliers, and adversarial samples are among the data types affected by this problem. The situation worsens whenever data are biased, meaning that predictions are mostly based on spurious correlations present in the data. Unfortunately, since such correlations occur in most of the data, the model is prevented from correctly generalizing over the considered classes. In this work, we tackle this problem from a meta-learning perspective. Considering the dataset as composed of unknown biased and unbiased samples, we first identify these two subsets with a pseudo-labeling algorithm, even if only coarsely. Subsequently, we apply a bi-level optimization algorithm in which, in the inner loop, we look for the best parameters guiding the training on the two subsets, while in the outer loop, we train the final model, benefiting from augmented data generated using Mixup. Properly tuning the contributions of biased and unbiased data, together with the regularization introduced by the mixed data, proves to be an effective strategy for learning unbiased models with superior generalization capabilities. Experimental results on synthetically and realistically biased datasets surpass state-of-the-art methods.
Citations: 0
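Two ingredients named in the abstract, Mixup augmentation and a weighted loss over pseudo-labeled biased/unbiased subsets, can be sketched directly. The bi-level optimization itself is not reproduced here; the fixed weights `w_b` and `w_u` below stand in for what the paper's inner loop would tune, and `debias_step` plays the role of one outer-loop update.

```python
# Sketch of Mixup plus a weighted two-subset training step (assumptions noted above).
import torch
import torch.nn.functional as F

def mixup(x, y, num_classes, alpha=0.4):
    """Return a convex combination of a batch with a shuffled copy of itself."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    idx = torch.randperm(x.size(0))
    y1 = F.one_hot(y, num_classes).float()
    return lam * x + (1 - lam) * x[idx], lam * y1 + (1 - lam) * y1[idx]

def debias_step(model, opt, biased, unbiased, num_classes, w_b=0.3, w_u=1.0):
    """One outer-loop-style update on Mixup-augmented biased/unbiased batches."""
    opt.zero_grad()
    loss = 0.0
    for (x, y), w in ((biased, w_b), (unbiased, w_u)):
        xm, ym = mixup(x, y, num_classes)
        logp = F.log_softmax(model(xm), dim=-1)
        loss = loss + w * -(ym * logp).sum(dim=-1).mean()  # soft-label cross-entropy
    loss.backward()
    opt.step()
    return loss.item()
```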
Network Specialization via Feature-level Knowledge Distillation
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) Pub Date: 2023-06-01 DOI: 10.1109/CVPRW59228.2023.00339
Gaowen Liu, Yuzhang Shang, Yuguang Yao, R. Kompella
State-of-the-art model specialization methods are mainly based on fine-tuning a pre-trained machine learning model to fit the specific needs of a particular task or application, or on modifying the model architecture itself. However, these methods are not preferable in industrial applications because of the model's large size and the complexity of the training process. In this paper, the difficulty of network specialization is attributed to overfitting caused by a lack of data, and we propose a novel method for model Specialization by Knowledge Distillation (SKD). The proposed method merges transfer learning and model compression into one stage. Specifically, we distill and transfer knowledge at the feature-map level, circumventing logit-level inconsistency between teacher and student. We empirically investigate and confirm three effects: models can be specialized to customer use cases by knowledge distillation; knowledge distillation effectively regularizes the knowledge transfer to a smaller, task-specific model; and, compared with classical approaches such as training from scratch and fine-tuning, our method achieves comparable or better results with better training efficiency on the CIFAR-100 dataset for image classification. This paper demonstrates the great potential of model specialization by knowledge distillation.
Citations: 0
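Feature-map-level distillation, the mechanism named above, amounts to matching intermediate activations instead of logits. A minimal sketch follows; the optional 1x1 projection and bilinear resize are common ways to reconcile mismatched student/teacher shapes and are assumptions here, not details from the paper.

```python
# Minimal feature-level knowledge-distillation loss (illustrative sketch).
import torch
import torch.nn.functional as F

def feature_kd_loss(student_feat, teacher_feat, proj=None):
    """MSE between (optionally projected) student and teacher feature maps.

    Matching at the feature-map level sidesteps logit-space mismatch
    between the teacher and student heads.
    """
    if proj is not None:                      # e.g. a 1x1 conv matching channels
        student_feat = proj(student_feat)
    if student_feat.shape[-2:] != teacher_feat.shape[-2:]:
        student_feat = F.interpolate(student_feat, size=teacher_feat.shape[-2:],
                                     mode="bilinear", align_corners=False)
    return F.mse_loss(student_feat, teacher_feat.detach())
```

In practice `proj` would typically be `nn.Conv2d(c_student, c_teacher, 1)` trained jointly with the student.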
Wildlife Image Generation from Scene Graphs
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) Pub Date: 2023-06-01 DOI: 10.1109/CVPRW59228.2023.00036
Yoshio Rubio, Marco A. Contreras-Cruz
Image generation from natural language descriptions is an exciting and challenging task in computer vision and natural language processing. In this work, we propose a novel method to generate synthetic images from scene graphs in the context of wildlife scenarios. Given a scene graph, our method uses a graph convolutional network to predict semantic layouts, and a semi-parametric approach based on a cascaded refinement network to synthesize the final image. We test our approach on a subset of the COCO dataset, which we call COCO-Wildlife. Our results outperform the baselines, both quantitatively and qualitatively, and the visual results show the ability of our approach to generate stunning images with natural interaction between the different objects. Our findings show the potential to extend the proposed method to other contexts where scale and realism are fundamental.
Citations: 0
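To make the scene-graph-to-layout stage concrete, here is a toy graph-convolutional layout predictor: objects and relations are embedded, messages flow along (subject, predicate, object) triples, and each object node regresses a normalized bounding box. All dimensions and the single message-passing round are illustrative; the paper's network and its cascaded refinement synthesis stage are more involved.

```python
# Toy scene-graph -> layout predictor (illustrative only).
import torch
import torch.nn as nn

class SceneGraphLayout(nn.Module):
    def __init__(self, n_objs, n_rels, d=64):
        super().__init__()
        self.obj_emb = nn.Embedding(n_objs, d)
        self.rel_emb = nn.Embedding(n_rels, d)
        self.msg = nn.Sequential(nn.Linear(3 * d, d), nn.ReLU(), nn.Linear(d, d))
        self.box_head = nn.Linear(d, 4)       # (x, y, w, h), normalized to [0, 1]

    def forward(self, objs, triples):
        # objs: (N,) object ids; triples: (T, 3) of (subj_idx, rel_id, obj_idx)
        h = self.obj_emb(objs)
        s, r, o = triples[:, 0], triples[:, 1], triples[:, 2]
        m = self.msg(torch.cat([h[s], self.rel_emb(r), h[o]], dim=-1))
        # one message-passing round: add each triple's message to its endpoints
        h = h.index_add(0, s, m).index_add(0, o, m)
        return torch.sigmoid(self.box_head(h))
```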
Spatial-Temporal Graph-Based AU Relationship Learning for Facial Action Unit Detection
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) Pub Date: 2023-06-01 DOI: 10.1109/CVPRW59228.2023.00627
Zihan Wang, Siyang Song, Cheng Luo, Yuzhi Zhou, Shiling Wu, Weicheng Xie, Linlin Shen
This paper presents our Facial Action Unit (AU) detection submission to the fifth Affective Behavior Analysis in-the-wild (ABAW) Competition. Our approach consists of three main modules: (i) a pre-trained facial representation encoder that produces a strong facial representation from each face image in the input sequence; (ii) an AU-specific feature generator that learns a set of AU features from each facial representation; and (iii) a spatio-temporal graph learning module that constructs a spatio-temporal graph representation. This graph representation describes the AUs contained in all frames and predicts the occurrence of each AU based on both the modeled spatial information within the corresponding face and the learned temporal dynamics among frames. The experimental results show that our approach outperforms the baseline, and the spatio-temporal graph representation learning allows our model to generate the best results among all ablated systems. Our model ranked 4th in the AU recognition track of the 5th ABAW Competition. Our code is publicly available at https://github.com/wzh125/ABAW-5.
Citations: 0
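A toy rendering of the spatio-temporal graph idea: within each frame, AU node features exchange messages through a similarity-based adjacency (spatial step), and a temporal convolution then links the frames (temporal step). Everything below, from the adjacency construction to the layer sizes, is an illustrative assumption rather than the submission's actual architecture.

```python
# Illustrative spatio-temporal AU graph head (not the paper's architecture).
import torch
import torch.nn as nn

class SpatioTemporalAUGraph(nn.Module):
    def __init__(self, d=128):
        super().__init__()
        self.gcn = nn.Linear(d, d)
        self.temporal = nn.Conv1d(d, d, kernel_size=3, padding=1)
        self.cls = nn.Linear(d, 1)

    def forward(self, feats):                 # feats: (B, T, N_aus, D)
        # similarity-based AU adjacency within each frame
        a = torch.softmax(feats @ feats.transpose(-1, -2), dim=-1)   # (B, T, N, N)
        h = torch.relu(self.gcn(a @ feats))   # spatial message passing
        b, t, n, d = h.shape
        h = h.permute(0, 2, 3, 1).reshape(b * n, d, t)
        h = self.temporal(h).reshape(b, n, d, t).permute(0, 3, 1, 2)
        return torch.sigmoid(self.cls(h)).squeeze(-1)   # (B, T, N) AU probabilities
```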
Attention Retractable Frequency Fusion Transformer for Image Super Resolution
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) Pub Date: 2023-06-01 DOI: 10.1109/CVPRW59228.2023.00176
Qiangbo Zhu, Pengfei Li, Q. Li
Transformer-based image super-resolution (SR) has offered promising performance gains over convolutional neural network-based approaches due to the adoption of parameter-independent global interactions. However, existing Transformer-based methods struggle to capture enough global information because self-attention is computed within non-overlapping windows, which restricts the receptive field. To address this issue, we construct an effective image SR model based on an attention-retractable frequency Transformer with the proposed spatial-frequency fusion block. In our method, the spatial-frequency fusion block is designed to strengthen the representation ability of the Transformer and extend the receptive field to the whole image, improving the quality of SR results. Furthermore, a progressive training strategy is proposed that uses image patches of different sizes to train our SR model, further improving SR performance. The experimental results demonstrate that our proposed method outperforms state-of-the-art methods over various benchmark datasets, both objectively and subjectively.
Citations: 2
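The stated motivation is that a frequency-domain branch gives the block an image-wide receptive field, since every Fourier coefficient depends on all pixels. The sketch below pairs a local 3x3 convolution with a pointwise convolution over stacked real/imaginary FFT components; this branch design is an assumption for illustration, not the paper's exact block.

```python
# Illustrative spatial-frequency fusion block (assumed design, see note above).
import torch
import torch.nn as nn

class SpatialFrequencyFusion(nn.Module):
    def __init__(self, c):
        super().__init__()
        self.spatial = nn.Conv2d(c, c, 3, padding=1)          # local branch
        self.freq = nn.Conv2d(2 * c, 2 * c, 1)                # global branch

    def forward(self, x):
        b, c, h, w = x.shape
        s = self.spatial(x)
        f = torch.fft.rfft2(x, norm="ortho")                  # (B, C, H, W//2+1)
        f = torch.cat([f.real, f.imag], dim=1)                # stack re/im parts
        f = self.freq(f)
        f = torch.complex(f[:, :c], f[:, c:])
        f = torch.fft.irfft2(f, s=(h, w), norm="ortho")       # back to pixels
        return x + s + f                                      # residual fusion
```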
Multi-View Body Image-Based Prediction of Body Mass Index and Various Body Part Sizes
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) Pub Date: 2023-06-01 DOI: 10.1109/CVPRW59228.2023.00642
Seunghyun Kim, Kunyoung Lee, E. Lee
This paper proposes a novel model for predicting body mass index (BMI) and various body part sizes using front, side, and back body images. The model is trained on a large dataset of labeled images. The results show that the model can accurately predict BMI and various body part sizes such as chest, waist, hip, thigh, forearm, and shoulder width. One significant advantage of the proposed model is that it can use multiple views of the body to achieve more accurate predictions, overcoming the limitations of models that use only a single image. The model also does not require complex pre-processing or feature extraction, making it straightforward to apply in practice. We also explore the impact of environmental factors, such as clothing and posture, on the model's performance. The findings show that the model is relatively insensitive to posture but more sensitive to clothing, emphasizing the importance of controlling for clothing when using this model. Overall, the proposed model represents a step forward in predicting BMI and various body part sizes from images. Its accuracy, convenience, and ability to use multiple views of the body make it a promising tool for a wide range of applications. The proposed method is expected to serve as a component for accurate sensing of various vision-based non-contact biomarkers, in addition to BMI inference.
Citations: 0
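A minimal sketch of the multi-view setup: one shared encoder processes the front, side, and back images, the pooled features are concatenated, and a regression head outputs BMI plus the six listed body-part sizes. The shared-backbone choice, layer sizes, and output layout are illustrative assumptions.

```python
# Toy multi-view body regressor (illustrative sketch, see note above).
import torch
import torch.nn as nn

class MultiViewBodyNet(nn.Module):
    def __init__(self, n_outputs=7):          # BMI + 6 body-part sizes
        super().__init__()
        self.encoder = nn.Sequential(          # shared across the three views
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Sequential(nn.Linear(3 * 64, 128), nn.ReLU(),
                                  nn.Linear(128, n_outputs))

    def forward(self, front, side, back):
        z = torch.cat([self.encoder(v) for v in (front, side, back)], dim=1)
        return self.head(z)
```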
Asymmetric Color Transfer with Consistent Modality Learning
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) Pub Date: 2023-06-01 DOI: 10.1109/CVPRW59228.2023.00282
Kai Zheng, Jie Huang, Man Zhou, Fengmei Zhao
The mono-color dual-lens system widely found in smartphones captures asymmetric stereo image pairs consisting of a high-resolution (HR) monochrome image and a low-resolution (LR) color image. Asymmetric color transfer aims to reconstruct an HR color image by transferring the color information of the LR color image to the HR monochrome image. However, the inconsistency in spectral and spatial resolution between the stereo pair makes it challenging to establish the reliable stereo correspondence needed for precise color transfer, and previous works have not adequately addressed this issue. In this paper, we propose a dual-modality consistency learning framework that assists the establishment of reliable stereo correspondence. Exploiting the complementarity of color and frequency information between the stereo images, a dual-branch Stereo Information Complementary Module (SICM) is devised to perform consistent modality learning in the feature domain. Specifically, we design a stereo frequency and color modulation mechanism within the SICM to capture the information complementarity between dual-modal features. Furthermore, parallax attention distillation is proposed to drive consistent modality learning for better stereo matching. Extensive experiments demonstrate that our model outperforms state-of-the-art methods on the Flickr1024 dataset and generalizes better to the KITTI dataset and real-world scenarios. The code is available at https://github.com/keviner1/SICNet.
Citations: 1
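The stereo-correspondence difficulty can be illustrated with row-wise parallax attention: in a rectified stereo pair, matches lie on the same image row, so each monochrome-view position attends across the width of the color view to gather color features. This is the generic mechanism only; the paper's SICM and parallax attention distillation are not reproduced here.

```python
# Generic row-wise parallax attention between two rectified views.
import torch
import torch.nn as nn

class ParallaxAttention(nn.Module):
    def __init__(self, c):
        super().__init__()
        self.q = nn.Conv2d(c, c, 1)
        self.k = nn.Conv2d(c, c, 1)
        self.v = nn.Conv2d(c, c, 1)

    def forward(self, mono_feat, color_feat):          # both: (B, C, H, W)
        q = self.q(mono_feat).permute(0, 2, 3, 1)      # (B, H, W, C)
        k = self.k(color_feat).permute(0, 2, 1, 3)     # (B, H, C, W)
        v = self.v(color_feat).permute(0, 2, 3, 1)     # (B, H, W, C)
        # attend along the epipolar line (same row of the other view)
        attn = torch.softmax(q @ k / q.shape[-1] ** 0.5, dim=-1)   # (B, H, W, W)
        return (attn @ v).permute(0, 3, 1, 2)          # gathered color features
```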
Implications of Solution Patterns on Adversarial Robustness
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) Pub Date: 2023-06-01 DOI: 10.1109/CVPRW59228.2023.00237
Hengyue Liang, Buyun Liang, Ju Sun, Ying Cui, Tim Mitchell
Empirical robustness evaluation (RE) of deep learning models against adversarial perturbations involves solving non-trivial constrained optimization problems. Recent work has shown that these RE problems can be reliably solved by a general-purpose constrained-optimization solver, PyGRANSO with Constraint-Folding (PWCF). In this paper, we take advantage of PWCF and other existing numerical RE algorithms to explore the distinct solution patterns that arise when solving RE problems with various combinations of losses, perturbation models, and optimization algorithms. We then provide extensive discussions of the implications of these patterns for current robustness evaluation and adversarial training. A comprehensive version of this work can be found in [19].
Citations: 1
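For context, one standard instance of an RE problem is maximizing the loss within an ℓ∞ ball around the input, commonly solved with projected gradient descent (PGD). The sketch below is that generic baseline; it is not PyGRANSO/PWCF, whose interface the abstract does not describe.

```python
# Standard l_inf PGD attack: max_{||delta||_inf <= eps} loss(model(x + delta), y).
import torch

def pgd_linf(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = torch.nn.functional.cross_entropy(model(x + delta), y)
        (grad,) = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta += alpha * grad.sign()       # ascent step on the loss
            delta.clamp_(-eps, eps)            # project back onto the l_inf ball
            delta.copy_((x + delta).clamp(0, 1) - x)   # keep pixels valid
    return (x + delta).detach()
```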
Robust Monocular 3D Human Motion with Lasso-Based Differential Kinematics
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) Pub Date: 2023-06-01 DOI: 10.1109/CVPRW59228.2023.00702
Abed C. Malti
This work introduces a method to robustly reconstruct 3D human motion from the motion of 2D skeletal landmarks. We propose a lasso (least absolute shrinkage and selection operator) optimization framework in which the ℓ1-norm is computed over the vector of differential angular kinematics and the ℓ2-norm is computed over the differential 2D reprojection error. The ℓ1-norm term allows us to model sparse kinematic angular motion. Minimizing the reprojection error allows us to assume bounded noise in both the kinematic model and the 2D landmark detection; this bound is controlled by a scale factor associated with the ℓ2-norm data term. An a posteriori verification condition is provided to check whether the lasso formulation has recovered the ground-truth 3D human motion. Results on publicly available data demonstrate the effectiveness of the proposed approach compared with state-of-the-art methods, showing that the sparsity and bounded-noise assumptions encoded in the lasso formulation are robust priors for safely recovering 3D human motion.
Citations: 0
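The objective described above combines an ℓ2 data term on the differential 2D reprojection error with an ℓ1 penalty on the differential joint angles. A toy solve of that form is sketched below; `project` (forward kinematics plus camera projection) is assumed to be given and differentiable, and the plain Adam loop stands in for whatever solver the paper actually uses.

```python
# Toy lasso-style solve for one frame-to-frame motion update:
#   min_{dtheta} ||project(theta0 + dtheta) - project(theta0) - p2d_diff||_2^2
#                + lam * ||dtheta||_1
import torch

def solve_motion(theta0, project, p2d_diff, lam=0.1, steps=500, lr=1e-2):
    dtheta = torch.zeros_like(theta0, requires_grad=True)
    opt = torch.optim.Adam([dtheta], lr=lr)
    base = project(theta0).detach()            # reference 2D landmarks
    for _ in range(steps):
        opt.zero_grad()
        resid = project(theta0 + dtheta) - base - p2d_diff
        loss = (resid ** 2).sum() + lam * dtheta.abs().sum()  # l2 data + l1 prior
        loss.backward()
        opt.step()
    return dtheta.detach()                     # sparse joint-angle update
```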