{"title":"Multi-modal Emotion Reaction Intensity Estimation with Temporal Augmentation","authors":"Feng Qiu, Bowen Ma, Wei Zhang, Yu-qiong Ding","doi":"10.1109/CVPRW59228.2023.00613","DOIUrl":"https://doi.org/10.1109/CVPRW59228.2023.00613","url":null,"abstract":"Emotion reaction intensity (ERI) estimation aims to estimate the emotion intensities of subjects reacting to various video-based stimuli. It plays an important role in human affective behavior analysis. In this paper, we proposed a effective solution for addressing the task of ERI estimation in the fifth Affective Behavior Analysis in the wild (ABAW) competition. Based on multi-modal information, We first extract uni-modal features from images, speeches and texts, respectively and then regress the intensities of 7 emotions. To enhance the model generalization and capture context information, we employ the Temporal Augmentation module to adapt to various video samples and the Temporal SE Block to reweight temporal features adaptively. The extensive experiments conducted on large-scale dataset, Hume-Reaction, demonstrate the effectiveness of our approach. Our method achieves average pearson’s correlations coefficient of 0.4160 on the validation set and obtain third place in the ERI Estimation Challenge of ABAW 2023.","PeriodicalId":355438,"journal":{"name":"2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"118 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131279525","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CAVLI - Using image associations to produce local concept-based explanations","authors":"Pushkar Shukla, Sushil Bharati, Matthew A. Turk","doi":"10.1109/CVPRW59228.2023.00387","DOIUrl":"https://doi.org/10.1109/CVPRW59228.2023.00387","url":null,"abstract":"While explainability is becoming increasingly crucial in computer vision and machine learning, producing explanations that can link decisions made by deep neural networks to concepts that are easily understood by humans still remains a challenge. To address this challenge, we propose a framework that produces local concept-based explanations for the classification decisions made by a deep neural network. Our framework is based on the intuition that if there is a high overlap between the regions of the image that are associated with a human-defined concept and regions of the image that are useful for decision-making, then the decision is highly dependent on the concept. Our proposed CAVLI framework combines a global approach (TCAV) with a local approach (LIME). To test the effectiveness of the approach, we conducted experiments on both the ImageNet and CelebA datasets. These experiments validate the ability of our framework to quantify the dependence of individual decisions on predefined concepts. By providing local concept-based explanations, our framework has the potential to improve the transparency and interpretability of deep neural networks in a variety of applications.","PeriodicalId":355438,"journal":{"name":"2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"20 13","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131505717","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Three-Stage Framework with Reliable Sample Pool for Long-Tailed Classification","authors":"Feng Cai, Keyu Wu, Haipeng Wang, Feng Wang","doi":"10.1109/CVPRW59228.2023.00054","DOIUrl":"https://doi.org/10.1109/CVPRW59228.2023.00054","url":null,"abstract":"Synthetic Aperture Radar (SAR) imagery presents a promising solution for acquiring Earth surface information regardless of weather and daylight. However, the SAR dataset is commonly characterized by a long-tailed distribution due to the scarcity of samples from infrequent categories. In this work, we extend the problem to aerial view object classification in the SAR dataset with long-tailed distribution and a plethora of negative samples. Specifically, we propose a three-stage approach that employs a ResNet101 backbone for feature extraction, Class-balanced Focal Loss for class-level re-weighting, and reliable pseudo-labels generated through semi-supervised learning to improve model performance. Moreover, we introduce a Reliable Sample Pool (RSP) to enhance the model's confidence in predicting in-distribution data and mitigate the domain gap between the labeled and unlabeled sets. The proposed framework achieved a Top-1 Accuracy of 63.20% and an AUROC of 0.71 on the final dataset, winning the first place in track 1 of the PBVS 2023 Multi-modal Aerial View Object Classification Challenge.","PeriodicalId":355438,"journal":{"name":"2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131555834","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ultra-Sonic Sensor based Object Detection for Autonomous Vehicles","authors":"T. Nesti, Santhosh Boddana, Burhaneddin Yaman","doi":"10.1109/CVPRW59228.2023.00026","DOIUrl":"https://doi.org/10.1109/CVPRW59228.2023.00026","url":null,"abstract":"Perception systems in autonomous vehicles (AV) have made significant advancements in recent years. Such systems leverage different sensing modalities such as cameras, LiDARs and Radars, and are powered by state-of-the-art deep learning algorithms. Ultrasonic sensors (USS) are a low-cost, durable and robust sensing technology that is particularly suitable for near-range detection in harsh weather conditions, but have received very limited attention in the perception literature. In this work, we present a novel USS-based object detection system that can enable accurate detection of objects in low-speed scenarios. The proposed pipeline involves four steps. First, the input USS data is transformed into a novel voxelized 3D point cloud leveraging the physics of USS. Next, multi-channels Bird Eye’s View (BEV) images are generated via projection operators. Later, the resolution of BEV images is enhanced by means of a rolling-window, vehicle movement-aware temporal aggregation process. Finally, the image-like data representation is used to train a deep neural network to detect and localize objects in the 2D plane. We present extensive experiments showing that the proposed framework achieves satisfactory performance across both classic and custom object detection metrics, thus bridging the usecase and literature visibility gap between USS and more established sensors.","PeriodicalId":355438,"journal":{"name":"2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115169688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving language-supervised object detection with linguistic structure analysis","authors":"Arushi Rai, Adriana Kovashka","doi":"10.1109/CVPRW59228.2023.00588","DOIUrl":"https://doi.org/10.1109/CVPRW59228.2023.00588","url":null,"abstract":"Language-supervised object detection typically uses descriptive captions from human-annotated datasets. However, in-the-wild captions take on wider styles of language. We analyze one particular ubiquitous form of language: narrative. We study the differences in linguistic structure and visual-text alignment in narrative and descriptive captions and find we can classify descriptive and narrative style captions using linguistic features such as part of speech, rhetoric structure theory, and multimodal discourse. Then, we use this to select captions from which to extract image-level labels as supervision for weakly supervised object detection. We also improve the quality of extracted labels by filtering based on proximity to verb types for both descriptive and narrative captions.","PeriodicalId":355438,"journal":{"name":"2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121266402","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scene Graph Driven Text-Prompt Generation for Image Inpainting","authors":"Tripti Shukla, Paridhi Maheshwari, Rajhans Singh, Ankit Shukla, K. Kulkarni, P. Turaga","doi":"10.1109/CVPRW59228.2023.00083","DOIUrl":"https://doi.org/10.1109/CVPRW59228.2023.00083","url":null,"abstract":"Scene editing methods are undergoing a revolution, driven by text-to-image synthesis methods. Applications in media content generation have benefited from a careful set of engineered text prompts, that have been arrived at by the artists by trial and error. There is a growing need to better model prompt generation, for it to be useful for a broad range of consumer-grade applications. We propose a novel method for text prompt generation for the explicit purpose of consumer-grade image inpainting, i.e. insertion of new objects into missing regions in an image. Our approach leverages existing inter-object relationships to generate plausible textual descriptions for the missing object, that can then be used with any text-to-image generator. Given an image and a location where a new object is to be inserted, our approach first converts the given image to an intermediate scene graph. Then, we use graph convolutional networks to ‘expand’ the scene graph by predicting the identity and relationships of the new object to be inserted, with respect to the existing objects in the scene. The output of the expanded scene graph is cast into a textual description, which is then processed by a text-to-image generator, conditioned on the given image, to produce the final inpainted image. We conduct extensive experiments on the Visual Genome dataset, and show through qualitative and quantitative metrics that our method is superior to other methods.","PeriodicalId":355438,"journal":{"name":"2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"2014 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128105477","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prioritised Moderation for Online Advertising","authors":"Phanideep Gampa, Akash Anil Valsangkar, Shailesh Choubey, Pooja A","doi":"10.1109/CVPRW59228.2023.00194","DOIUrl":"https://doi.org/10.1109/CVPRW59228.2023.00194","url":null,"abstract":"Online advertisement industry aims to build a preference for a product over its competitors by making consumers aware of the product at internet scale. However, the ads that violate the applicable laws and location specific regulations can have serious business impact with legal implications. At the same time, customers are at risk of getting exposed to egregious ads resulting in a bad user experience. Due to the limited and costly human bandwidth, moderating ads at the industry scale is a challenging task. Typically at Amazon Advertising, we deal with ad moderation workflows where the ad distributions are skewed by non defective ads. It is desirable to increase the review time that the human moderators spend on moderating genuine defective ads. Hence prioritisation of deemed defective ads for human moderation is crucial for the effective utilisation of human bandwidth in the ad moderation workflow. To incorporate the business knowledge and to better deal with the possible overlaps between the policies, we formulate this as a policy gradient ranking algorithm with custom scalar rewards. Our extensive experiments demonstrate that these techniques show a substantial gain in number of defective ads caught against various tabular classification algorithms, resulting in effective utilisation of human moderation bandwidth.","PeriodicalId":355438,"journal":{"name":"2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128145271","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust Partial Fingerprint Recognition","authors":"Yufei Zhang, Rui Zhao, Ziyi Zhao, Naveen Ramakrishnan, Manoj Aggarwal, G. Medioni, Q. Ji","doi":"10.1109/CVPRW59228.2023.00108","DOIUrl":"https://doi.org/10.1109/CVPRW59228.2023.00108","url":null,"abstract":"Low quality capture and obstruction on fingers often result in partially visible fingerprint images, which imposes challenge for fingerprint recognition. In this work, motivated from the practical use cases, we first systematically studied different types of partial occlusion. Specifically, two major types of partial occlusion, including six granular types, and the corresponding methods to simulate each type for model evaluation and improvement were introduced. Second, we proposed a novel Robust Partial Fingerprint (RPF) recognition framework to mitigate the performance degradation due to occlusion. RPF effectively encodes the knowledge about partial fingerprints through occlusion-enhanced data augmentation, and explicitly captures the missing regions for robust feature extraction through occlusion-aware modeling. Finally, we demonstrated the effectiveness of RPF through extensive experiments. Particularly, baseline fingerprint recognition models can degrade the recognition accuracy measured in FRR @ FAR=0.1% from 14.67% to 17.57% at 10% occlusion ratio on the challenging NIST dataset, while RPF instead improves the recognition performance to 9.99% under the same occlusion ratio. Meanwhile, we presented a set of empirical analysis through visual explanation, matching score analysis, and uncertainty modeling, providing insights into the recognition model’s behavior and potential directions of enhancement.","PeriodicalId":355438,"journal":{"name":"2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132630537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Unified Approach to Facial Affect Analysis: the MAE-Face Visual Representation","authors":"Bowen Ma, Wei Zhang, Feng Qiu, Yu-qiong Ding","doi":"10.1109/CVPRW59228.2023.00630","DOIUrl":"https://doi.org/10.1109/CVPRW59228.2023.00630","url":null,"abstract":"Facial affect analysis is essential for understanding human expressions and behaviors, encompassing action unit (AU) detection, expression (EXPR) recognition, and valence-arousal (VA) estimation. The CVPR 2023 Competition on Affective Behavior Analysis in-the-wild (ABAW) is dedicated to providing a high-quality and large-scale Affwild2 dataset for identifying widely used emotion representations. In this paper, we employ MAE-Face as a unified approach to develop robust visual representations for facial affect analysis. We propose multiple techniques to improve its fine-tuning performance on various downstream tasks, incorporating a two-pass pre-training process and a two-pass fine-tuning process. Our approach exhibits strong results on numerous datasets, highlighting its versatility. Moreover, the proposed model acts as a fundamental component for our final framework in the ABAW5 competition. Our submission achieves outstanding outcomes, ranking first place in the AU and EXPR tracks and second place in the VA track.","PeriodicalId":355438,"journal":{"name":"2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"272 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134039155","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhanced Thermal-RGB Fusion for Robust Object Detection","authors":"Wassim A. El Ahmar, Yahya Massoud, Dhanvin Kolhatkar, Hamzah Alghamdi, Mohammad Al Ja'afreh, R. Laganière, R. Hammoud","doi":"10.1109/CVPRW59228.2023.00042","DOIUrl":"https://doi.org/10.1109/CVPRW59228.2023.00042","url":null,"abstract":"Thermal imaging has seen rapid development in the last few years due to its robustness in different weather and lighting conditions and its reduced production cost. In this paper, we study the performance of different RGB-Thermal fusion methods in the task of object detection, and introduce a new RGB-Thermal fusion approach that enhances the performance by up to 9% using a sigmoid-activated gating mechanism for early fusion. We conduct our experiments on an enhanced version of the City Scene RGB-Thermal MOT Dataset where we register the RGB and corresponding thermal images in order to conduct fusion experiments. Finally, we benchmark the speed of our proposed fusion method and show that it adds negligible overhead to the model processing time. Our work would be useful for autonomous systems and any multi-model machine vision system. The improved version of the dataset, our trained models, and source code are available at https://github.com/wassimea/rgb-thermalfusion.","PeriodicalId":355438,"journal":{"name":"2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134543726","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}