Title: Adaptive and Robust Fourier-Mellin-Based Image Watermarking for Social Networking Platforms
Authors: Jinghong Xia, Hongxia Wang, S. Abdullahi, Heng Wang, Fei Zhang, Bingling Luo
Venue: 2023 IEEE International Conference on Multimedia and Expo (ICME). DOI: 10.1109/ICME55011.2023.00483
Abstract: According to the buckets effect, the capacity of a bucket is determined by its shortest board. This principle also applies to social-networking-platform-resilient (SNPR) image watermarking, which should be comprehensive and free from significant shortcomings. In the frequency domain, the watermarked region is formed using log-polar coordinate mapping (LPM) and has a ring-like structure. However, this structure cannot be stretched or compressed, and it causes a streaking effect at the edges of the watermarked image. The proposed method addresses these issues: an adaptive optimization framework adjusts the embedding strength and range of the watermark, and multiple synchronization strategies correct flipping and aspect-ratio changes. Compared with state-of-the-art works, the proposed method significantly improves the imperceptibility of the watermarked image and its robustness to various distortions and to lossy transmission on social networking platforms (SNPs).
{"title":"Multi-Scale Query-Adaptive Convolution for Generalizable Person Re-Identification","authors":"Kaixiang Chen, T. Gong, Liyan Zhang","doi":"10.1109/ICME55011.2023.00411","DOIUrl":"https://doi.org/10.1109/ICME55011.2023.00411","url":null,"abstract":"Domain Generalization in person re-identification (ReID) aims to learn a generalizable model from a single or multi-source domain that can be directly deployed to an unseen domain without fine-tuning. In this paper, we investigate the problem of single-source domain generalization in ReID. Recent research has gained remarkable progress by treating image matching as a search for local correspondences in feature maps. However, to ensure efficient matching, they usually adopt a pixel-wise matching approach, which is prone to be deviated by the identity-irrelevant patch features in the image, such as background patches. To address this problem, we propose the Multi-scale Query-Adaptive Convolution (QAConv-MS) framework. Specifically, we adopt a group of template kernels with different scales to extract local features of different receptive fields from the original feature maps and accordingly perform the local matching process. We also introduce a self-attention branch to extract global features from the feature map as complementary information for local features. Our approach achieves state-of-the-art performances on four large-scale datasets.","PeriodicalId":321830,"journal":{"name":"2023 IEEE International Conference on Multimedia and Expo (ICME)","volume":"243 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116159845","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Hierarchical Attention Learning for Multimodal Classification
Authors: Xin Zou, Chang Tang, Wei Zhang, Kun Sun, Liangxiao Jiang
Venue: 2023 IEEE International Conference on Multimedia and Expo (ICME). DOI: 10.1109/ICME55011.2023.00165
Abstract: Multimodal learning aims to integrate complementary information from different modalities for more reliable decisions. However, existing multimodal classification methods simply integrate the learned local features, ignoring the underlying structure of each modality and the higher-order correlations across modalities. In this paper, we propose a novel Hierarchical Attention Learning Network (HALNet) for multimodal classification. HALNet has three merits: 1) a hierarchical feature fusion module learns multi-level features, aggregating them into a global feature representation with an attention mechanism and progressive fusion tactics; 2) a cross-modal higher-order fusion module captures cross-modal correlations in the label space; 3) a dual prediction pattern generates credible decisions. Extensive experiments on three real-world multimodal datasets demonstrate that HALNet achieves competitive performance compared to the state of the art.
{"title":"End-To-End Part-Level Action Parsing With Transformer","authors":"Xiaojia Chen, Xuanhan Wang, Beitao Chen, Lianli Gao","doi":"10.1109/ICME55011.2023.00135","DOIUrl":"https://doi.org/10.1109/ICME55011.2023.00135","url":null,"abstract":"The divide-and-conquer strategy, which interprets part-level action parsing as a detect-then-parsing pipeline, has been widely used and become a general tool for part-level action understanding. However, existing methods that derive from the strategy usually suffer from either strong dependence on prior detection or high computational complexity. In this paper, we present the first fully end-to-end part-level action parsing framework with transformers, termed PATR. Unlike existing methods, our method regards part-level action parsing as a hierarchical set prediction problem and unifies person detection, body part detection, and action state recognition into one model. In PATR, predefined learnable representations, including general instance representations and general part representations, are guided to adaptively attend to the image features that are relevant to target body parts. Then, conditioning on corresponding learnable representations, attended image features are hierarchically decoded into corresponding semantics (i.e., person location, body part location, and action states for each body part). In this way, PATR relies on characteristics of body parts, instead of prior predictions like bounding boxes, to parse action states, thus removing the strong dependence between sub-tasks and eliminating the computational burdens caused by the multi-stage paradigm. Extensive experiments conducted on challenging Kinetic-TPS indicate that our method achieves very competitive results. In particular, our model outperforms all state-of-the-art part-level action parsing approaches by a margin, reaching around 3.8±2.0% Accp higher than previous methods. These findings indicate the potential of PATR to serve as a new baseline for part-level action parsing methods in the future. Our code and models are publicly available. 1","PeriodicalId":321830,"journal":{"name":"2023 IEEE International Conference on Multimedia and Expo (ICME)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125321253","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: EvenFace: Deep Face Recognition with Uniform Distribution of Identities
Authors: Pengfei Hu, Y. Tao, Qiqi Bao, Guijin Wang, Wenming Yang
Venue: 2023 IEEE International Conference on Multimedia and Expo (ICME). DOI: 10.1109/ICME55011.2023.00298
Abstract: The development of loss functions over the past few years has brought great success to face recognition. Most algorithms focus on improving the intra-class compactness of face features but ignore inter-class separability. In this paper, we propose a method named EvenFace, which introduces a variance regularization term and a mean term for inter-class separability to further promote an even distribution of class centers on the hypersphere, thereby increasing the inter-class distance. To evaluate inter-class separability, a new index is proposed that better reflects the distribution of class centers and guides the classification. By penalizing the angle between each identity and its surrounding neighbors, the resulting uniform distribution of identities enables full exploitation of the feature space, leading to discriminative face representations. The proposed loss function can effectively boost the performance of softmax-loss variants. Quantitative comparisons with other state-of-the-art methods on several benchmarks demonstrate the superiority of EvenFace.
Title: Adaptive-Masking Policy with Deep Reinforcement Learning for Self-Supervised Medical Image Segmentation
Authors: Gang Xu, Shengxin Wang, Thomas Lukasiewicz, Zhenghua Xu
Venue: 2023 IEEE International Conference on Multimedia and Expo (ICME). DOI: 10.1109/ICME55011.2023.00390
Abstract: Although self-supervised learning methods based on masked image modeling have achieved some success in improving the performance of deep learning models, they have difficulty ensuring that the masked region is the most appropriate one for each image, so the segmentation network does not obtain the best weights in pre-training. We therefore propose a new adaptive-masking-policy self-supervised learning method. Specifically, we model the masking of images as a reinforcement learning problem and use the output of the reconstruction model as a feedback signal that guides the agent to learn a masking policy, selecting a more appropriate mask position and size for each image. This helps the reconstruction network learn more fine-grained image representations and thus improves downstream segmentation performance. We conduct extensive experiments on two datasets, Cardiac and TCIA, and the results show that our approach outperforms current state-of-the-art self-supervised learning methods.
{"title":"Trajectory Alignment based Multi-Scaled Temporal Attention for Efficient Video Transformer","authors":"Zao Zhang, Dong Yuan, Yu Zhang, Wei Bao","doi":"10.1109/ICME55011.2023.00244","DOIUrl":"https://doi.org/10.1109/ICME55011.2023.00244","url":null,"abstract":"Although the video transformer gets remarkable accuracy on video recognition tasks, it is hard to be deployed in resource-constrained scenarios due to the high computational cost. A method that dynamically modifies and trains the transformer model, ensuring that the computational cost matches the deployment scenario requirement, would be an effective solution to this challenge. In this paper, we propose a method for modifying large-scale video transformers with trajectory alignment based multi-scaled temporal attention (TAMS) schemes to reduce the computational cost significantly while losing accuracy slightly. In the temporal dimension, we adopt multi-scaled sparsity patterns in hierarchical transformer blocks. In the spatial dimension, we use region selection to force the transformer to focus on high-importance regions while not corrupting the spatial context. Our method reduces up to 40% computational cost of state-of-the-art large-scale video transformers with a slight accuracy drop (~ 7%) on the video recognition task.","PeriodicalId":321830,"journal":{"name":"2023 IEEE International Conference on Multimedia and Expo (ICME)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127765569","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Video Snapshot Compressive Imaging via Optical Flow","authors":"Zan Chen, Ran Li, Yongqiang Li, Yuanjing Feng","doi":"10.1109/ICME55011.2023.00372","DOIUrl":"https://doi.org/10.1109/ICME55011.2023.00372","url":null,"abstract":"Video Snapshot compressive imaging (SCI) reconstruction recovers video frames from a compressed 2D measurement. However, frames at each time cannot be observed since the limitation of hardware. To make SCI suitable for more applications, we propose an optical flow-based deep unfolding network for video SCI reconstruction. To extract the optical flow, the feature maps during the iterative process are transformed by the convolution layer into the estimated optical flow. We designed a motion regularizer, which uses voxels of iterative frames and optical flow to update the reconstructed frames. The proposed motion regularizer efficiently captures the temporal correlation between the previous and next frames, which contributes to reconstructing the observed and unobserved frames from input measurement in a SCI reconstruction process. Experiments show that our method achieves state-of-the-art results on PSNR and SSIM.","PeriodicalId":321830,"journal":{"name":"2023 IEEE International Conference on Multimedia and Expo (ICME)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132934285","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Title: Few-Shot Object Detection via Back Propagation and Dynamic Learning
Authors: Dianlong You, P. Wang, Y. Zhang, Ling Wang, Shunfu Jin
Venue: 2023 IEEE International Conference on Multimedia and Expo (ICME). DOI: 10.1109/ICME55011.2023.00493
Abstract: Building a few-shot object detection (FSOD) model on a traditional object detector ignores the differences between the classification and regression tasks, causing task conflict and class confusion and thus degrading classification performance. This paper focuses on these shortcomings and uses the strategies of Back Propagation and Dynamic Learning to construct an FSOD model named BPDL. BPDL has a two-fold main idea: a) it uses the optimized localization boxes to alleviate task conflict and refines the classification features with a correction loss, and b) it develops a dynamic learning strategy that filters confusing features and mines more realistic prototype representations of the categories to calibrate classification. Extensive experiments on multiple benchmarks show that BPDL outperforms existing methods and advances the state of the art on the FSOD task.
{"title":"Self-Attention Prediction Correction with Channel Suppression for Weakly-Supervised Semantic Segmentation","authors":"Guoying Sun, Meng Yang","doi":"10.1109/ICME55011.2023.00150","DOIUrl":"https://doi.org/10.1109/ICME55011.2023.00150","url":null,"abstract":"Single-stage weakly-supervised semantic segmentation (WSSS) with image-level labels has become a new research hotspot in the community for its lower cost and higher training efficiency. However, the pseudo label of WSSS generally suffers from somewhat noise, which limits the segmentation performance. In this paper, to explore the integral foreground activation, we propose the Channel Suppression (CS) module for preventing only activating the most discriminative regions, thereby improving the initial pseudo labels. To rectify the in-correct prediction, we explore the Self-Attention Prediction Correction (SAPC) module, which adaptively generates the category-wise prediction rectification weights. After extensive experiments, the proposed efficient single-stage framework achieves excellent performance with 67.6% mIoU and 39.9% mIoU on PASCAL VOC 2012 and MS COCO 2014 datasets, significantly exceeding several recent single-stage methods.","PeriodicalId":321830,"journal":{"name":"2023 IEEE International Conference on Multimedia and Expo (ICME)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133272667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}