{"title":"Synthetic Feature Assessment for Zero-Shot Object Detection","authors":"Xinmiao Dai, Chong Wang, Haohe Li, Sunqi Lin, Lining Dong, Jiafei Wu, Jun Wang","doi":"10.1109/ICME55011.2023.00083","DOIUrl":"https://doi.org/10.1109/ICME55011.2023.00083","url":null,"abstract":"Zero-shot object detection aims to simultaneously identify and localize classes that were not presented during training. Many generative model-based methods have shown promising performance by synthesizing the visual features of unseen classes from semantic embeddings. However, these synthetic features are inevitably of varied quality, which may be far from the ground truth. It degrades the performance of trained unseen classifier. Instead of tweaking the generative model, a new idea of feature quality assessment is proposed to utilize both the good and bad features to optimize the classifier in the right direction. Moreover, contrastive learning is also introduced to enhance the feature uniqueness between unseen and seen classes, which helps the feature assessment implicitly. To demonstrate the effectiveness of the proposed algorithm, comprehensive experiments are conducted on the MS COCO dataset and PASCAL VOC dataset, the state-of-the-art performance is achieved. Our code is available at: https://github.com/Dai1029/SFA-ZSD.","PeriodicalId":321830,"journal":{"name":"2023 IEEE International Conference on Multimedia and Expo (ICME)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131723592","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Multi-View Co-Learning Method for Multimodal Sentiment Analysis","authors":"Wenxiu Geng, Yulong Bian, Xiangxian Li","doi":"10.1109/ICME55011.2023.00238","DOIUrl":"https://doi.org/10.1109/ICME55011.2023.00238","url":null,"abstract":"Existing works on multimodal sentiment analysis have focused on learning more discriminative unimodal sentiment information or improving multimodal fusion methods to enhance modal complementarity. However, practical results of these methods have been limited owing to the problems of insufficient intra-modal representation and inter-modal noise. To alleviate this problem, we propose a multi-view co-learning method (MVATF) for video sentiment analysis. First, we propose a multi-view features extraction module to capture more perspectives from a single modality. Second, we propose a two-level fusion sentiment enhancement strategy that uses hierarchical attentive learning fusion and a multi-task learning fusion module to achieve co-learning to effectively filter inter-modal noise for better multimodal sentiment fusion features. Experimental results on the CH-SIMS, CMU-MOSI and MOSEI datasets show that the proposed method outperforms the state-of-the-art methods.","PeriodicalId":321830,"journal":{"name":"2023 IEEE International Conference on Multimedia and Expo (ICME)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131758782","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Domain-Invariant Feature Learning for General Face Forgery Detection","authors":"Jian Zhang, J. Ni","doi":"10.1109/ICME55011.2023.00396","DOIUrl":"https://doi.org/10.1109/ICME55011.2023.00396","url":null,"abstract":"Though existing methods for face forgery detection achieve fairly good performance under the intra-dataset scenario, few of them gain satisfying results in the case of cross-dataset testing with more practical value. To tackle this issue, in this paper, we propose a novel domain-invariant feature learning framework - DIFL for face forgery detection. In the framework, an adversarial domain generalization is introduced to learn the domain-invariant features from the forged samples synthesized by various algorithms. Then a center loss in fractional form (CL) is utilized to learn more discriminative features by aggregating the real faces while separating the fake faces from the real ones in the embedding space. In addition, a global and local random crop augmentation strategy is utilized to generate more data views of forged facial images at various scales. Extensive experimental results demonstrate the effectiveness and generalization of the proposed method compared with other state-of-the-art methods.","PeriodicalId":321830,"journal":{"name":"2023 IEEE International Conference on Multimedia and Expo (ICME)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130882029","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fixing Domain Bias for Generalized Deepfake Detection","authors":"Yuzhe Mao, Weike You, Linna Zhou, Zhigao Lu","doi":"10.1109/ICME55011.2023.00380","DOIUrl":"https://doi.org/10.1109/ICME55011.2023.00380","url":null,"abstract":"Generalizing deepfake detection has posed a great challenge to digital media forensics, as inferior performance is obtained when training sets and testing sets are domain-mismatched. In this paper, we show that a CNN-based detection model can significantly improve performance by fixing domain bias. Specifically, we propose a novel Fixing Domain Bias network (FDBN). FDBN does not rely on manual features, but is based on three core designs. Firstly, a domain-invariant network based on randomly stylized normalization is devised to constrain the domain discrepancy in the feature space. Then, through adversarial learning, a generalizing representation in the stylized distribution is learned to enhance the shared feature bias among manipulation methods in the domain-specific network. Finally, to encourage equality of biases among different domains, we utilize the bias extrapolation penalty strategy by suppressing the expected bias on the extremely-performing domains. Extensive experiments demonstrate that our framework achieves effectiveness and generalization towards unseen face forgeries.","PeriodicalId":321830,"journal":{"name":"2023 IEEE International Conference on Multimedia and Expo (ICME)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131010763","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Is Really Correlation Information Represented Well in Self-Attention for Skeleton-based Action Recognition?","authors":"Wentian Xin, Hongkai Lin, Ruyi Liu, Yi Liu, Q. Miao","doi":"10.1109/ICME55011.2023.00139","DOIUrl":"https://doi.org/10.1109/ICME55011.2023.00139","url":null,"abstract":"Transformer has shown significant advantages by various vision tasks. However, the lack of representation of correlation information about data properties makes it difficult to match the excellent results consistent with GCNs in skeleton-based action recognition. In this paper, we propose a Topology and Frames-guided Spatial-Temporal ConvFormer Network (TF-STCFormer), which is well suited for dynamically extracting topological and inter-frame uniqueness & co-occurrence information. Three essential components make up the proposed framework: (1) Grouped Physical-guided Spatial Transformer for focusing on learning essential spatial features and physical topology. (2) Global and Focal Temporal Transformer for promoting the relationship of different joints in consecutive frames and improving the representation of discriminative key-frames. (3) Grouped Dilation Temporal Convolution for connecting the intermediate output obtained by the previous transformers in the feature channels of different dilation. Experiments on four standard datasets (NTU RGB+D, NTU RGB+D 120, NW-UCLA, and UAV-Human) demonstrate that our approach prominently outperforms state-of-the-art methods on all benchmarks.","PeriodicalId":321830,"journal":{"name":"2023 IEEE International Conference on Multimedia and Expo (ICME)","volume":"120 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133686514","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Peer Upsampled Transform Domain Prediction for G-PCC","authors":"Wenyi Wang, Yingzhan Xu, Kai Zhang, Li Zhang","doi":"10.1109/ICME55011.2023.00127","DOIUrl":"https://doi.org/10.1109/ICME55011.2023.00127","url":null,"abstract":"To meet the growing demand for point cloud compression, MPEG is developing a point cloud compression standard called as G-PCC. In G-PCC, upsampled transform domain prediction (UTDP) is used to improve attribute coding performance. However, only the attributes in the previous level can be used to predict the attributes of transform sub-blocks in UTDP, which limits the efficiency of UTDP. To address this limitation, we propose a method called peer-UTDP to improve UTDP by using peer neighbors in this paper. With peer-UTDP, attributes of co-plane or co-line peer neighbors in the level same as that of the transform sub-block can be used as prediction in the upsampling process. Experimental results show that our method outperforms G-PCC with an average coding gain of -5.1%, -5.4%, -5.1% and -1.4% under C1 condition, and -5.1%, -5.6%, -5.6% and -1.7% under C2 condition for Y, Cb, Cr and reflectance, respectively. The proposed peer-UTDP has been adopted by G-PCC.","PeriodicalId":321830,"journal":{"name":"2023 IEEE International Conference on Multimedia and Expo (ICME)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132726039","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Microimage-based Two-step Search For Plenoptic 2.0 Video Coding","authors":"Yuqing Yang, Xin Jin, Kedeng Tong, Chen Wang, Haitian Huang","doi":"10.1109/ICME55011.2023.00437","DOIUrl":"https://doi.org/10.1109/ICME55011.2023.00437","url":null,"abstract":"The plenoptic 2.0 video can record a time-varying dense light field, which benefits many immersive visual applications such as AR/VR. However, traditional inter motion estimation methods perform inefficiently in such kinds of video sequences due to the distinctive temporal characteristics caused by the imaging principle. In this paper, a microimage-based two- step search (MTSS) is proposed to achieve a better trade-off between coding performance and coding complexity. Based on microimage focus variation analysis in imaging dynamic scenes, a microlens-diameter and matching-distance spatial search with local refinement is proposed to exploit the image correlations among the microimage and to compensate the defocused inaccuracy. Implementing the proposed motion estimation in H.266 platform VTM-11.0 and comparing with the state-of-the-art methods, obvious compression efficiency improvements are achieved with limited complexity increment, which benefits the standardization of plenoptic video coding.","PeriodicalId":321830,"journal":{"name":"2023 IEEE International Conference on Multimedia and Expo (ICME)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131444200","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Content-based Viewport Prediction Framework for 360° Video Using Personalized Federated Learning and Fusion Techniques","authors":"Mehdi Setayesh, V. Wong","doi":"10.1109/ICME55011.2023.00118","DOIUrl":"https://doi.org/10.1109/ICME55011.2023.00118","url":null,"abstract":"Viewport prediction is a key enabler for 360° video streaming over wireless networks. To improve the prediction accuracy, a common approach is to use a content-based viewport prediction model. Saliency detection based on traditional convolutional neural networks (CNNs) suffers from distortion due to equirectangular projection. Also, the viewers may have their own viewing behavior and are not willing to share their historical head movement with others. To address the aforementioned issues, in this paper, we first develop a saliency detection model using a spherical CNN (SPCNN). Then, we train the viewers’ head movement prediction model using personalized federated learning (PFL). Finally, we propose a content-based viewport prediction framework by integrating the video saliency map and the head orientation map of each viewer using fusion techniques. The experimental results show that our proposed framework provides higher average accuracy and precision when compared with three state-of-the-art algorithms from the literature.","PeriodicalId":321830,"journal":{"name":"2023 IEEE International Conference on Multimedia and Expo (ICME)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127858809","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-Level Feature-Guided Stereoscopic Video Quality Assessment Based on Transformer and Convolutional Neural Network","authors":"Yuan Chen, Sumei Li","doi":"10.1109/ICME55011.2023.00428","DOIUrl":"https://doi.org/10.1109/ICME55011.2023.00428","url":null,"abstract":"Stereoscopic video (3D video) has been increasingly applied in industry and entertainment. And the research of stereoscopic video quality assessment (SVQA) has become very important for promoting the development of stereoscopic video system. Many CNN-based models have emerged for SVQA task. However, these methods ignore the significance of the global information of the video frames for quality perception. In this paper, we propose a multi-level feature-fusion model based on Transformer and convolutional neural network (MFFTCNet) to assess the perceptual quality of the stereoscopic video. Firstly, we use global information from Transformer to guide local information from convolutional neural network (CNN). Moreover, we utilize low-level features in the CNN branch to guide high-level features. Besides, considering the binocular rivalry effect in the human vision system (HVS), we use 3D convolution to achieve rivalry fusion of binocular features. The proposed method is tested on two public stereoscopic video quality datasets. The result shows that this method correlates highly with human visual perception and outperforms state-of-the-art (SOTA) methods by a significant margin.","PeriodicalId":321830,"journal":{"name":"2023 IEEE International Conference on Multimedia and Expo (ICME)","volume":"66 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127521844","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hidden Follower Detection via Refined Gaze and Walking State Estimation","authors":"Yaxi Chen, Ruimin Hu, Danni Xu, Zheng Wang, Linbo Luo, Dengshi Li","doi":"10.1109/ICME55011.2023.00356","DOIUrl":"https://doi.org/10.1109/ICME55011.2023.00356","url":null,"abstract":"Hidden following is following behavior with special intentions, and detecting hidden following behavior can prevent many criminal activities in advance. The previous method uses gaze and spacing behaviors to distinguish hidden followers from normal pedestrians. However, they express gaze behaviors in a coarse-grained way with binary values, making it difficult to accurately depict the gaze state of pedestrians. To this end, we propose the Refined Hidden Follower Detection (RHFD) model by choosing a suitable mapping function based on the principle that the closer the gaze direction is to someone, the more likely it is to gaze at someone, which converts the gaze direction into a continuous estimated gaze state representing the complex and variable gaze behavior of pedestrians. Simultaneously, we introduce variations in the magnitude and direction of pedestrian velocity to refine the representation of pedestrian walking states. Experimental results on the surveillance dataset show that RHFD outperforms state-of-the-art methods.","PeriodicalId":321830,"journal":{"name":"2023 IEEE International Conference on Multimedia and Expo (ICME)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124115706","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}