2023 IEEE International Conference on Multimedia and Expo (ICME): Latest Papers, Page 7

Weight-based Regularization for Improving Robustness in Image Classification

2023 IEEE International Conference on Multimedia and Expo (ICME) Pub Date : 2023-07-01 DOI: 10.1109/ICME55011.2023.00305

Hao Yang, Min Wang, Zhengfei Yu, Yun Zhou

Citations: 0

Image Layer Modeling for Complex Document Layout Generation

2023 IEEE International Conference on Multimedia and Expo (ICME) Pub Date : 2023-07-01 DOI: 10.1109/ICME55011.2023.00386

Tianlong Ma, Xingjiao Wu, Xiangcheng Du, Yanlong Wang, Cheng Jin

Citations: 0

MSG-CAM: Multi-scale inputs make a better visual interpretation of CNN networks

2023 IEEE International Conference on Multimedia and Expo (ICME) Pub Date : 2023-07-01 DOI: 10.1109/ICME55011.2023.00061

Xiaohong Xiang, Fuyuan Zhang, Xin Deng, Ke Hu

Citations: 0

Audio-Visual Generalized Zero-Shot Learning Based on Variational Information Bottleneck

2023 IEEE International Conference on Multimedia and Expo (ICME) Pub Date : 2023-07-01 DOI: 10.1109/ICME55011.2023.00084

Yapeng Li, Yong Luo, Bo Du

Citations: 1

Region-Aware Semantic Consistency for Unsupervised Domain-Adaptive Semantic Segmentation

2023 IEEE International Conference on Multimedia and Expo (ICME) Pub Date : 2023-07-01 DOI: 10.1109/ICME55011.2023.00024

Jun Xie, Yixuan Zhou, Xing Xu, Guoqing Wang, Fumin Shen, Yang Yang

Citations: 0

Multi-stream Adaptive Offloading of Joint Compressed Video Streams, Feature Streams, and Semantic Streams in Edge Computing Systems

2023 IEEE International Conference on Multimedia and Expo (ICME) Pub Date : 2023-07-01 DOI: 10.1109/ICME55011.2023.00175

Dieli Hu, Wen Ji, Zhi Wang

Abstract: Edge computing (EC) is a promising paradigm for serving latency-sensitive video applications. However, massive compressed video transmission and analysis require considerable bandwidth and computing resources, posing enormous challenges for current multimedia frameworks. Novel multi-stream frameworks that incorporate feature streams are more practical. The reason is that feature streams containing compact video frame feature data have a lower bitrate and better serve machine vision tasks. Nevertheless, feature extraction by devices increases the latency and energy consumption of local computing. Therefore, how to offload suitable streams according to video task requirements and system resources is a challenging issue. This paper studies EC-based multi-stream adaptive offloading. We model the multi-stream offloading and computation problem to maximize system utility by jointly optimizing offloading decisions, computation resource allocation, and video frame sampling rates. Frame sampling rates, processing latency, and energy consumption are considered in system utility modeling. The formulated optimization problem is a mixed-integer programming (MIP) problem. We propose an efficient algorithm to address this MIP problem. The proposed algorithm relies on the Hungarian algorithm and improved greedy Markov approximation. The simulation results validate our proposed algorithm’s superior performance.

Citations: 0

Collaborative Spatial-Temporal Distillation for Efficient Video Deraining

2023 IEEE International Conference on Multimedia and Expo (ICME) Pub Date : 2023-07-01 DOI: 10.1109/ICME55011.2023.00332

Yuzhang Hu, Minghao Liu, Wenhan Yang, Jiaying Liu, Zongming Guo

Abstract: In this paper, we propose a novel knowledge distillation framework to improve the efficiency of deep networks for video deraining. The knowledge is transferred from a large-scale powerful teacher network to a compact efficient student network via the proposed collaborative spatial-temporal distillation framework. The framework is equipped with three collaboration schemes of different granularities that make use of spatial-temporal redundancy in a complementary way for better distillation performance. First, the spatial alignment module applies distillation constraints at different spatial scales to achieve better scale invariance in transferred knowledge. Second, the temporal alignment module traces both temporal status between teacher and student separately and collaboratively, to comprehensively utilize inter-frame information. Third, these two alignment modules interact through a spatial-temporal adaptor, where spatial-temporal knowledge is transferred in a unified framework. Extensive experiments demonstrate the superiority of our distillation framework as well as the effectiveness of each module. Our code is available at: https://github.com/HuYuzhang/Knowledge-Distillation.

Citations: 0

DFCP: Few-Shot DeepFake Detection via Contrastive Pretraining

2023 IEEE International Conference on Multimedia and Expo (ICME) Pub Date : 2023-07-01 DOI: 10.1109/ICME55011.2023.00393

Bojing Zou, Chao Yang, Jiazhi Guan, Chengbin Quan, Youjian Zhao

Abstract: Abuses of forgery techniques have created a considerable problem of misinformation on social media. Although researchers have devoted considerable effort to face forgery detection (a.k.a. DeepFake detection) and achieved some results, two issues still hinder practical application: 1) most detectors do not generalize well to unseen datasets; 2) most previous works are supervised and require a considerable amount of manually labeled data. To address these problems, we propose a simple contrastive pretraining framework for DeepFake detection (DFCP), which works in a finetuning-after-pretraining manner and requires only a few labels (5%). Specifically, we design a two-stream framework to simultaneously learn high-frequency texture features and high-level semantic information during pretraining. In addition, a video-based frame sampling strategy is proposed to mitigate potentially noisy data in the instance-discriminative contrastive learning to achieve better performance. Experimental results on several downstream datasets show the state-of-the-art performance of the proposed DFCP, which works at frame level (w/o temporal reasoning) with high efficiency but outperforms video-level methods.

Citations: 0

E2: Entropy Discrimination and Energy Optimization for Source-free Universal Domain Adaptation

2023 IEEE International Conference on Multimedia and Expo (ICME) Pub Date : 2023-07-01 DOI: 10.1109/ICME55011.2023.00460

Meng Shen, A. J. Ma, PongChi Yuen

Abstract: Universal domain adaptation (UniDA) transfers knowledge under both distribution and category shifts. Most UniDA methods require access to source-domain data during model adaptation, which may result in privacy policy violations and inefficient source-data transfer. To address this issue, we propose a novel source-free UniDA method coupling confidence-guided entropy discrimination and likelihood-induced energy optimization. The entropy-based separation of target-known and unknown classes is too conservative for known-class prediction. Thus, we derive the confidence-guided entropy by scaling the normalized prediction score with the known-class confidence, so that more known-class samples are correctly predicted. Because the marginal distribution is difficult to estimate without source-domain data, we constrain the target-domain marginal distribution by maximizing (minimizing) the known (unknown)-class likelihood, which is equivalent to free energy optimization. Theoretically, in physical terms, the overall optimization amounts to decreasing the internal energy of known classes and increasing that of unknown classes. Extensive experiments demonstrate the superiority of the proposed method.
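The two quantities this abstract builds on, prediction entropy and free energy, can be illustrated with a minimal sketch. It is not the authors' method: the confidence-guided scaling described above is not reproduced, the function names are hypothetical, and the free-energy form F(x) = -T log Σ_k exp(z_k / T) over classifier logits z is an assumed, commonly used definition that the abstract itself does not state.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(logits):
    """Shannon entropy of softmax(logits), divided by log(K) so it lies in [0, 1]."""
    probs = softmax(logits)
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(logits))


def free_energy(logits, temperature=1.0):
    """Assumed definition: F = -T * log(sum_k exp(z_k / T)), via a stable log-sum-exp."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    return -temperature * (m + math.log(sum(math.exp(s - m) for s in scaled)))


# Hypothetical logits: one confident prediction, one maximally uncertain one.
confident = [8.0, 0.5, 0.2, 0.1]
uncertain = [1.0, 1.0, 1.0, 1.0]

for name, z in (("confident", confident), ("uncertain", uncertain)):
    print(name, round(normalized_entropy(z), 4), round(free_energy(z), 4))
```

Under these assumed definitions, the confident prediction has both lower normalized entropy and lower free energy than the uniform one, which is the kind of gap a known/unknown separation could threshold on.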

Citations: 1

ABTD-Net: Autonomous Baggage Threat Detection Networks for X-ray Images

2023 IEEE International Conference on Multimedia and Expo (ICME) Pub Date : 2023-07-01 DOI: 10.1109/ICME55011.2023.00214

Wen Liu, Degang Sun, Yan Wang, Zhongyuan Chen, Xinbo Han, Haitian Yang

Citations: 1