Hui Guo, Shu Hu, Xin Wang, Ming-Ching Chang, Siwei Lyu
{"title":"Open-Eye: An Open Platform to Study Human Performance on Identifying AI-Synthesized Faces","authors":"Hui Guo, Shu Hu, Xin Wang, Ming-Ching Chang, Siwei Lyu","doi":"10.1109/MIPR54900.2022.00047","DOIUrl":"https://doi.org/10.1109/MIPR54900.2022.00047","url":null,"abstract":"AI-synthesized faces are visually challenging to discern from real ones. They have been used as profile images for fake social media accounts, which leads to high negative social impacts. Although progress has been made in developing automatic methods to detect AI-synthesized faces, there is no open platform to study the human performance of AI-synthesized faces detection. In this work, we develop an online platform called Open-eye to study the human performance of AI-synthesized faces detection. We describe the design and workflow of the Open-eye in this paper.","PeriodicalId":228640,"journal":{"name":"2022 IEEE 5th International Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"385 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124775858","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Saeed Ranjbar Alvar, Korcan Uyanik, Ivan V. Bajić
{"title":"License Plate Privacy in Collaborative Visual Analysis of Traffic Scenes","authors":"Saeed Ranjbar Alvar, Korcan Uyanik, Ivan V. Bajić","doi":"10.1109/MIPR54900.2022.00060","DOIUrl":"https://doi.org/10.1109/MIPR54900.2022.00060","url":null,"abstract":"Traffic scene analysis is important for emerging technologies such as smart traffic management and autonomous vehicles. However, such analysis also poses potential privacy threats. For example, a system that can recognize license plates may construct patterns of behavior of the corresponding vehicles' owners and use that for various illegal purposes. In this paper we present a system that enables traffic scene analysis while at the same time preserving license plate privacy. The system is based on a multi-task model whose latent space is selectively compressed depending on the amount of information the specific features carry about analysis tasks and private information. Effectiveness of the proposed method is illustrated by experiments on the Cityscapes dataset, for which we also provide license plate annotations.","PeriodicalId":228640,"journal":{"name":"2022 IEEE 5th International Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"133 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132047481","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhongzheng Yuan, Samyak Rawlekar, S. Garg, E. Erkip, Yao Wang
{"title":"Feature Compression for Rate Constrained Object Detection on the Edge","authors":"Zhongzheng Yuan, Samyak Rawlekar, S. Garg, E. Erkip, Yao Wang","doi":"10.1109/MIPR54900.2022.00008","DOIUrl":"https://doi.org/10.1109/MIPR54900.2022.00008","url":null,"abstract":"Recent advances in computer vision has led to a growth of interest in deploying visual analytics model on mobile devices. However, most mobile devices have limited computing power, which prohibits them from running large scale visual analytics neural networks. An emerging approach to solve this problem is to offload the computation of these neural networks to computing resources at an edge server. Efficient computation offloading requires optimizing the trade-off between multiple objectives including compressed data rate, analytics performance, and computation speed. In this work, we consider a “split computation” system to offload a part of the computation of the YOLO object detection model. We propose a learnable feature compression approach to compress the intermediate YOLO features with light-weight computation. We train the feature compression and decompression module together with the YOLO model to optimize the object detection accuracy under a rate constraint. Compared to baseline methods that apply either standard image compression or learned image compression at the mobile and perform image decompression and YOLO at the edge, the proposed system achieves higher detection accuracy at the low to medium rate range. Furthermore, the proposed system requires substantially lower computation time on the mobile device with CPU only.","PeriodicalId":228640,"journal":{"name":"2022 IEEE 5th International Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128018475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sparse Tensor-based Point Cloud Attribute Compression","authors":"Jianqiang Wang, Zhan Ma","doi":"10.1109/MIPR54900.2022.00018","DOIUrl":"https://doi.org/10.1109/MIPR54900.2022.00018","url":null,"abstract":"Surveillance videos can capture a variety of realistic events and also anomalies. Due to an increase in the crime rate in public areas, surveillance cameras are adopted in a very large number. But as these crimes/public disputes are rare to occur at a specific location, human monitors are idle most of the time. Hence, there is a justified need to develop intelligent systems for anomaly detection. There are several seminal deep neural architectures proposed in this field of anomaly detection ranging from using deep learning as a feature extraction tool to complete end-to-end deep-learning-based anomaly detection models. Any practical anomaly detection model must be generic in detecting a spectrum of anomalous events; however, several models can detect only specific types of anomalies. Further, several models are not amenable to distributed training over many machines on large streaming data, which is typical in a video surveillance system. In this paper, we discuss the techniques to detect anomalies in real-time by exploring recent architectures in the literature and analyze and explore ways we can improve the detection accuracy of the model. We propose a batching methodology that improves the existing model's area under the curve by 2%.","PeriodicalId":228640,"journal":{"name":"2022 IEEE 5th International Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"57 5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129473876","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zinan Xiong, Chenxi Wang, Ying Li, Yan Luo, Yu Cao
{"title":"Swin-Pose: Swin Transformer Based Human Pose Estimation","authors":"Zinan Xiong, Chenxi Wang, Ying Li, Yan Luo, Yu Cao","doi":"10.1109/MIPR54900.2022.00048","DOIUrl":"https://doi.org/10.1109/MIPR54900.2022.00048","url":null,"abstract":"Convolutional neural networks (CNNs) have been widely utilized in many computer vision tasks. However, CNNs have a fixed reception field and lack the ability of long-range perception, which is crucial to human pose estimation. Transformer architecture has been adopted to computer vision applications recently and is proven to be a highly effective architecture. We are interested in exploring its capability in human pose estimation, and thus propose a novel model based on transformer, enhanced with a feature pyramid fusion structure. More specifically, we use pre-trained Swin Transformer to extract features, and leverage a feature pyramid structure to extract and fuse feature maps from different stages. The experiment results of our study have demonstrated that the proposed transformer-based model can achieve better performance compared to the state-of-the-art CNN-based models.","PeriodicalId":228640,"journal":{"name":"2022 IEEE 5th International Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127350704","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Attentive Graph Neural Networks for Few-Shot Learning","authors":"Hao Cheng, Joey Tianyi Zhou, Wee Peng Tay, B. Wen","doi":"10.1109/MIPR54900.2022.00033","DOIUrl":"https://doi.org/10.1109/MIPR54900.2022.00033","url":null,"abstract":"Graph Neural Networks (GNNs) have demonstrated superior performance in many challenging applications, including few-shot learning tasks. Despite its powerful capacity to learn and generalize a model from a few samples, GNN usually suffers from severe over-fitting and over-smoothing as the model becomes deep, which limits its scalability. In this work, we propose a novel Attentive GNN (AGNN) to tackle these challenges by incorporating a triple-attention mechanism, i.e., node self-attention, neighborhood attention, and layer memory attention. We explain why the proposed attentive modules can improve GNN for few-shot learning with theoretical analysis and illustrations. Extensive experiments demonstrate that the proposed AGNN model achieves promising results, compared to state-of-the-art GNN- and CNN-based methods for few-shot learning tasks, over the mini-ImageNet and tiered-ImageNet benchmarks under ConvNet-4 backbone with both inductive and transductive settings.","PeriodicalId":228640,"journal":{"name":"2022 IEEE 5th International Conference on Multimedia Information Processing and Retrieval (MIPR)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127092748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}