2018 IEEE International Conference on Multimedia and Expo (ICME): Latest Publications

Schmidt: Image Augmentation for Black-Box Adversarial Attack
2018 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2018-07-01 | DOI: 10.1109/ICME.2018.8486449
Yucheng Shi, Yahong Han
Abstract: Despite achieving great success in multimedia analysis, especially in image recognition, deep neural networks (DNNs) can be easily fooled by maliciously crafted adversarial examples. An attacker who generates adversarial examples can even launch a black-box adversarial attack by querying the target DNN model, without access to its internal structure or training set. In this work, we develop Schmidt Augmentation, an image augmentation method that better probes the decision boundaries of the black-box model. Schmidt Augmentation helps attackers achieve a larger accuracy drop on the MNIST and CIFAR-10 datasets. We also shed light on the harshest circumstance, in which the attacker only has access to samples of the target DNN, by providing a labeling method based on semi-supervised learning instead of querying the target model.
Citations: 7
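The abstract does not spell out how Schmidt Augmentation itself works, so the following is only a minimal sketch of the general setting it operates in: a query-based black-box attack that keeps whichever perturbed candidate most reduces the model's confidence in the true class. The `query_model` function is a hypothetical stand-in for the target DNN's prediction API, and the random perturbation is a generic placeholder for the paper's augmentation.

```python
# Minimal sketch of a query-based black-box attack loop (not the paper's
# Schmidt Augmentation, whose construction is not given in the abstract).
import numpy as np

def query_model(image: np.ndarray) -> np.ndarray:
    """Hypothetical black-box API of the target DNN, returning class probabilities."""
    raise NotImplementedError

def black_box_attack(image, true_label, eps=0.05, steps=200, seed=0):
    rng = np.random.default_rng(seed)
    adv = image.copy()
    best_conf = query_model(adv)[true_label]
    for _ in range(steps):
        # Draw a small random perturbation as a crude probe of the decision boundary.
        candidate = np.clip(adv + rng.uniform(-eps, eps, size=image.shape), 0.0, 1.0)
        conf = query_model(candidate)[true_label]
        if conf < best_conf:              # keep the probe that hurts the true class most
            adv, best_conf = candidate, conf
        if query_model(adv).argmax() != true_label:
            break                         # target model misclassifies: attack succeeded
    return adv
```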
Multi-Grained Deep Feature Learning for Pedestrian Detection
2018 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2018-07-01 | DOI: 10.1109/ICME.2018.8486498
Chunze Lin, Jiwen Lu, Jie Zhou
Abstract: In this paper, we address the challenging problem of detecting pedestrians who are heavily occluded or far from the camera. Unlike most existing pedestrian detection methods, which only use coarse-resolution feature maps with a fixed receptive field, our approach exploits multi-grained deep features to make the detector more robust to the visible parts of occluded pedestrians and to small-size targets. Specifically, we jointly train a scale-aware network and a human parsing network in a semi-supervised manner with only bounding-box annotation. We carefully design the scale-aware network to predict pedestrians of particular scales using the most appropriate feature maps, by matching their receptive field with the target sizes. The human parsing network generates a fine-grained attentional map which helps guide the detector to focus on the visible parts of occluded pedestrians and on small-size instances. Both networks are computed in parallel and form a unified single-stage pedestrian detector, which ensures a good trade-off between accuracy and speed. Experiments on two challenging benchmarks, Caltech and KITTI, demonstrate the effectiveness of our proposed approach, which in addition executes 2× faster than competitive methods.
Citations: 6
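As a rough illustration of the idea, the sketch below pairs detection heads with feature maps of different strides (small pedestrians on the shallow, high-resolution map; large ones on the deeper map) and modulates the shallow features with a one-channel attention mask standing in for the parsing branch. The layer sizes and module names are assumptions for illustration, not the authors' architecture.

```python
import torch
import torch.nn as nn

class ScaleAwareDetector(nn.Module):
    """Illustrative sketch: per-scale heads on feature maps of different strides,
    modulated by a parsing-style attention map (not the paper's exact design)."""
    def __init__(self, num_anchors=3):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Conv2d(3, 64, 3, stride=4, padding=1), nn.ReLU())
        self.stage2 = nn.Sequential(nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU())
        # Fine-grained attention branch: a soft mask over the shallow feature map.
        self.attention = nn.Sequential(nn.Conv2d(64, 1, 1), nn.Sigmoid())
        # One head per scale: 4 box coordinates + 1 score per anchor.
        self.head_small = nn.Conv2d(64, num_anchors * 5, 1)
        self.head_large = nn.Conv2d(128, num_anchors * 5, 1)

    def forward(self, x):
        f1 = self.stage1(x)
        att = self.attention(f1)
        f1 = f1 * att                  # focus on visible parts / small instances
        f2 = self.stage2(f1)
        return self.head_small(f1), self.head_large(f2), att

preds_small, preds_large, att = ScaleAwareDetector()(torch.randn(1, 3, 256, 256))
```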
Adaptive Layerwise Quantization for Deep Neural Network Compression
2018 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2018-07-01 | DOI: 10.1109/ICME.2018.8486500
Xiaotian Zhu, Wen-gang Zhou, Houqiang Li
Abstract: Building efficient deep neural network models has become a hot spot in recent deep learning research. Many works on network compression try to quantize a neural network with low-bitwidth weights and activations. However, most existing network quantization methods set a fixed bitwidth for the whole network, which leads to a large performance drop under high compression rates. In this paper, we introduce an adaptive layerwise quantization method which quantizes the network with different bitwidths assigned to different layers. By using the entropy of weights and activations as an importance indicator for each layer, we keep most of the layers under a high compression rate while a few of the most important layers receive more bits. Experiments on the CIFAR-10 and ImageNet2012 datasets demonstrate that our layerwise quantization achieves a smaller model size and lower computation cost than fixed-bitwidth methods at comparable accuracy, or higher accuracy at a similar model size and computational complexity.
Citations: 38
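A minimal sketch of the entropy-as-importance idea, assuming a simple policy (highest-entropy layers get more bits, the rest stay at a low bitwidth) and symmetric uniform quantization; the paper's exact assignment rule and quantizer may differ.

```python
import numpy as np

def weight_entropy(w: np.ndarray, bins: int = 256) -> float:
    """Shannon entropy of a layer's weight histogram, used as an importance score."""
    hist, _ = np.histogram(w, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def uniform_quantize(w: np.ndarray, bits: int) -> np.ndarray:
    """Symmetric uniform quantization of weights to the given bitwidth."""
    levels = 2 ** bits - 1
    scale = np.abs(w).max() / (levels / 2) + 1e-12
    return np.round(w / scale) * scale

def assign_bitwidths(layers, low=2, high=8, top_k=2):
    """Illustrative policy: top-k highest-entropy layers get `high` bits, others `low`."""
    scores = {name: weight_entropy(w) for name, w in layers.items()}
    important = sorted(scores, key=scores.get, reverse=True)[:top_k]
    return {name: (high if name in important else low) for name in layers}

# Toy example with random "layers" standing in for a trained network's weights.
layers = {f"conv{i}": np.random.randn(64, 64, 3, 3) * 0.1 * (i + 1) for i in range(4)}
bitwidths = assign_bitwidths(layers)
quantized = {name: uniform_quantize(w, bitwidths[name]) for name, w in layers.items()}
```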
FI-CAP: Robust Framework to Benchmark Head Pose Estimation in Challenging Environments
2018 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2018-07-01 | DOI: 10.1109/ICME.2018.8486490
S. Jha, C. Busso
Abstract: Head pose estimation is challenging in naturalistic environments. To effectively train machine-learning algorithms, we need datasets with reliable ground-truth labels from diverse environments. We present Fi-Cap, a helmet with fiducial markers designed for head pose estimation. The relative position and orientation of the tags with respect to a reference camera can be obtained automatically from a subset of the tags. Placed at the back of the head, it provides a reference system without interfering with sensors that record the frontal face. We quantify the performance of Fi-Cap by (1) rendering the 3D model of the design and evaluating its accuracy under various rotation, image-resolution, and illumination conditions, and (2) comparing the predicted head pose with the location of the projected beam of a laser mounted on glasses worn by the subjects in controlled experiments conducted in our laboratory. Fi-Cap provides ideal benchmark information for evaluating automatic algorithms and alternative sensors for head pose estimation in a variety of challenging environments, including our target application in advanced driver assistance systems (ADAS).
Citations: 7
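The core operation behind any fiducial-marker reference system of this kind is recovering a tag's pose from detected corner correspondences. The sketch below shows that generic step with OpenCV's PnP solver; the tag size, corner coordinates, and camera intrinsics are made-up example values, and this is not the Fi-Cap calibration pipeline itself.

```python
import numpy as np
import cv2

def tag_pose(object_pts, image_pts, camera_matrix, dist_coeffs=None):
    """Recover rotation (3x3) and translation of a marker from 3D-2D corner
    correspondences, as in any fiducial-based pose pipeline."""
    dist_coeffs = np.zeros(5) if dist_coeffs is None else dist_coeffs
    ok, rvec, tvec = cv2.solvePnP(object_pts.astype(np.float32),
                                  image_pts.astype(np.float32),
                                  camera_matrix, dist_coeffs)
    if not ok:
        raise RuntimeError("pose estimation failed")
    R, _ = cv2.Rodrigues(rvec)  # axis-angle -> rotation matrix
    return R, tvec

# Hypothetical example: a 4 cm square tag and its corners detected in the image.
object_pts = np.array([[0, 0, 0], [0.04, 0, 0], [0.04, 0.04, 0], [0, 0.04, 0]], dtype=float)
image_pts = np.array([[320, 240], [360, 242], [358, 282], [318, 280]], dtype=float)
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
R, t = tag_pose(object_pts, image_pts, K)
```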
Multi-Path Feature Fusion Network for Saliency Detection
2018 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2018-07-01 | DOI: 10.1109/ICME.2018.8486571
Hengliang Zhu, Xin Tan, Zhiwen Shao, Yangyang Hao, Lizhuang Ma
Abstract: Recent saliency detection methods have made great progress with fully convolutional networks. However, we find that the saliency maps are usually coarse and fuzzy, especially near the boundary of the salient object. To deal with this problem, in this paper we exploit a multi-path feature fusion model for saliency detection. The proposed model is a fully convolutional network with raw images as input and saliency maps as output. In particular, we propose a multi-path fusion strategy for deriving the intrinsic features of salient objects. The structure is able to capture low-level visual features and generate boundary-preserving saliency maps. Moreover, a coupled structure module is proposed in our model, which helps to explore the high-level semantic properties of salient objects. Extensive experiments on four public benchmarks indicate that our saliency model is effective and outperforms state-of-the-art methods.
Citations: 2
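A minimal sketch of what multi-path fusion looks like in a fully convolutional network: features from several depths are upsampled to a common resolution and concatenated before the final saliency prediction, so low-level boundary cues and high-level semantics both reach the output. The layer widths are placeholders, and the paper's coupled structure module is not reproduced.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiPathSaliency(nn.Module):
    """Illustrative fully convolutional sketch of multi-path feature fusion."""
    def __init__(self):
        super().__init__()
        self.block1 = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU())
        self.block2 = nn.Sequential(nn.MaxPool2d(2), nn.Conv2d(32, 64, 3, padding=1), nn.ReLU())
        self.block3 = nn.Sequential(nn.MaxPool2d(2), nn.Conv2d(64, 128, 3, padding=1), nn.ReLU())
        self.fuse = nn.Conv2d(32 + 64 + 128, 64, 3, padding=1)
        self.predict = nn.Conv2d(64, 1, 1)

    def forward(self, x):
        f1 = self.block1(x)          # low-level, full resolution
        f2 = self.block2(f1)         # mid-level, 1/2 resolution
        f3 = self.block3(f2)         # high-level, 1/4 resolution
        size = f1.shape[2:]
        fused = torch.cat([
            f1,
            F.interpolate(f2, size=size, mode="bilinear", align_corners=False),
            F.interpolate(f3, size=size, mode="bilinear", align_corners=False)], dim=1)
        return torch.sigmoid(self.predict(F.relu(self.fuse(fused))))

saliency_map = MultiPathSaliency()(torch.randn(1, 3, 224, 224))  # (1, 1, 224, 224)
```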
Deep Learning Based Identity Verification in Renaissance Portraits
2018 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2018-07-01 | DOI: 10.1109/ICME.2018.8486605
Akash Gupta, Niluthpol Chowdhury Mithun, C. Rudolph, A. Roy-Chowdhury
Abstract: The identity of the subjects in many portraits has been a matter of debate for art historians, who have relied on subjective analysis of facial features to resolve ambiguity in sitter identity. Developing automated face verification techniques has thus garnered interest as a quantitative way to reinforce the decisions reached by art historians. However, most existing works fail to resolve ambiguities concerning the identity of the subjects due to significant variation in artistic styles and the limited availability and authenticity of art images. To this end, we explore the use of deep Siamese convolutional neural networks (CNNs) to provide a measure of similarity between a pair of portraits. To mitigate the limited training data, we employ a CNN-based style-transfer technique that creates several new images by recasting an art style onto other images while keeping the original image content unchanged. The resulting system thereby learns features which are discriminative and invariant to changes in artistic style. Our approach shows significant improvement over baselines and state-of-the-art methods on several examples which art historians have identified as very challenging and controversial.
Citations: 4
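The verification step can be pictured as a Siamese network: one shared encoder embeds each portrait, and a similarity score between the two embeddings decides whether the sitters match. The sketch below uses cosine similarity and a toy encoder; the style-transfer augmentation of the training set is assumed to happen upstream, and the actual backbone and training loss in the paper are not specified here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseVerifier(nn.Module):
    """Sketch of a Siamese similarity network with a shared encoder."""
    def __init__(self, embed_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, embed_dim))

    def forward(self, a, b):
        ea = F.normalize(self.encoder(a), dim=1)
        eb = F.normalize(self.encoder(b), dim=1)
        return (ea * eb).sum(dim=1)          # cosine similarity in [-1, 1]

# A pair of (batched) portrait crops; higher scores suggest the same sitter.
score = SiameseVerifier()(torch.randn(2, 3, 128, 128), torch.randn(2, 3, 128, 128))
```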
Depth Restoration with Normal-Guided Multiresolution Superpixel
2018 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2018-07-01 | DOI: 10.1109/ICME.2018.8486583
Jinghui Qian, Jie Guo, Jingui Pan
Abstract: In this paper, we propose a depth restoration method using a novel superpixel technique. Guided by a normal map reconstructed from the raw depth data, this technique over-segments RGB-D images into many small regions whose depth is assumed to be smooth. As the raw depth data is incomplete, we further introduce a depth confidence map to identify the regions which are more reliable. With the produced superpixels, we can restore the incomplete depth map using a per-superpixel linear regression. A multiresolution superpixel strategy is employed when some superpixels do not contain enough valid data. Experiments show that the proposed depth restoration method can effectively fill the wide gaps along depth discontinuities without blurring the object boundaries and the depth discontinuities.
Citations: 0
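The per-superpixel linear regression step can be sketched directly: within each superpixel, fit a plane d = a·x + b·y + c to the valid depth pixels and evaluate it at the holes. The sketch below assumes zeros mark missing depth and takes an arbitrary integer label map as the over-segmentation; the normal-guided segmentation, confidence map, and multiresolution fallback are not reproduced.

```python
import numpy as np

def fill_depth_per_superpixel(depth, labels, min_valid=10):
    """Fill missing depth (zeros) with a per-superpixel plane fit d = a*x + b*y + c."""
    out = depth.astype(np.float64).copy()
    ys, xs = np.indices(depth.shape)
    for sp in np.unique(labels):
        mask = labels == sp
        valid = mask & (depth > 0)
        holes = mask & (depth == 0)
        if valid.sum() < min_valid or not holes.any():
            continue                     # too little data: a coarser superpixel would be needed
        A = np.stack([xs[valid], ys[valid], np.ones(valid.sum())], axis=1)
        coeff, *_ = np.linalg.lstsq(A, depth[valid].astype(np.float64), rcond=None)
        out[holes] = np.stack([xs[holes], ys[holes], np.ones(holes.sum())], axis=1) @ coeff
    return out

# Toy example: a synthetic depth map with a square hole and a 4x4 grid of "superpixels".
depth = np.random.rand(64, 64) * 3.0
depth[20:30, 20:30] = 0.0
labels = (np.indices(depth.shape)[0] // 16) * 4 + np.indices(depth.shape)[1] // 16
restored = fill_depth_per_superpixel(depth, labels)
```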
Personalized Sequential Check-in Prediction: Beyond Geographical and Temporal Contexts
2018 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2018-07-01 | DOI: 10.1109/ICME.2018.8486476
Shenglin Zhao, Xixian Chen, Irwin King, Michael R. Lyu
Abstract: Check-in prediction is an important task for location-based systems; it maps a noisy estimate of a user's current location to a semantically meaningful point of interest (POI), such as a restaurant or store. In this paper, we leverage personalized preference and sequential check-in patterns to improve on traditional methods that rely on geographical and temporal contexts. In our approach, we propose a Gaussian mixture model and a histogram distribution estimation model to learn contextual features from the relevant spatial and temporal information, respectively. Furthermore, we employ user and POI embeddings to model personalized preference and leverage a stacked Long Short-Term Memory (LSTM) model to learn the sequential check-in pattern. Combining the contextual features and the personalized sequential patterns, we propose a wide-and-deep neural network for the check-in prediction task. Experimental evaluations on two real-life datasets demonstrate that our proposed method outperforms state-of-the-art models.
Citations: 6
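The two contextual feature extractors named in the abstract are easy to sketch: a Gaussian mixture fitted to a user's historical check-in coordinates (spatial context) and a histogram over check-in hours (temporal context). The data, component count, and feature names below are illustrative assumptions; the embeddings, stacked LSTM, and wide-and-deep combination are not shown.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Toy check-in history: (lat, lon) pairs and hour-of-day values for one user.
rng = np.random.default_rng(0)
history_coords = rng.normal(loc=[[31.23, 121.47]], scale=0.01, size=(200, 2))
history_hours = rng.integers(0, 24, size=200)

# Spatial context: mixture of Gaussians over the user's historical locations.
gmm = GaussianMixture(n_components=3, random_state=0).fit(history_coords)
# Temporal context: empirical hour-of-day distribution.
hour_hist = np.bincount(history_hours, minlength=24) / len(history_hours)

def contextual_features(candidate_coord, hour):
    """Spatial log-likelihood under the user's GMM plus the empirical hour probability."""
    spatial = gmm.score_samples(np.asarray(candidate_coord).reshape(1, 2))[0]
    temporal = hour_hist[hour]
    return np.array([spatial, temporal])

feats = contextual_features([31.232, 121.471], hour=19)
```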
Hybrid Noise for LIC-Based Pencil Hatching Simulation
2018 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2018-07-01 | DOI: 10.1109/ICME.2018.8486527
Qunye Kong, Y. Sheng, Guixu Zhang
Abstract: Line Integral Convolution (LIC) has been widely adopted in pencil hatching generation, where an image degraded by random binary white noise (RBWN) is filtered by LIC along an a priori vector field. Nonetheless, an RBWN-degraded image processed by LIC produces hatching graduation only in terms of stroke intensity; it can neither produce hatching graduation explicitly in stroke density nor create visually clear hatching strokes when input pixel values are fairly low. In this paper, we address these issues from a noise point of view by assessing several noise models and subsequently constructing a new noise model, called hybrid noise. The new noise model has been experimentally demonstrated to simulate hatching graduation in terms of both stroke intensity and stroke density, with quantified graduality measurements. To assess the effectiveness of hybrid noise, we implement the whole pipeline of pencil drawing simulation and compare our results with state-of-the-art algorithms.
Citations: 8
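For readers unfamiliar with the LIC step the paper builds on, the sketch below is a minimal (and deliberately slow) implementation: every output pixel averages the noise sampled along a short streamline traced forward and backward through the vector field. The hybrid-noise construction itself is not reproduced; RBWN and a constant horizontal field stand in as inputs.

```python
import numpy as np

def lic(noise, vx, vy, length=10):
    """Minimal Line Integral Convolution: average noise along short streamlines
    of the vector field (vx, vy)."""
    h, w = noise.shape
    out = np.zeros_like(noise, dtype=np.float64)
    for y in range(h):
        for x in range(w):
            acc, n = 0.0, 0
            for sign in (+1, -1):                      # trace forward and backward
                px, py = float(x), float(y)
                for _ in range(length):
                    i, j = int(round(py)), int(round(px))
                    if not (0 <= i < h and 0 <= j < w):
                        break
                    acc += noise[i, j]
                    n += 1
                    norm = np.hypot(vx[i, j], vy[i, j]) + 1e-8
                    px += sign * vx[i, j] / norm
                    py += sign * vy[i, j] / norm
            out[y, x] = acc / max(n, 1)
    return out

noise = (np.random.rand(64, 64) > 0.5).astype(np.float64)   # random binary white noise
vx, vy = np.ones((64, 64)), np.zeros((64, 64))              # horizontal stroke direction
hatching = lic(noise, vx, vy)
```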
Temporal Attentive Network for Action Recognition
2018 IEEE International Conference on Multimedia and Expo (ICME) | Pub Date: 2018-07-01 | DOI: 10.1109/ICME.2018.8486452
Yemin Shi, Yonghong Tian, Tiejun Huang, Yaowei Wang
Abstract: In action recognition, one of the most important challenges is to jointly utilize texture and motion information while capturing the long-term dependence of the various common and action-specific postures. Motivated by this fact, this paper proposes the Temporal Attentive Network (TAN) for action recognition. The key idea in TAN is that not all postures, each represented by a small collection of consecutive frames, contribute equally to the successful recognition of an action. As a result, TAN incorporates separate spatial and temporal streams into one network. Information in the two streams is partially shared so that discriminative spatiotemporal features can be extracted to characterize the various postures in an action. Moreover, a temporal attention mechanism is introduced in the form of a Long Short-Term Memory (LSTM) network. With this mechanism, features from action-specific postures are emphasized, while common postures shared by many different actions are ignored to some extent. By jointly using such spatial and temporal information as well as attentive cues in a single network, TAN achieves impressive performance on two public datasets, HMDB51 and UCF101, with accuracy scores of 72.5% and 94.1%, respectively.
Citations: 5
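The temporal attention idea can be sketched in a few lines: an LSTM runs over per-frame features, a small scoring layer produces one attention logit per time step, and the softmax-weighted sum emphasizes frames carrying action-specific postures. The feature dimension, hidden size, and class count below are placeholders, and the two-stream sharing is not reproduced.

```python
import torch
import torch.nn as nn

class TemporalAttention(nn.Module):
    """Sketch of LSTM-based temporal attention pooling over per-frame features."""
    def __init__(self, feat_dim=512, hidden=256, num_classes=101):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.score = nn.Linear(hidden, 1)            # one attention logit per time step
        self.classifier = nn.Linear(hidden, num_classes)

    def forward(self, frame_feats):                  # (batch, time, feat_dim)
        h, _ = self.lstm(frame_feats)                # (batch, time, hidden)
        alpha = torch.softmax(self.score(h), dim=1)  # attention weights over time
        video = (alpha * h).sum(dim=1)               # weighted temporal pooling
        return self.classifier(video), alpha.squeeze(-1)

logits, attn_weights = TemporalAttention()(torch.randn(2, 16, 512))
```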