Latest Papers from 2019 IEEE Visual Communications and Image Processing (VCIP)

FICAL: Focal Inter-Class Angular Loss for Image Classification
2019 IEEE Visual Communications and Image Processing (VCIP) Pub Date : 2019-12-01 DOI: 10.1109/VCIP47243.2019.8965889
Xinran Wei, Dongliang Chang, Jiyang Xie, Yixiao Zheng, Chen Gong, Chuang Zhang, Zhanyu Ma
{"title":"FICAL: Focal Inter-Class Angular Loss for Image Classification","authors":"Xinran Wei, Dongliang Chang, Jiyang Xie, Yixiao Zheng, Chen Gong, Chuang Zhang, Zhanyu Ma","doi":"10.1109/VCIP47243.2019.8965889","DOIUrl":"https://doi.org/10.1109/VCIP47243.2019.8965889","url":null,"abstract":"Convolutional Neural Networks (CNNs) have been successfully applied in various image analysis tasks and gradually become one of the most powerful machine learning approaches. In order to improve the capability of the model generalization and performance in image classification, a new trend is to learn more discriminative features via CNNs. The main contribution of this paper is to increase the angles between the categories to extract discriminative features and enlarge the inter-class variance. To this end, we propose a loss function named focal inter-class angular loss (FICAL) which introduces the confusion rate-weighted cosine distance as the similarity measurement between categories. This measurement is dynamically evaluated during each iteration to adapt the model. Compared with other loss functions, experimental results demonstrate that the proposed FICAL achieved best performance among the referred loss functions on two image classificaton datasets.","PeriodicalId":388109,"journal":{"name":"2019 IEEE Visual Communications and Image Processing (VCIP)","volume":"2016 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114654450","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
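As a rough illustration of the idea described in the abstract above, the sketch below (PyTorch) pairs a cosine-similarity classifier with a confusion-weighted angular penalty between class-weight vectors. This is a hypothetical simplification, not the paper's exact FICAL formulation; the focal weighting and the batch-level confusion estimate are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CosineClassifier(nn.Module):
    """Cosine classifier: logits are scaled cosines between L2-normalised
    features and L2-normalised class-weight vectors."""
    def __init__(self, feat_dim, num_classes, scale=16.0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.scale = scale

    def forward(self, feats):
        w = F.normalize(self.weight, dim=1)
        f = F.normalize(feats, dim=1)
        return self.scale * f @ w.t()          # (B, C) cosine logits

def inter_class_angular_penalty(weight, confusion):
    """Penalise small angles (large cosines) between class-weight vectors,
    weighted by how often the two classes are confused with each other."""
    w = F.normalize(weight, dim=1)
    cos = w @ w.t()                             # (C, C) pairwise cosines
    off_diag = 1.0 - torch.eye(cos.size(0), device=cos.device)
    return (confusion * cos.clamp(min=0) * off_diag).sum() / off_diag.sum()

def fical_like_loss(logits, labels, weight, gamma=2.0, lam=0.1):
    # Focal-style cross entropy on the cosine logits ...
    ce = F.cross_entropy(logits, labels, reduction="none")
    p_t = torch.exp(-ce)
    focal = ((1.0 - p_t) ** gamma * ce).mean()
    # ... plus a confusion-weighted angular penalty; the confusion matrix is
    # re-estimated every batch from the soft predictions (a crude proxy for
    # the paper's confusion rate).
    with torch.no_grad():
        probs = logits.softmax(dim=1)           # (B, C)
        onehot = F.one_hot(labels, logits.size(1)).float()
        confusion = onehot.t() @ probs          # (C, C) soft confusion counts
        confusion = confusion / confusion.sum().clamp(min=1e-8)
    return focal + lam * inter_class_angular_penalty(weight, confusion)

# Usage sketch: clf = CosineClassifier(512, 100)
#               loss = fical_like_loss(clf(feats), labels, clf.weight)
```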
Multi-view Rank Pooling for 3D Object Recognition
2019 IEEE Visual Communications and Image Processing (VCIP) Pub Date : 2019-12-01 DOI: 10.1109/VCIP47243.2019.8965979
Chaoda Zheng, Yong Xu, Ruotao Xu, Hongyu Chi, Yuhui Quan
{"title":"Multi-view Rank Pooling for 3D Object Recognition**","authors":"Chaoda Zheng, Yong Xu, Ruotao Xu, Hongyu Chi, Yuhui Quan","doi":"10.1109/VCIP47243.2019.8965979","DOIUrl":"https://doi.org/10.1109/VCIP47243.2019.8965979","url":null,"abstract":"3D shape recognition via deep learning is drawing more and more attention due to huge industry interests. As 3D deep learning methods emerged, the view-based approaches have gained considerable success in object classification. Most of these methods focus on designing a pooling scheme to aggregate CNN features of multi-view images into a single compact one. However, these view-wise pooling techniques suffer from loss of visual information. To deal with this issue, an adaptive rank pooling layer is introduced in this paper. Unlike max-pooling which only considers the maximum or mean-pooling that treats each element indiscriminately, the proposed pooling layer takes all the elements into account and dynamically adjusts their importances during the training. Experiments conducted on ModelNet40 and ModelNet10 shows both efficiency and accuracy gain when inserting such a layer into a baseline CNN architecture.","PeriodicalId":388109,"journal":{"name":"2019 IEEE Visual Communications and Image Processing (VCIP)","volume":"105 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115030850","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
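One possible reading of "adaptive rank pooling" is a pooling layer that sorts the per-view responses and combines them with learnable rank weights, so that max pooling and mean pooling become special cases. The PyTorch sketch below illustrates that interpretation; it is an assumption, not the layer defined in the paper.

```python
import torch
import torch.nn as nn

class AdaptiveRankPool(nn.Module):
    def __init__(self, num_views):
        super().__init__()
        # One logit per rank position; softmax keeps the weights normalised.
        self.rank_logits = nn.Parameter(torch.zeros(num_views))

    def forward(self, x):
        # x: (B, V, D) per-view CNN features for V rendered views.
        sorted_x, _ = torch.sort(x, dim=1, descending=True)   # rank responses per channel
        w = torch.softmax(self.rank_logits, dim=0)            # (V,) learnable rank weights
        return (sorted_x * w.view(1, -1, 1)).sum(dim=1)       # (B, D) pooled descriptor

# Usage: views -> shared CNN -> (B, V, D) -> AdaptiveRankPool -> (B, D) -> classifier
pool = AdaptiveRankPool(num_views=12)
feats = torch.randn(4, 12, 512)
print(pool(feats).shape)   # torch.Size([4, 512])
```

With all weight on rank 0 this reduces to max pooling; with uniform weights it reduces to mean pooling, which is why the weights can be learned end-to-end between those extremes.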
A Spatio-temporal Hybrid Network for Action Recognition
2019 IEEE Visual Communications and Image Processing (VCIP) Pub Date : 2019-12-01 DOI: 10.1109/VCIP47243.2019.8965878
Song Li, Zhicheng Zhao, Fei Su
{"title":"A Spatio-temporal Hybrid Network for Action Recognition","authors":"Song Li, Zhicheng Zhao, Fei Su","doi":"10.1109/VCIP47243.2019.8965878","DOIUrl":"https://doi.org/10.1109/VCIP47243.2019.8965878","url":null,"abstract":"Convolutional Neural Networks (CNNs) are powerful in learning spatial information for static images, while they appear to lose their abilities for action recognition in videos because of the neglecting of long-term motion information. Traditional 3D convolution has high computation complexity and the used Global Average Pooling (GAP) on the bottom of network can also lead to unwanted content loss or distortion. To address above problems, we propose a novel action recognition algorithm by effectively fusing 2D and Pseudo-3D CNN to learn spatio-temporal features of video. First, we use Pseudo-3D CNN with proposed Multi-level pooling module to learn spatio-temporal features. Second, the features output by multi-level pooling module are passed through our proposed processing module to make full use of the rich features. Third, a 2D CNN fed with motion vectors is designed to extract motion patterns, which can be regarded as a supplement of Pseudo-3D CNN to make up for the information lost by RGB images. Fourth, a dependency-based fusion method is proposed to fuse the multi-stream features. Finally, the effectiveness of our proposed action recognition algorithm is demonstrated on public UCF101 and HMDB51 datasets.","PeriodicalId":388109,"journal":{"name":"2019 IEEE Visual Communications and Image Processing (VCIP)","volume":"463 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122559414","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
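For context on the Pseudo-3D building block mentioned above: the idea is to factorize a costly 3x3x3 convolution into a 1x3x3 spatial convolution followed by a 3x1x1 temporal convolution. The sketch below (PyTorch) shows a minimal block of that kind; layer sizes and normalization are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn

class Pseudo3DBlock(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        # Spatial 1x3x3 followed by temporal 3x1x1 instead of a full 3x3x3 kernel.
        self.spatial = nn.Conv3d(in_ch, out_ch, kernel_size=(1, 3, 3), padding=(0, 1, 1))
        self.temporal = nn.Conv3d(out_ch, out_ch, kernel_size=(3, 1, 1), padding=(1, 0, 0))
        self.bn = nn.BatchNorm3d(out_ch)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # x: (B, C, T, H, W) video clip
        return self.relu(self.bn(self.temporal(self.spatial(x))))

block = Pseudo3DBlock(3, 64)
clip = torch.randn(2, 3, 16, 112, 112)   # 16-frame RGB clip
print(block(clip).shape)                  # torch.Size([2, 64, 16, 112, 112])
```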
Weakly Supervised Learning for Blind Image Quality Assessment
2019 IEEE Visual Communications and Image Processing (VCIP) Pub Date : 2019-12-01 DOI: 10.1109/VCIP47243.2019.8965868
Weiquan He, Xinbo Gao, Wen Lu, R. Guan
{"title":"Weakly Supervised Learning for Blind Image Quality Assessment","authors":"Weiquan He, Xinbo Gao, Wen Lu, R. Guan","doi":"10.1109/VCIP47243.2019.8965868","DOIUrl":"https://doi.org/10.1109/VCIP47243.2019.8965868","url":null,"abstract":"The blind image quality assessment (BIQA) metric based on deep neural network (DNN) achieves the best evaluation accuracy at present, and the depth of neural networks plays a crucial role for deep learning-based BIQA metric. However, training a DNN for quality assessment is known to be hard because of the lack of labeled data, and getting quality labels for a large number of images is very time consuming and costly. Therefore, training a deep BIQA metric directly will lead to over-fitting in all likelihood. In order to solve this problem, we introduced a weakly supervised approach for learning a deep BIQA metric. First, we pre-trained a novel encoder-decoder architecture by using the training data with weak quality annotations. The annotation is the error map between the distorted image and its undistorted version, which can roughly describes the distribution of distortion and can be easily acquired for training. Next, we fine-tuned the pre-trained encoder on the quality labeled data set. Moreover, we used the group convolution to reduce the parameters of the proposed metric and further reduce the risk of over-fitting. These training strategies, which reducing the risk of over-fitting, enable us to build a very deep neural network for BIQA to have a better performance. Experimental results showed that the proposed model had the state-of-art performance for various images with different distortion types.","PeriodicalId":388109,"journal":{"name":"2019 IEEE Visual Communications and Image Processing (VCIP)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122444355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
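The weak annotation described above costs nothing to produce, since it is derived directly from the distorted/reference pair. The sketch below (PyTorch) shows one way the two-stage scheme could look; the error-map definition, the tiny encoder-decoder, and the group-convolution placement are all placeholders rather than the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def error_map(distorted, reference):
    # Per-pixel absolute error averaged over colour channels -> (B, 1, H, W).
    return (distorted - reference).abs().mean(dim=1, keepdim=True)

class TinyEncoderDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        # groups=4 illustrates the group convolution used to cut parameters.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1, groups=4), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

# Stage 1: pre-train on (distorted, reference) pairs to predict the error map.
model = TinyEncoderDecoder()
distorted, reference = torch.rand(2, 3, 64, 64), torch.rand(2, 3, 64, 64)
loss = F.mse_loss(model(distorted), error_map(distorted, reference))
# Stage 2 (not shown): keep model.encoder, attach a quality-regression head,
# and fine-tune on the much smaller set of images with human quality scores.
```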
Visualization of Dynamic Resource Allocation for HEVC Encoding in FPGA-Accelerated SDN Cloud
2019 IEEE Visual Communications and Image Processing (VCIP) Pub Date : 2019-12-01 DOI: 10.1109/VCIP47243.2019.8966042
Panu Sjövall, Mikko Teuho, Arto Oinonen, Jarno Vanne, T. Hämäläinen
{"title":"Visualization of Dynamic Resource Allocation for HEVC Encoding in FPGA-Accelerated SDN Cloud","authors":"Panu Sjövall, Mikko Teuho, Arto Oinonen, Jarno Vanne, T. Hämäläinen","doi":"10.1109/VCIP47243.2019.8966042","DOIUrl":"https://doi.org/10.1109/VCIP47243.2019.8966042","url":null,"abstract":"This paper describes a demonstration setup to visualize dynamic resource allocation for real-time HEVC encoding services in FPGA-accelerated cloud. The demonstrated application is Kvazaar HEVC intra encoder, whose functionality is partitioned between FPGAs and processors. During the demonstration, several encoding services can be invoked with requests to the resource manager, which is responsible for allocation, deallocation, and load balancing of resources in the network. The manager provides JSON data to the visualizer, which uses D3 JavaScript library to visualize 1) the physical network structure; 2) running services; and 3) performance of the network elements. This interactive demonstration allows users to request new video streams, view the encoded streams, observe the visualization of the network and services, and manually turn on/off resources to test the robustness of the system.","PeriodicalId":388109,"journal":{"name":"2019 IEEE Visual Communications and Image Processing (VCIP)","volume":"92 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122534858","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
VCIP 2019 Organizing Committee
2019 IEEE Visual Communications and Image Processing (VCIP) Pub Date : 2019-12-01 DOI: 10.1109/vcip47243.2019.8966073
{"title":"VCIP 2019 Organizing Committee","authors":"","doi":"10.1109/vcip47243.2019.8966073","DOIUrl":"https://doi.org/10.1109/vcip47243.2019.8966073","url":null,"abstract":"","PeriodicalId":388109,"journal":{"name":"2019 IEEE Visual Communications and Image Processing (VCIP)","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114457500","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Corner-Line-Prediction based Water-tank Detection and Localization
2019 IEEE Visual Communications and Image Processing (VCIP) Pub Date : 2019-12-01 DOI: 10.1109/VCIP47243.2019.8965977
Hao Chen, Chongyang Zhang, Yan Luo, Bingkun Zhao, Jiahao Bao
{"title":"Comer-Line-Prediction based Water-tank Detection and Localization","authors":"Hao Chen, Chongyang Zhang, Yan Luo, Bingkun Zhao, Jiahao Bao","doi":"10.1109/VCIP47243.2019.8965977","DOIUrl":"https://doi.org/10.1109/VCIP47243.2019.8965977","url":null,"abstract":"Water tanks on the roof of buildings require regular labor-costing inspection, and object detection can be used to automate the task. Current detection frameworks have several drawbacks when they are applied: (1) The output horizontal rectangular boxes cannot provide arbitrary quadrilateral detection representations; (2) False positive results may easily appear when key-point based models are used. In this paper, we propose a novel detection framework: Corner-Line-Prediction, which generates tight quadrilateral detection results of the tank blocks. Our model is built on key point detection network to detect corner points precisely. And an original line predictor is integrated to recognize unique tank edges, such that numerous false positive detections can be suppressed. Experimental results show that our Corner-Line-Prediction (CLP) framework outperforms state- of-the-art detection algorithms in average-precision (AP) and produces better localization results, compared with mainstream general detection models.","PeriodicalId":388109,"journal":{"name":"2019 IEEE Visual Communications and Image Processing (VCIP)","volume":"44 19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122040550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
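To make the edge-verification idea concrete, the sketch below (NumPy) keeps a candidate quadrilateral only if all four of its edges are supported by a dense edge-probability map such as the one a line predictor could produce. This is a hypothetical illustration of the suppression step; the sampling scheme, threshold, and map format are assumptions.

```python
import numpy as np

def edge_score(edge_prob, p0, p1, num_samples=32):
    """Mean edge probability sampled along the segment p0 -> p1 (points are (x, y) in pixels)."""
    ts = np.linspace(0.0, 1.0, num_samples)
    xs = np.clip((p0[0] + ts * (p1[0] - p0[0])).round().astype(int), 0, edge_prob.shape[1] - 1)
    ys = np.clip((p0[1] + ts * (p1[1] - p0[1])).round().astype(int), 0, edge_prob.shape[0] - 1)
    return float(edge_prob[ys, xs].mean())

def keep_quadrilateral(edge_prob, corners, thresh=0.5):
    """corners: four (x, y) points in order; reject the box if any edge lacks support."""
    scores = [edge_score(edge_prob, corners[i], corners[(i + 1) % 4]) for i in range(4)]
    return all(s >= thresh for s in scores), scores

# Toy usage with a synthetic edge map containing a single strong horizontal edge.
edge_prob = np.zeros((100, 100))
edge_prob[20, 20:80] = 1.0
ok, scores = keep_quadrilateral(edge_prob, [(20, 20), (80, 20), (80, 80), (20, 80)])
print(ok, [round(s, 2) for s in scores])   # False: only the top edge is supported
```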
Part-guided Network for Pedestrian Attribute Recognition
2019 IEEE Visual Communications and Image Processing (VCIP) Pub Date : 2019-12-01 DOI: 10.1109/VCIP47243.2019.8965957
Ha-eun An, Haonan Fan, Kaiwen Deng, Hai-Miao Hu
{"title":"Part-guided Network for Pedestrian Attribute Recognition","authors":"Ha-eun An, Haonan Fan, Kaiwen Deng, Hai-Miao Hu","doi":"10.1109/VCIP47243.2019.8965957","DOIUrl":"https://doi.org/10.1109/VCIP47243.2019.8965957","url":null,"abstract":"Pedestrian attribute recognition, which can benefit other tasks such as person re-identification and pedestrian retrieval, is very important in video surveillance related tasks. In this paper, we observe that the existing methods tackle this problem from the perspective of multi-label classification without considering the spatial location constraints, which means that the attributes tend to be recognized at certain body parts. Based on that, we propose a novel Part-guided Network (P-Net), which guides the refined convolutional feature maps to capture different location information for the attributes related to different body parts. The part-guided attention module employs the pix-level classification to produce attention maps which can be interpreted as the probability of each pixel belonging to the 6 pre-defined body parts. Experimental results demonstrate that the proposed network gives superior performances compared to the state-of-the-art techniques.","PeriodicalId":388109,"journal":{"name":"2019 IEEE Visual Communications and Image Processing (VCIP)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114086754","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
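A minimal sketch of a part-guided attention module of this kind is given below (PyTorch): a 1x1 convolution with a per-pixel softmax yields six part-probability maps, and each body part contributes an attention-weighted feature descriptor that attribute heads can consume. The head design and the part-to-attribute assignment are assumptions, not the P-Net architecture.

```python
import torch
import torch.nn as nn

class PartGuidedAttention(nn.Module):
    def __init__(self, in_ch, num_parts=6):
        super().__init__()
        self.part_logits = nn.Conv2d(in_ch, num_parts, kernel_size=1)

    def forward(self, feat):
        # feat: (B, C, H, W) backbone features.
        attn = self.part_logits(feat).softmax(dim=1)           # (B, P, H, W), sums to 1 per pixel
        # Attention-weighted average pooling: one C-dim descriptor per body part.
        pooled = torch.einsum("bphw,bchw->bpc", attn, feat)
        pooled = pooled / attn.sum(dim=(2, 3)).clamp(min=1e-6).unsqueeze(-1)
        return pooled, attn                                     # (B, P, C), (B, P, H, W)

attn_module = PartGuidedAttention(in_ch=256)
feat = torch.randn(2, 256, 24, 8)
part_feats, attn = attn_module(feat)
print(part_feats.shape)   # torch.Size([2, 6, 256]); e.g. the "head" descriptor feeds hat/hair attributes
```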
Asymmetric Supervised Deep Autoencoder for Depth Image based 3D Model Retrieval
2019 IEEE Visual Communications and Image Processing (VCIP) Pub Date : 2019-12-01 DOI: 10.1109/VCIP47243.2019.8965682
A. Siddiqua, Guoliang Fan
{"title":"Asymmetric Supervised Deep Autoencoder for Depth Image based 3D Model Retrieval","authors":"A. Siddiqua, Guoliang Fan","doi":"10.1109/VCIP47243.2019.8965682","DOIUrl":"https://doi.org/10.1109/VCIP47243.2019.8965682","url":null,"abstract":"In this paper, we propose a new asymmetric supervised deep autoencoder approach to retrieve 3D shapes based on depth images. The asymmetric supervised autoencoder is trained with real and synthetic depth images together. The novelty of this research lies in the asymmetric structure of a supervised deep autoencoder. The proposed asymmetric deep supervised autoencoder deals with the incompleteness and ambiguity present in the depth images by balancing reconstruction and classification capabilities in a unified way with mixed depth images. We investigate the relationship between the encoder layers and decoder layers, and claim that an asymmetric structure of a supervised deep autoencoder reduces the chance of overfitting by 8% and is capable of extracting more robust features with respect to the variance of input than that of a symmetric structure. The experimental results on the NYUD2 and ModelNet10 datasets demonstrate that the proposed supervised method outperforms the recent approaches for cross modal 3D model retrieval.","PeriodicalId":388109,"journal":{"name":"2019 IEEE Visual Communications and Image Processing (VCIP)","volume":"661 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116487486","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
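The sketch below (PyTorch) illustrates the general shape of an asymmetric supervised autoencoder: a deeper encoder, a shallower decoder, and a classification head on the latent code so that reconstruction and classification are optimized jointly. Layer sizes, depths, and the loss weighting are placeholders, not the paper's settings.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AsymmetricSupervisedAE(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.encoder = nn.Sequential(            # deeper encoder (three conv stages)
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(            # shallower decoder (two stages)
            nn.ConvTranspose2d(128, 32, 4, stride=4), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 2, stride=2),
        )
        self.classifier = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                        nn.Linear(128, num_classes))

    def forward(self, depth):
        z = self.encoder(depth)                  # (B, 128, H/8, W/8) latent code
        return self.decoder(z), self.classifier(z)

model = AsymmetricSupervisedAE()
depth, labels = torch.rand(4, 1, 64, 64), torch.randint(0, 10, (4,))
recon, logits = model(depth)
loss = F.mse_loss(recon, depth) + F.cross_entropy(logits, labels)   # joint objective
```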
Comparative Convolutional Neural Network for Younger Face Identification
2019 IEEE Visual Communications and Image Processing (VCIP) Pub Date : 2019-12-01 DOI: 10.1109/VCIP47243.2019.8966026
Liangliang Wang, D. Rajan
{"title":"Comparative Convolutional Neural Network for Younger Face Identification","authors":"Liangliang Wang, D. Rajan","doi":"10.1109/VCIP47243.2019.8966026","DOIUrl":"https://doi.org/10.1109/VCIP47243.2019.8966026","url":null,"abstract":"We consider the problem of determining whether a pair of face images can be distinguishable in terms of age and if so, which is the younger of the two. We also determine the degree of distinguishability in which age differences are categorized into large, medium, small and tiny. We propose a comparative convolutional neural network combining two parallel deep architectures. Based on the two deep learnt face features, we introduce a comparative layer to represent their mutual relationships, followed by a concatenatation implementation. Softmax is adopted to complete the classification task. To demonstrate our approach, we construct a very large dataset consisting of over 1.7 million face image pairs with young/old labels.","PeriodicalId":388109,"journal":{"name":"2019 IEEE Visual Communications and Image Processing (VCIP)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121539094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
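A rough sketch of such a comparative two-stream network is shown below (PyTorch): both face images pass through a shared backbone, a comparative layer combines the two feature vectors (here via difference and elementwise product, which is an assumption), and a softmax head predicts the which-is-younger outcome together with the distinguishability degree. The toy backbone and the class layout are illustrative only.

```python
import torch
import torch.nn as nn

class ComparativeNet(nn.Module):
    def __init__(self, backbone, feat_dim, num_classes):
        super().__init__()
        self.backbone = backbone                           # shared weights for both streams
        self.head = nn.Sequential(
            nn.Linear(4 * feat_dim, 256), nn.ReLU(),
            nn.Linear(256, num_classes),                   # e.g. {A younger, B younger} x {large, ..., tiny}
        )

    def forward(self, img_a, img_b):
        fa, fb = self.backbone(img_a), self.backbone(img_b)
        comp = torch.cat([fa, fb, fa - fb, fa * fb], dim=1)   # comparative representation
        return self.head(comp)

# Toy backbone standing in for the deep CNN used for face features.
backbone = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 128))
net = ComparativeNet(backbone, feat_dim=128, num_classes=8)
logits = net(torch.rand(2, 3, 112, 112), torch.rand(2, 3, 112, 112))
print(logits.shape)   # torch.Size([2, 8])
```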