2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW): Latest Publications

A Comprehensive Solution for Deep-Learning Based Cargo Inspection to Discriminate Goods in Containers
Jiahang Che, Yuxiang Xing, Li Zhang
DOI: 10.1109/CVPRW.2018.00166
Abstract: In this work, we attempt to classify commodities in containers with HS (Harmonized System) codes, which is a challenging task due to the large number of categories in HS codes and their hierarchical structure based on a product's composition and economic activity. To tackle this problem, we propose an ensemble model that incorporates fine-grained image categorization, data analysis of cargo manifests, and a human-in-the-loop paradigm. Employing deep learning, we train a triplet network for fine-grained image categorization. Then, by investigating the extensive information in cargo manifests, unreasonable predictions can be filtered out. With human-in-the-loop embedded, human intelligence is integrated to validate the resulting HS codes. Moreover, an HS code semantic tree is built to trade off specificity and accuracy.
Citations: 3
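
As an illustrative aside: the fine-grained categorization step above relies on a triplet network. Below is a minimal PyTorch sketch of a standard margin-based triplet loss over L2-normalized embeddings; the authors' exact loss and embedding network are not specified in the abstract, so the margin and normalization here are assumptions.

    import torch.nn.functional as F

    def triplet_loss(anchor, positive, negative, margin=0.2):
        # L2-normalize so squared distances are comparable across batches.
        anchor = F.normalize(anchor, dim=1)
        positive = F.normalize(positive, dim=1)
        negative = F.normalize(negative, dim=1)
        d_pos = (anchor - positive).pow(2).sum(dim=1)  # same-category distance
        d_neg = (anchor - negative).pow(2).sum(dim=1)  # cross-category distance
        # Hinge: push negatives at least `margin` farther away than positives.
        return F.relu(d_pos - d_neg + margin).mean()
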
Analysis of Efficient CNN Design Techniques for Semantic Segmentation
Alexandre Briot, P. Viswanath, S. Yogamani
DOI: 10.1109/CVPRW.2018.00109
Abstract: The majority of CNN architecture design is aimed at achieving high accuracy on public benchmarks by increasing complexity. Typically, such networks are over-specified by a large margin and can be optimized by a factor of 10-100x with only a small reduction in accuracy. In spite of the increase in the computational power of embedded systems, these networks are still not suitable for embedded deployment. There is a great need to optimize for hardware and reduce network size by orders of magnitude for computer vision applications. This has led to a growing community focused on designing efficient networks. However, CNN architectures are evolving rapidly, and efficient architectures seem to lag behind. There is also a gap in understanding hardware architecture details and incorporating them into network design. The motivation of this paper is to systematically summarize efficient design techniques and provide guidelines for application developers. We also perform a case study by benchmarking various semantic segmentation algorithms for autonomous driving.
Citations: 27
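
For context, one of the most common techniques in this design space is the depthwise-separable convolution, which factorizes a dense KxK convolution into a per-channel spatial filter plus a 1x1 channel mixer, cutting multiply-accumulates roughly by a factor of K^2. A minimal PyTorch sketch of the technique class (not a block taken from the paper):

    import torch.nn as nn

    class DepthwiseSeparableConv(nn.Module):
        def __init__(self, in_ch, out_ch, stride=1):
            super().__init__()
            # Depthwise: one 3x3 filter per input channel (groups=in_ch).
            self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride=stride,
                                       padding=1, groups=in_ch, bias=False)
            # Pointwise: 1x1 convolution mixes information across channels.
            self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
            self.bn = nn.BatchNorm2d(out_ch)
            self.act = nn.ReLU(inplace=True)

        def forward(self, x):
            return self.act(self.bn(self.pointwise(self.depthwise(x))))
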
Realtime Quality Assessment of Iris Biometrics Under Visible Light
Mohsen Jenadeleh, Marius Pedersen, D. Saupe
DOI: 10.1109/CVPRW.2018.00085
Abstract: Ensuring sufficient quality of iris images acquired by handheld imaging devices in visible light poses many challenges to iris recognition systems. Many distortions affect the input iris images, and the sources and types of these distortions are unknown in uncontrolled environments. We propose a fast no-reference image quality assessment measure for predicting iris image quality in order to handle severely degraded iris images. The proposed differential sign-magnitude statistics index (DSMI) is based on statistical features of the local difference sign-magnitude transform, which are computed by comparing the local mean with the central pixel of the patch and considering only noticeable variations. The experiments, conducted with a reference iris recognition system and three visible-light datasets, showed that the quality of iris images strongly affects recognition performance. Using the proposed method as a quality filtering step improved the performance of the iris recognition system by rejecting poor-quality iris samples.
Citations: 6
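
A rough sketch of the kind of local statistic DSMI builds on: the sign and magnitude of the difference between each pixel and its local patch mean, keeping only "noticeable" variations. The patch size and threshold below are illustrative assumptions; the paper's exact transform and feature pooling are not given in the abstract.

    import numpy as np
    from scipy.ndimage import uniform_filter

    def local_difference_sign_magnitude(img, patch=7, threshold=2.0):
        img = img.astype(np.float64)
        local_mean = uniform_filter(img, size=patch)
        diff = img - local_mean
        noticeable = np.abs(diff) > threshold  # suppress imperceptible variation
        sign = np.sign(diff) * noticeable      # -1, 0, or +1 per pixel
        magnitude = np.abs(diff) * noticeable
        return sign, magnitude
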
Learning Hierarchical Models for Class-Specific Reconstruction from Natural Data
Arun C. S. Kumar, S. Bhandarkar, Mukta Prasad
DOI: 10.1109/CVPRW.2018.00153
Abstract: We propose a novel method for class-specific, single-view object detection, pose estimation, and deformable 3D reconstruction, in which a two-pronged (sparse semantic and dense shape) representation is learned automatically from natural image data. Given a new image, the method estimates camera pose and a deformable reconstruction using an effective incremental optimization. It extracts a continuous, scaled-orthographic pose (without resorting to regression and/or discretized 1D azimuth-based representations) and reconstructs a full free-form shape (rather than retrieving the closest 3D CAD shape proxy, as is typical in the state of the art). We learn the two-pronged model purely from natural image data, as automatically and faithfully as possible, reducing the human effort and bias typical of this problem. The pipeline combines data-driven, deep-learning-based semantic part learning with principled modelling and effective optimization of the problem's physics, shape deformation, pose, and occlusion. The underlying sparse (part-based) representation of the object is computationally efficient for detection and discriminative tasks, whereas the overlaid dense (skin-like) representation models and realistically renders comprehensive 3D structure, including natural deformation and occlusion. The results for the car class are visually pleasing and, importantly, outperform the state of the art quantitatively as well. Our contribution to visual scene understanding through the two-pronged object representation shows promise for more accurate 3D scene understanding in real-world applications such as virtual/mixed reality and autonomous navigation, to name a few.
Citations: 0
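
Background for the camera model named above: under scaled-orthographic projection, a 3D point is rotated, its depth coordinate is dropped, and the result is scaled and shifted in the image plane. A minimal NumPy sketch of that standard model:

    import numpy as np

    def scaled_orthographic_projection(X, R, s, t):
        # X: (N, 3) 3D points; R: (3, 3) rotation; s: scalar scale; t: (2,) shift.
        # Rotate, keep the first two coordinates (drop depth), scale, translate.
        return s * (X @ R.T)[:, :2] + t
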
SAM: Pushing the Limits of Saliency Prediction Models
M. Cornia, L. Baraldi, G. Serra, R. Cucchiara
DOI: 10.1109/CVPRW.2018.00250
Abstract: The prediction of human eye fixations has recently been gaining a lot of attention thanks to the improvements shown by deep architectures. In our work, we go beyond classical feed-forward networks for predicting saliency maps and propose a Saliency Attentive Model that incorporates neural attention mechanisms to iteratively refine predictions. Experiments demonstrate that the proposed strategy surpasses the state of the art by a considerable margin on the largest dataset available for saliency prediction. Here, we provide experimental results on other popular saliency datasets to confirm the effectiveness and generalization capabilities of our model, which enable us to reach the state of the art on all considered datasets.
Citations: 18
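
To make the iterative-refinement pattern concrete, here is a deliberately simplified PyTorch sketch in which a saliency map is refined over several steps through a learned attention gate; the paper's actual attention mechanism differs, so treat this as a stand-in for the general idea only.

    import torch
    import torch.nn as nn

    class IterativeRefiner(nn.Module):
        def __init__(self, feat_ch, steps=3):
            super().__init__()
            self.steps = steps
            self.attn = nn.Conv2d(feat_ch + 1, 1, kernel_size=1)
            self.refine = nn.Conv2d(feat_ch + 1, 1, kernel_size=3, padding=1)

        def forward(self, feats):
            b, _, h, w = feats.shape
            sal = torch.zeros(b, 1, h, w, device=feats.device)
            for _ in range(self.steps):
                # Attention gate decides which features matter at this step.
                gate = torch.sigmoid(self.attn(torch.cat([feats, sal], dim=1)))
                # Refine the current saliency map from the gated features.
                sal = torch.sigmoid(self.refine(torch.cat([feats * gate, sal], dim=1)))
            return sal
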
A Comparative Study of Real-Time Semantic Segmentation for Autonomous Driving
Mennatullah Siam, M. Gamal, Moemen Abdel-Razek, S. Yogamani, Martin Jägersand, Hong Zhang
DOI: 10.1109/CVPRW.2018.00101
Abstract: Semantic segmentation is a critical module in robotics-related applications, especially autonomous driving. Most research on semantic segmentation focuses on improving accuracy, with less attention paid to computationally efficient solutions. The majority of efficient semantic segmentation algorithms have customized optimizations without scalability, and there is no systematic way to compare them. In this paper, we present a real-time segmentation benchmarking framework and study various segmentation algorithms for autonomous driving. We implemented a generic meta-architecture via a decoupled design in which different types of encoders and decoders can be plugged in independently. We provide several example encoders, including VGG16, ResNet18, MobileNet, and ShuffleNet, and decoders, including SkipNet, UNet, and Dilation Frontend. The framework is scalable to new encoders and decoders developed in the community for other vision tasks. We performed detailed experimental analysis on the Cityscapes dataset for various combinations of encoder and decoder. The modular framework enabled rapid prototyping of a custom efficient architecture that provides a ~143x reduction in GFLOPs compared to SegNet and runs in real time at ~15 fps on an NVIDIA Jetson TX2. The source code of the framework is publicly available.
Citations: 117
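
The decoupled design is easy to picture in code: any encoder that returns multi-scale features can be paired with any decoder head. A minimal sketch of the idea (class and method names are illustrative, not the framework's actual API):

    import torch.nn as nn

    class MetaSegmenter(nn.Module):
        """Plug-and-play encoder/decoder pairing, e.g. MobileNet + SkipNet."""

        def __init__(self, encoder: nn.Module, decoder: nn.Module):
            super().__init__()
            self.encoder = encoder  # e.g. VGG16 / ResNet18 / MobileNet / ShuffleNet
            self.decoder = decoder  # e.g. SkipNet / UNet / Dilation Frontend

        def forward(self, x):
            feats = self.encoder(x)     # multi-scale feature maps, coarse to fine
            return self.decoder(feats)  # per-pixel class scores
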
Building Detection from Satellite Imagery Using a Composite Loss Function
Sergey Golovanov, R. Kurbanov, A. Artamonov, A. Davydow, S. Nikolenko
DOI: 10.1109/CVPRW.2018.00040
Abstract: In this paper, we present a LinkNet-based architecture with an SE-ResNeXt-50 encoder and a novel training strategy that relies strongly on image preprocessing and on incorporating distorted network outputs. The architecture combines a pre-trained convolutional encoder with a symmetric expanding path that enables precise localization. We show that such a network can be trained on plain RGB images with a composite loss function and achieves competitive results on the DeepGlobe challenge on building extraction from satellite images.
Citations: 19
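
The abstract does not spell out the loss terms. A common composite objective in building-footprint segmentation sums binary cross-entropy with a soft Dice term; the sketch below illustrates that pattern under stated assumptions rather than reproducing the authors' exact loss.

    import torch
    import torch.nn.functional as F

    def composite_loss(logits, target, alpha=0.5, eps=1e-6):
        # Pixel-wise binary cross-entropy on raw logits.
        bce = F.binary_cross_entropy_with_logits(logits, target)
        # Soft Dice rewards overlap between predicted mask and ground truth.
        prob = torch.sigmoid(logits)
        inter = (prob * target).sum()
        dice = 1.0 - (2.0 * inter + eps) / (prob.sum() + target.sum() + eps)
        return alpha * bce + (1.0 - alpha) * dice
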
Video Based Measurement of Heart Rate and Heart Rate Variability Spectrogram from Estimated Hemoglobin Information
Munenori Fukunishi, Kouki Kurita, Shoji Yamamoto, N. Tsumura
DOI: 10.1109/CVPRW.2018.00180
Abstract: In this paper, we propose accurate remote observation of heart rate (HR) and heart rate variability (HRV) based on hemoglobin information extracted using a detailed skin optics model. We perform experiments measuring subjects at rest and under cognitive stress, placing a polarizing filter in front of the camera to evaluate the principle of the framework. The results show that the proposed method correlates strongly with electrocardiograph (ECG) readings, which are taken as ground truth. We also evaluated robustness against illumination change in simulation. We confirmed that the proposed method obtains more accurate BVP (blood volume pulse) detection than other conventional methods, since it eliminates the shading component in the process of extracting the hemoglobin component.
Citations: 7
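
The final step of such pipelines, recovering HR from an extracted pulse signal, is standard enough to sketch: locate the dominant spectral peak within the physiological band. The hemoglobin extraction itself, the core of the paper, is not shown here.

    import numpy as np

    def heart_rate_bpm(pulse, fps, lo=0.7, hi=4.0):
        pulse = pulse - pulse.mean()                    # remove DC component
        freqs = np.fft.rfftfreq(len(pulse), d=1.0 / fps)
        power = np.abs(np.fft.rfft(pulse)) ** 2
        band = (freqs >= lo) & (freqs <= hi)            # ~42-240 beats per minute
        peak = freqs[band][np.argmax(power[band])]
        return peak * 60.0
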
Deep Super Resolution for Recovering Physiological Information from Videos
Daniel J. McDuff
DOI: 10.1109/CVPRW.2018.00185
Abstract: Imaging photoplethysmography (iPPG) allows remote measurement of vital signs from the human skin. In some applications the skin region of interest may occupy only a small number of pixels (e.g., if an individual is a large distance from the imager). We present a novel pipeline for iPPG using an image super-resolution preprocessing step that can reduce the mean absolute error in heart rate prediction by over 30%. Furthermore, deep-learning-based image super-resolution outperforms standard interpolation methods. Our method can be used in conjunction with any existing iPPG algorithm to estimate physiological parameters. It is particularly promising for the analysis of low-resolution and spatially compressed videos, where the pulse signal would otherwise be too weak.
Citations: 38
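
The pipeline shape described above is simple to state: super-resolve each frame of the skin region, then hand the upscaled frames to any existing iPPG algorithm. In this sketch, sr_model and ippg_algorithm are hypothetical placeholders for a trained super-resolution network and a pulse-extraction routine.

    def ippg_with_super_resolution(frames, sr_model, ippg_algorithm):
        # Super-resolve each low-resolution skin ROI before pulse extraction.
        upscaled = [sr_model(frame) for frame in frames]
        return ippg_algorithm(upscaled)  # e.g. returns a pulse signal / HR estimate
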
Persistent Memory Residual Network for Single Image Super Resolution
Rongzhen Chen, Yanyun Qu, Kun Zeng, Jinkang Guo, Cuihua Li, Yuan Xie
DOI: 10.1109/CVPRW.2018.00125
Abstract: Progress has been witnessed in single-image super-resolution where the low-resolution images are simulated by bicubic downsampling. However, for complex image degradations in the wild, such as downsampling, blurring, noise, and geometric deformation, existing super-resolution methods do not work well. Inspired by a persistent memory network that has proven effective in image restoration, we implement the core idea of human memory in a deep residual convolutional neural network. Two types of memory blocks are designed for the NTIRE2018 challenge. We embed them in the framework of the enhanced deep super-resolution network (EDSR), the NTIRE2017 champion method, replacing its residual blocks. The first type of memory block is a residual module: one memory block contains four residual modules with four residual blocks, followed by a gate unit that adaptively selects the features to store. The second type is a residual dilated convolutional block, which contains seven dilated convolution layers linked to a gate unit. The two proposed models not only improve super-resolution performance but also mitigate image degradation from noise and blurring. Experimental results on the DIV2K dataset demonstrate that our models achieve better performance than EDSR.
Citations: 15
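
A hedged PyTorch sketch of the gated-memory idea: several residual blocks whose intermediate outputs are concatenated and fused by a 1x1 "gate" convolution that learns which features to keep. The exact block counts and gate design in the paper may differ.

    import torch
    import torch.nn as nn

    class ResidualBlock(nn.Module):
        def __init__(self, ch):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(ch, ch, 3, padding=1))

        def forward(self, x):
            return x + self.body(x)

    class MemoryBlock(nn.Module):
        def __init__(self, ch, n_units=4):
            super().__init__()
            self.units = nn.ModuleList([ResidualBlock(ch) for _ in range(n_units)])
            # Gate: 1x1 conv adaptively weights which unit outputs to store.
            self.gate = nn.Conv2d(ch * n_units, ch, kernel_size=1)

        def forward(self, x):
            outs = []
            for unit in self.units:
                x = unit(x)
                outs.append(x)
            return self.gate(torch.cat(outs, dim=1))
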