2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW): Latest Publications

A Comparison of Deep Learning Methods for Semantic Segmentation of Coral Reef Survey Images
A. King, S. Bhandarkar, B. Hopkinson
DOI: 10.1109/CVPRW.2018.00188
Abstract: Two major deep learning methods for semantic segmentation, i.e., patch-based convolutional neural network (CNN) approaches and fully convolutional neural network (FCNN) models, are studied in the context of classifying regions in underwater images of coral reef ecosystems into biologically meaningful categories. For the patch-based CNN approaches, we use image data extracted from underwater video accompanied by individual point-wise ground-truth annotations. We show that patch-based CNN methods can outperform a previously proposed approach that uses support vector machine (SVM)-based classifiers in conjunction with texture-based features. We compare the results of five different CNN architectures in our formulation of patch-based CNN methods. The Resnet152 CNN architecture is observed to perform the best on our annotated dataset of underwater coral reef images. We also examine and compare the results of four different FCNN models for semantic segmentation of coral reef images. We develop a tool for fast generation of segmentation maps to serve as ground-truth segmentations for our FCNN models. The FCNN architecture Deeplab v2 is observed to yield the best results for semantic segmentation of underwater coral reef images.
Citations: 46
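The patch-based CNN pipeline above classifies a small window of pixels around each point-wise annotation. A minimal sketch of that patch-extraction step, assuming zero-padding at image borders (the function name, patch size, and padding policy are illustrative assumptions, not the authors' implementation):

```python
# Illustrative sketch only: extract a square patch centred on a point
# annotation, as a patch-based CNN classifier would consume it.
def extract_patch(image, row, col, size=4):
    """Return a (2*size+1) x (2*size+1) patch centred on (row, col).

    `image` is a list of rows (grayscale values); out-of-bounds pixels
    are zero-padded so every annotated point yields a full-sized patch.
    """
    h, w = len(image), len(image[0])
    patch = []
    for r in range(row - size, row + size + 1):
        prow = []
        for c in range(col - size, col + size + 1):
            prow.append(image[r][c] if 0 <= r < h and 0 <= c < w else 0)
        patch.append(prow)
    return patch
```

Each such patch, labelled with its annotation's category, becomes one training example for the CNN.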
Highway Network Block with Gates Constraints for Training Very Deep Networks
O. Oyedotun, Abd El Rahman Shabayek, Djamila Aouada, B. Ottersten
DOI: 10.1109/CVPRW.2018.00217
Abstract: In this paper, we propose to reformulate the learning of the highway network block to realize both early optimization and improved generalization of very deep networks while preserving the network depth. Gate constraints are employed to improve optimization, latent representations and parameterization usage in order to efficiently learn the hierarchical feature transformations that are crucial for the success of any deep network. One of the earliest very deep models with over 30 layers that was successfully trained relied on highway network blocks. Although highway blocks suffice for alleviating the optimization problem via improved information flow, we show for the first time that, further into training, such highway blocks may result in learning mostly untransformed features and therefore a reduction in the effective depth of the model; this could negatively impact model generalization performance. Using the proposed approach, 15-layer and 20-layer models are successfully trained with one gate, and a 32-layer model with three gates. This leads to a drastic reduction in model parameters compared to the original highway network. Extensive experiments on the CIFAR-10, CIFAR-100, Fashion-MNIST and USPS datasets validate the effectiveness of the proposed approach. In particular, we outperform the original highway network and many state-of-the-art results. To the best of our knowledge, on the Fashion-MNIST and USPS datasets, the achieved results are the best reported in the literature.
Citations: 11
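The highway block referenced above mixes a transformed signal H(x) with the untransformed input x through a learned transform gate T(x): y = H(x)·T(x) + x·(1 - T(x)). When the gate saturates toward 0, the block carries features through untransformed, which is the effective-depth reduction the paper warns about. A scalar sketch under these assumptions (not the authors' code):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def highway_unit(x, h, gate_logit):
    """One highway unit on a single scalar feature.

    `h` plays the role of the transformed value H(x); the sigmoid of
    `gate_logit` is the transform gate T(x) in (0, 1).
    """
    t = sigmoid(gate_logit)
    return h * t + x * (1.0 - t)   # y = H(x)*T + x*(1 - T)

# A strongly negative gate logit closes the gate, so the unit passes x
# through untransformed: the "reduced effective depth" regime above.
```

In a real network, x and h are tensors and the gate logit is itself a learned function of x; the gate constraints proposed in the paper act on that learned gate.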
Fully End-to-End Learning Based Conditional Boundary Equilibrium GAN with Receptive Field Sizes Enlarged for Single Ultra-High Resolution Image Dehazing
Sehwan Ki, Hyeonjun Sim, Jae-Seok Choi, Saehun Kim, Munchurl Kim
DOI: 10.1109/CVPRW.2018.00126
Abstract: A receptive field is defined as the region in the input image space that an output image pixel is looking at. Thus, the receptive field size influences the learning of deep convolutional neural networks. In single-image dehazing problems especially, larger receptive fields often yield more effective dehazing by considering the brightness and color of the entire input hazy image without additional information (e.g. scene transmission map, depth map, and atmospheric light). A conventional generative adversarial network (GAN) with small receptive fields cannot be effective for hazy images of ultra-high resolution. We therefore propose a fully end-to-end learning based conditional boundary equilibrium generative adversarial network (BEGAN) with enlarged receptive field sizes for single-image dehazing. In our conditional BEGAN, the discriminator is trained at ultra-high resolution, conditioned on downscaled input hazy images, so that the haze can be effectively removed while the original structures of the images are stably preserved. From this, we obtain high PSNR performance (Track 1 - Indoor: ranked 4th) and fast computation speeds. We also combine an L1 loss, a perceptual loss and a GAN loss as the generator's loss of the proposed conditional BEGAN, which allows us to obtain stable dehazing results for various hazy images.
Citations: 21
Learning Instance Segmentation by Interaction
Deepak Pathak, Yide Shentu, Dian Chen, Pulkit Agrawal, Trevor Darrell, S. Levine, Jitendra Malik
DOI: 10.1109/CVPRW.2018.00276
Abstract: Objects are a fundamental component of visual perception. How humans are able to effortlessly reorganize their visual observations into a discrete set of objects is a question that has puzzled researchers for centuries. The Gestalt school of thought put forth the proposition that humans use similarity in color, texture and motion to group pixels into individual objects [21]. Various methods for object segmentation based on color and texture cues have been proposed [3, 6, 7, 14, 16]. These approaches are, however, known to over-segment multi-colored and textured objects.
Citations: 42
A Holistic Framework for Addressing the World Using Machine Learning
Ilke Demir
DOI: 10.1109/CVPRW.2018.00245
Abstract: Millions of people are disconnected from basic services due to the lack of adequate addressing. We propose an automatic generative algorithm to create street addresses from satellite imagery. Our addressing scheme is coherent with the street topology, linear and hierarchical to follow human perception, and universal enough to be used as a unified geocoding system. Our algorithm starts by extracting road segments using deep learning and partitioning the road network into regions. Regions, streets, and address cells are then named using proximity computations. We also extend our addressing scheme to cover inaccessible areas, to be flexible to changes, and to serve as a pioneer for a unified geodatabase.
Citations: 0
Path Orthogonal Matching Pursuit for Sparse Reconstruction and Denoising of SWIR Maritime Imagery
T. Doster, T. Emerson, C. Olson
DOI: 10.1109/CVPRW.2018.00161
Abstract: We introduce an extension that may be used to augment algorithms for the sparse decomposition of signals into a linear combination of atoms drawn from a dictionary, such as those used in support of, for example, compressive sensing, k-sparse representation, and denoising. Our augmentation may be applied to any reconstruction algorithm that relies on the selection and sorting of high-correlation atoms during an analysis or identification phase, by generating a "path" between the two highest-correlation atoms. Here we investigate two types of path: a linear combination (Euclidean geodesic) and a construction relying on an optimal transport map (2-Wasserstein geodesic). We test our extension by performing image denoising and k-sparse representation using atoms from a learned overcomplete kSVD dictionary. We study the application of our techniques to SWIR imagery of maritime vessels and show that our methods outperform orthogonal matching pursuit. We conclude that these methods, having shown success in our two tested problem domains, will also be useful for reducing "basis mismatch" error that arises in the recovery of compressively sampled images.
Citations: 4
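Every matching-pursuit variant mentioned above shares a greedy atom-selection step: rank the dictionary atoms by the magnitude of their correlation with the current residual. The paper's extension then builds a path between the two top-ranked atoms; the sketch below (an illustration, not the paper's implementation) just identifies that pair, assuming unit-norm atoms stored as plain lists:

```python
def top_two_atoms(dictionary, residual):
    """Return indices of the two atoms with the highest |<atom, residual>|.

    `dictionary` is a list of unit-norm atoms (lists of floats) and
    `residual` is the current reconstruction residual. In path OMP the
    "path" (Euclidean or 2-Wasserstein geodesic) is formed between the
    two atoms returned here.
    """
    def corr(i):
        # magnitude of the inner product between atom i and the residual
        return abs(sum(a * r for a, r in zip(dictionary[i], residual)))

    ranked = sorted(range(len(dictionary)), key=corr, reverse=True)
    return ranked[0], ranked[1]
```

In full OMP, the selected support is re-fit by least squares before the residual is updated; that step is omitted here for brevity.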
Road Detection with EOSResUNet and Post Vectorizing Algorithm
Oleksandr Filin, Anton Zapara, Serhii Panchenko
DOI: 10.1109/CVPRW.2018.00036
Abstract: Object recognition on satellite images is one of the most relevant and popular topics in pattern recognition. This has been facilitated by many factors, such as the high number of satellites with high-resolution imagery, the significant development of computer vision (especially the major breakthrough in convolutional neural networks), a wide range of industry verticals for usage, and a still quite empty market. Roads are among the most popular objects for recognition. In this article, we present a combination of a neural network and a post-processing algorithm, through which we obtain not only the coverage mask but also the vectors of all individual roads present in the image, which can be used to address higher-level tasks in the future. This approach was used to solve the DeepGlobe Road Extraction Challenge.
Citations: 18
[Title page iii]
DOI: 10.1109/cvprw.2018.00002
Citations: 0
Multi-frame Super Resolution for Ocular Biometrics
N. Reddy, Dewan Fahim Noor, Zhu Li, R. Derakhshani
DOI: 10.1109/CVPRW.2018.00086
Abstract: Some biometric methods, especially ocular ones, may use fine spatial information akin to level-3 features. Examples include fine vascular patterns visible in the whites of the eyes in the green and blue channels, iridial patterns in near infrared, or minute periocular features in visible light. In some mobile applications, an NIR or RGB camera is used to capture these ocular images in a "selfie"-like manner. However, most such ocular images captured under unconstrained environments are of lower quality due to limited spatial resolution, noise, and motion blur, affecting the performance of the ensuing biometric authentication. Here we propose a multi-frame super resolution (MFSR) pipeline to mitigate the problem, where a higher-resolution image is generated from multiple lower-resolution, noisy and blurry images. We show that the proposed MFSR method at 2× upscaling can improve the equal error rate (EER) by 9.85% compared to single-frame bicubic upscaling in RGB ocular matching, while being up to 8.5× faster than a comparable state-of-the-art MFSR method.
Citations: 7
Scene Grammar in Human and Machine Recognition of Objects and Scenes
Akram Bayat, D. Koh, Anubhaw Kumar Nand, Marta Pereira, M. Pomplun
DOI: 10.1109/CVPRW.2018.00268
Abstract: In this paper, we study the effects of violating high-level scene syntactic and semantic rules on human eye-movement behavior and on deep neural scene and object recognition networks. An eye-movement study was conducted with twenty human subjects who viewed scenes from the SCEGRAM image database and determined whether an inconsistent object was present. We examine the contribution of multiple types of features that influence eye movements while searching for an inconsistent object in a scene (e.g., the size and location of an object) by evaluating the consistency-prediction power of classifiers trained on fixation features. The results of the eye-movement analysis and inconsistency prediction reveal that: 1) inconsistent objects are fixated significantly more than consistent objects in a scene; 2) the distribution of fixations is the main factor influenced by the inconsistency condition of a scene, which is reflected in the ground-truth fixation maps. We also observe that the performance of deep object and scene recognition networks drops due to violations of scene grammar. Class-specific visual saliency maps are created from the high-level representations of the convolutional layers of a deep network during scene and object recognition. We discuss whether the scene inconsistencies are represented in those saliency maps by evaluating their prediction power using multiple well-known metrics, including AUC, SIM, and KL. The results suggest that an inconsistent object in a scene causes significant variations in the prediction power of saliency maps.
Citations: 10