Space-Time-Brightness Sampling Using an Adaptive Pixel-Wise Coded Exposure
H. Nagahara, Toshiki Sonoda, Dengyu Liu, Jinwei Gu
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), June 2018. DOI: 10.1109/CVPRW.2018.00237
Abstract: Most conventional digital video cameras face a fundamental trade-off between spatial resolution, temporal resolution and dynamic range (i.e., brightness resolution) because of a limited bandwidth for data transmission. A few recent studies have shown that with non-uniform space-time sampling, such as that implemented with pixel-wise coded exposure, one can go beyond this trade-off and achieve high efficiency for scene capture. However, in these studies, the sampling schemes were pre-defined and independent of the target scene content. In this paper, we propose an adaptive space-time-brightness sampling method to further improve the efficiency of video capture. The proposed method adaptively updates a pixel-wise coded exposure pattern using information analyzed from previously captured frames. We built a prototype camera that enables adaptive coding of patterns online to show the feasibility of the proposed adaptive coded exposure method. Simulation and experimental results show that the adaptive space-time-brightness sampling scheme achieves more accurate video reconstruction and higher dynamic range with less computational cost than the previous method. To the best of our knowledge, our prototype is the first implementation of an adaptive pixel-wise coded exposure camera.
Learning Fashion By Simulated Human Supervision
Eli Alshan, Sharon Alpert, A. Neuberger, N. Bubis, Eduard Oks
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), June 2018. DOI: 10.1109/CVPRW.2018.00310
Abstract: We consider the task of predicting subjective fashion traits from images using neural networks. Specifically, we are interested in training a network for ranking outfits according to how well they fit the user. In order to capture the variability induced by human subjective considerations, each training example is annotated by a panel of fashion experts. Similarly to previous work on subjective data, the panel votes are converted to a classification or regression problem, and the corresponding network is trained and evaluated using standard objective metrics. The question is: which objective metric, if any, is most suitable to measure the performance of a network trained for subjective tasks? In this paper, we conduct human approval tests for outfit ranking networks trained using various objective metrics. We show that these metrics do not adequately estimate the human approval of subjective tasks. Instead, we introduce a supervising network that, unlike objective metrics, is designed to capture the variability induced by human subjectivity. We use it to supervise our outfit ranking network and demonstrate empirically that training the outfit ranking network with the suggested supervising network achieves greater approval ratings from human subjects.
{"title":"Vehicle Re-identification with the Space-Time Prior","authors":"Chih-Wei Wu, Chih-Ting Liu, Cheng-En Chiang, Wei-Chih Tu, Shao-Yi Chien","doi":"10.1109/CVPRW.2018.00024","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00024","url":null,"abstract":"Vehicle re-identification (Re-ID) is fundamentally challenging due to the difficulties in data labeling, visual domain mismatch between datasets and diverse appearance of the same vehicle. We propose the adaptive feature learning technique based on the space-time prior to address these issues. The idea is demonstrated effectively in both the human Re-ID and the vehicle Re-ID tasks. We train a vehicle feature extractor in a multi-task learning manner on three existing vehicle datasets and fine-tune the feature extractor with the adaptive feature learning technique on the target domain. We then develop a vehicle Re-ID system based on the learned vehicle feature extractor. Finally, our meticulous system design leads to the second place in the 2018 NVIDIA AI City Challenge Track 3.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"136 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116899854","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Geodesic Discriminant Analysis for Manifold-Valued Data","authors":"M. Louis, B. Charlier, S. Durrleman","doi":"10.1109/CVPRW.2018.00073","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00073","url":null,"abstract":"In many statistical settings, it is assumed that high-dimensional data actually lies on a low-dimensional manifold. In this perspective, there is a need to generalize statistical methods to nonlinear spaces. To that end, we propose generalizations of the Linear Discriminant Analysis (LDA) to manifolds. First, we generalize the reduced rank LDA solution by constructing a geodesic subspace which optimizes a criterion equivalent to Fisher's discriminant in the linear case. Second, we generalize the LDA formulated as a restricted Gaussian classifier. The generalizations of those two methods, which are equivalent in the linear case, are in general different in the manifold case. We illustrate the first generalization on the sphere S^2. Then, we propose applications using the Large Deformation Diffeomorphic Metric Mapping (LDDMM) framework, in which we rephrase the second generalization. We perform dimension reduction and classification on the kimia-216 dataset and on a set of 3D brain structures segmented from Alzheimer's disease and control subjects, recovering state-of-the-art performances.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"100 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117258580","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automated Analysis of Marine Video with Limited Data
Deborah Levy, Yuval Belfer, Elad Osherov, Eyal Bigal, A. Scheinin, Hagai Nativ, D. Tchernov, T. Treibitz
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), June 2018. DOI: 10.1109/CVPRW.2018.00187
Abstract: Monitoring the marine environment requires large amounts of data, simply because of its vast size. Therefore, underwater autonomous vehicles and drones are increasingly deployed to acquire numerous photographs. However, ecological conclusions from these data are lagging because they require expert annotation and thus realistically cannot be processed manually. This calls for developing automatic classification algorithms dedicated to this type of data. Current out-of-the-box solutions struggle to provide optimal results in these scenarios because marine data is very different from everyday data. Images taken underwater display low contrast and reduced visibility range, making objects harder to localize and classify. Scale varies dramatically because of the complex three-dimensional structure of the scenes. In addition, the scarcity of labeled marine data prevents training these dedicated networks from scratch. In this work, we demonstrate how transfer learning can be utilized to achieve high-quality results for both detection and classification in the marine environment. We also demonstrate tracking in videos, which enables counting and measuring the organisms. We demonstrate the suggested method on two very different marine datasets, an aerial one and an underwater one.
Integrated Learning and Feature Selection for Deep Neural Networks in Multispectral Images
Anthony Ortiz, Alonso Granados, O. Fuentes, Christopher Kiekintveld, D. Rosario, Zachary Bell
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), June 2018. DOI: 10.1109/CVPRW.2018.00165
Abstract: The curse of dimensionality is a well-known phenomenon that arises when applying machine learning algorithms to high-dimensional data; it degrades performance as a function of increasing dimension. Due to the high data dimensionality of multispectral and hyperspectral imagery, classifiers trained on limited samples with many spectral bands tend to overfit, leading to weak generalization capability. In this work, we propose an end-to-end framework to effectively integrate input feature selection into the training procedure of a deep neural network for dimensionality reduction. We show that Integrated Learning and Feature Selection (ILFS) significantly improves the performance of neural networks for multispectral imagery applications. We also evaluate the proposed methodology as a potential defense against adversarial examples, which are malicious inputs carefully designed to fool a machine learning system. Our experimental results show that methods for generating adversarial examples designed for RGB space are also effective for multispectral imagery, and that ILFS significantly mitigates their effect.
Dual-Mode Vehicle Motion Pattern Learning for High Performance Road Traffic Anomaly Detection
Yan Xu, Xi Ouyang, Yu Cheng, Shining Yu, Lin Xiong, Choon-Ching Ng, Sugiri Pranata, Shengmei Shen, Junliang Xing
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), June 2018. DOI: 10.1109/CVPRW.2018.00027
Abstract: Anomaly detection in road traffic is an important task due to its great potential for urban traffic management and road safety. It is also a very challenging task, since abnormal events happen very rarely and exhibit diverse behaviors. In this work, we present a model to detect anomalies in road traffic by learning vehicle motion patterns in two distinctive yet correlated modes, i.e., the static mode and the dynamic mode of the vehicles. The static-mode analysis is learned from background modeling followed by a vehicle detection procedure to find abnormal vehicles that remain still on the road. The dynamic-mode analysis is learned from detected and tracked vehicle trajectories to find abnormal trajectories that deviate from the dominant motion patterns. The results from the dual-mode analyses are finally fused by a re-identification model to obtain the final anomalies. Experimental results on the Track 2 testing set of the NVIDIA AI CITY CHALLENGE show the effectiveness of the proposed dual-mode learning model and its robustness in different real scenes. Our result ranks first on the final leaderboard of Track 2.
{"title":"Eye in the Sky: Real-Time Drone Surveillance System (DSS) for Violent Individuals Identification Using ScatterNet Hybrid Deep Learning Network","authors":"Amarjot Singh, D. Patil, SN Omkar","doi":"10.1109/CVPRW.2018.00214","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00214","url":null,"abstract":"Drone systems have been deployed by various law enforcement agencies to monitor hostiles, spy on foreign drug cartels, conduct border control operations, etc. This paper introduces a real-time drone surveillance system to identify violent individuals in public areas. The system first uses the Feature Pyramid Network to detect humans from aerial images. The image region with the human is used by the proposed ScatterNet Hybrid Deep Learning (SHDL) network for human pose estimation. The orientations between the limbs of the estimated pose are next used to identify the violent individuals. The proposed deep network can learn meaningful representations quickly using ScatterNet and structural priors with relatively fewer labeled examples. The system detects the violent individuals in real-time by processing the drone images in the cloud. This research also introduces the aerial violent individual dataset used for training the deep network which hopefully may encourage researchers interested in using deep learning for aerial surveillance. The pose estimation and violent individuals identification performance is compared with the state-of-the-art techniques.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116233520","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Video Analytics in Smart Transportation for the AIC’18 Challenge","authors":"Ming-Ching Chang, Yi Wei, Nenghui Song, Siwei Lyu","doi":"10.1109/CVPRW.2018.00016","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00016","url":null,"abstract":"With the fast advancements of AICity and omnipresent street cameras, smart transportation can benefit greatly from actionable insights derived from video analytics. We participate the NVIDIA AICity Challenge 2018 in all three tracks of challenges. In Track 1 challenge, we demonstrate automatic traffic flow analysis using the detection and tracking of vehicles with robust speed estimation. In Track 2 challenge, we develop a reliable anomaly detection pipeline that can recognize abnormal incidences including stalled vehicles and crashes with precise locations and time segments. In Track 3 challenge, we present an early result of vehicle re-identification using deep triplet-loss features that matches vehicles across 4 cameras in 15+ hours of videos. All developed methods are evaluated and compared against 30 contesting methods from 70 registered teams on the real-world challenge videos.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"74 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127147294","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IFQ-Net: Integrated Fixed-Point Quantization Networks for Embedded Vision","authors":"Hongxing Gao, Wei Tao, Dongchao Wen, Tse-Wei Chen, Kinya Osa, Masami Kato","doi":"10.1109/CVPRW.2018.00103","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00103","url":null,"abstract":"Deploying deep models on embedded devices has been a challenging problem since the great success of deep learning based networks. Fixed-point networks, which represent their data with low bits fixed-point and thus give remarkable savings on memory usage, are generally preferred. Even though current fixed-point networks employ relative low bits (e.g. 8-bits), the memory saving is far from enough for the embedded devices. On the other hand, quantization deep networks, for example XNOR-Net and HWGQ-Net, quantize the data into 1 or 2 bits resulting in more significant memory savings but still contain lots of floating-point data. In this paper, we propose a fixed-point network for embedded vision tasks through converting the floating-point data in a quantization network into fixed-point. Furthermore, to overcome the data loss caused by the conversion, we propose to compose floating-point data operations across multiple layers (e.g. convolution, batch normalization and quantization layers) and convert them into fixed-point. We name the fixed-point network obtained through such integrated conversion as Integrated Fixed-point Quantization Networks (IFQ-Net). We demonstrate that our IFQ-Net gives 2.16× and 18× more savings on model size and runtime feature map memory respectively with similar accuracy on ImageNet. Furthermore, based on YOLOv2, we design IFQ-Tinier-YOLO face detector which is a fixed-point network with 256× reduction in model size (246k Bytes) than Tiny-YOLO. We illustrate the promising performance of our face detector in terms of detection rate on Face Detection Data Set and Bencmark (FDDB) and qualitative results of detecting small faces of Wider Face dataset.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123599993","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}