{"title":"Novel Algorithms to Monitor Continuous Cardiac Activity with a Video Camera","authors":"Gregory F. Lewis, Maria I. Davila, S. Porges","doi":"10.1109/CVPRW.2018.00175","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00175","url":null,"abstract":"Recent advances in computer vision methods have made physiological signal extraction from imaging sensors feasible. There is a demand to translate current post-hoc methods into real-time physiological monitoring techniques. Algorithms that function on a single frame of data meet the requirements for continuous, real-time measurement. If these algorithms are computationally efficient they may serve as the basis for an embedded system design that can be integrated within the vision hardware, turning the camera into a physiological monitor. Compelling results are presented derived from an appropriate algorithm for extracting cardiac pulse from sequential, single frames of a color video camera. Results are discussed with respect to physiologically relevant features of variability in beat-to-beat heart rate.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133935613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparison of Deep Transfer Learning Strategies for Digital Pathology","authors":"Romain Mormont, P. Geurts, R. Marée","doi":"10.1109/CVPRW.2018.00303","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00303","url":null,"abstract":"In this paper, we study deep transfer learning as a way of overcoming object recognition challenges encountered in the field of digital pathology. Through several experiments, we investigate various uses of pre-trained neural network architectures and different combination schemes with random forests for feature selection. Our experiments on eight classification datasets show that densely connected and residual networks consistently yield best performances across strategies. It also appears that network fine-tuning and using inner layers features are the best performing strategies, with the former yielding slightly superior results.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122334841","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards More Accurate Radio Telescope Images","authors":"Nezihe Merve Gurel, P. Hurley, Matthieu Simeoni","doi":"10.1109/CVPRW.2018.00254","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00254","url":null,"abstract":"Radio interferometry usually compensates for high levels of noise in sensor/antenna electronics by throwing data and energy at the problem: observe longer, then store and process it all. We propose instead a method to remove the noise explicitly before imaging. To this end, we developed an algorithm that first decomposes the instances of antenna correlation matrix, the so-called visibility matrix, into additive components using Singular Spectrum Analysis and then cluster these components using graph Laplacian matrix. We show through simulation the potential for radio astronomy, in particular, illustrating the benefit for LOFAR, the low frequency array in Netherlands. Least-squares images are estimated with far higher accuracy with low computation cost without the need for long observation time.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114219591","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"KCNN: Extremely-Efficient Hardware Keypoint Detection with a Compact Convolutional Neural Network","authors":"Paolo Di Febbo, Carlo Dal Mutto, Kinh H. Tieu, S. Mattoccia","doi":"10.1109/CVPRW.2018.00111","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00111","url":null,"abstract":"Keypoint detection algorithms are typically based on handcrafted combinations of derivative operations implemented with standard image filtering approaches. The early layers of Convolutional Neural Networks (CNNs) for image classification, whose implementation is nowadays often available within optimized hardware units, are characterized by a similar architecture. Therefore, the exploration of CNNs for keypoint detection is a promising avenue to obtain a low-latency implementation, also enabling to effectively move the computational cost of the detection to dedicated Neural Network processing units. This paper proposes a methodology for effective keypoint detection by means of an efficient CNN characterized by a compact three-layer architecture. A novel training procedure is proposed for learning values of the network parameters which allow for an approximation of the response of handcrafted detectors, showing that the proposed architecture is able to obtain results comparable with the state of the art. The capability of emulating different detectors allows to deploy a variety of algorithms to dedicated hardware by simply retraining the network. A sensor-based FPGA implementation of the introduced CNN architecture is presented, allowing latency smaller than 1 [ms].","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117033109","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Vehicle Tracking and Speed Estimation from Traffic Videos","authors":"Shuai Hua, M. Kapoor, D. Anastasiu","doi":"10.1109/CVPRW.2018.00028","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00028","url":null,"abstract":"The rapid recent advancements in the computation ability of everyday computers have made it possible to widely apply deep learning methods to the analysis of traffic surveillance videos. Traffic flow prediction, anomaly detection, vehicle re-identification, and vehicle tracking are basic components in traffic analysis. Among these applications, traffic flow prediction, or vehicle speed estimation, is one of the most important research topics of recent years. Good solutions to this problem could prevent traffic collisions and help improve road planning by better estimating transit demand. In the 2018 NVIDIA AI City Challenge, we combine modern deep learning models with classic computer vision approaches to propose an efficient way to predict vehicle speed. In this paper, we introduce some state-of-the-art approaches in vehicle speed estimation, vehicle detection, and object tracking, as well as our solution for Track 1 of the Challenge.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115052809","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Discrete Cosine Transform Residual Feature Based Filtering Forgery and Splicing Detection in JPEG Images","authors":"A. Roy, Diangarti Bhalang Tariang, R. Chakraborty, R. Naskar","doi":"10.1109/CVPRW.2018.00205","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00205","url":null,"abstract":"Digital images are one of the primary modern media for information interchange. However, digital images are vulnerable to interception and manipulation due to the wide availability of image editing software tools. Filtering forgery detection and splicing detection are two of the most important problems in digital image forensics. In particular, the primary challenge for the filtering forgery detection problem is that typically the techniques effective for nonlinear filtering (e.g. median filtering) detection are quite ineffective for linear filtering detection, and vice versa. In this paper, we have used Discrete Cosine Transform Residual features to train a Support Vector Machine classifier, and have demonstrated its effectiveness for both linear and non-linear filtering (specifically, Median Filtering) detection and filter classification, as well as re-compression based splicing detection in JPEG images. We have also theoretically justified the choice of the abovementioned feature set for both type of forgeries. Our technique outperforms the state-of-the-art forensic techniques for filtering detection, filter classification and re-compression based splicing detection, when applied on a set of standard benchmark images.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"98 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116664498","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Word Spotting in Scene Images Based on Character Recognition","authors":"Dena Bazazian, Dimosthenis Karatzas, Andrew D. Bagdanov","doi":"10.1109/CVPRW.2018.00244","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00244","url":null,"abstract":"In this paper we address the problem of unconstrained Word Spotting in scene images. We train a Fully Convolutional Network to produce heatmaps of all the character classes. Then, we employ the Text Proposals approach and, via a rectangle classifier, detect the most likely rectangle for each query word based on the character attribute maps. We evaluate the proposed method on ICDAR2015 and show that it is capable of identifying and recognizing query words in natural scene images.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121562565","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mask-SLAM: Robust Feature-Based Monocular SLAM by Masking Using Semantic Segmentation","authors":"Masaya Kaneko, Kazuya Iwami, Toru Ogawa, T. Yamasaki, K. Aizawa","doi":"10.1109/CVPRW.2018.00063","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00063","url":null,"abstract":"In this paper, we propose a novel method that combines monocular visual simultaneous localization and mapping (vSLAM) and deep-learning-based semantic segmentation. For stable operation, vSLAM requires feature points on static objects. In conventional vSLAM, random sample consensus (RANSAC) [5] is used to select those feature points. However, if a major portion of the view is occupied by moving objects, many feature points become inappropriate and RANSAC does not perform well. Based on our empirical studies, feature points in the sky and on cars often cause errors in vSLAM. We propose a new framework to exclude feature points using a mask produced by semantic segmentation. Excluding feature points in masked areas enables vSLAM to stably estimate camera motion. We apply ORB-SLAM [15] in our framework, which is a state-of-the-art implementation of monocular vSLAM. For our experiments, we created vSLAM evaluation datasets by using the CARLA simulator [3] under various conditions. Compared to state-of-the-art methods, our method can achieve significantly higher accuracy.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127949555","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automated Virtual Navigation and Monocular Localization of Indoor Spaces from Videos","authors":"Qiong Wu, A. Li","doi":"10.1109/CVPRW.2018.00202","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00202","url":null,"abstract":"3D virtual navigation and localization in large indoor spaces (i.e., shopping malls and offices) are usually two separate studied problems. In this paper, we propose an automated framework to publish both 3D virtual navigation and monocular localization services that only require videos (or burst of images) of the environment as input. The framework can unify two problems as one because the collected data are highly utilized for both problems, 3D visual model reconstruction and training data for monocular localization. The power of our approach is that it does not need any human label data and instead automates the process of two separate services based on raw video (or burst of images) data captured by a common mobile device. We build a prototype system that publishes both virtual navigation and localization services for a shopping mall using raw video (or burst of images) data as inputs. Two web applications are developed utilizing two services. One allows navigation in 3D following the original video traces, and user can also stop at any time to explore in 3D space. One allows a user to acquire his/her location by uploading an image of the venue. Because of low barrier of data acquirement, this makes our system widely applicable to a variety of domains and significantly reduces service cost for potential customers.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128981268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep-BCN: Deep Networks Meet Biased Competition to Create a Brain-Inspired Model of Attention Control","authors":"Hossein Adeli, G. Zelinsky","doi":"10.1109/CVPRW.2018.00259","DOIUrl":"https://doi.org/10.1109/CVPRW.2018.00259","url":null,"abstract":"The mechanism of attention control is best described by biased-competition theory (BCT), which suggests that a top-down goal state biases a competition among object representations for the selective routing of a visual input for classification. Our work advances this theory by making it computationally explicit as a deep neural network (DNN) model, thereby enabling predictions of goal-directed attention control using real-world stimuli. This model, which we call Deep-BCN, is built on top of an 8-layer DNN pre-trained for object classification, but has layers mapped to early visual (V1, V2/V3, V4), ventral (PIT, AIT), and frontal (PFC) brain areas that have their functional connectivity informed by BCT. Deep-BCN also has a superior colliculus and a frontal-eye field, and can therefore make eye movements. We compared Deep-BCN's eye movements to those made from 15 people performing a categorical search for one of 25 target object categories, and found that it predicted both the number of fixations during search and the saccade-distance travelled before search termination. With Deep-BCN a DNN implementation of BCT now exists, which can be used to predict the neural and behavioral responses of an attention control mechanism as it mediates a goal-directed behavior-in our study the eye movements made in search of a target goal.","PeriodicalId":150600,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130536898","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}