Title: End to End Background Subtraction by Deep Convolutional Neural Network
Author: Hongwei Sun
Published in: 2018 International Conference on Security, Pattern Analysis, and Cybernetics (SPAC), December 2018
DOI: 10.1109/SPAC46244.2018.8965532
Citations: 0
Abstract
In this work, we take a different approach from previous methods to background subtraction: the core of our system is a deep convolutional neural network (CNN). The advantage of this approach is clear, as it avoids manual parameter tuning and hand-crafted feature engineering; when trained on a single video, the network learns its parameters directly from the data. We also use the SuBSENSE background model in training. It is a background-modeling method that monitors moving objects in video, detects camouflaged foreground objects more reliably, and minimizes the influence of nuisance factors such as illumination changes. Second, we use a long short-term memory (LSTM) network to predict video saliency, which helps avoid overfitting to a limited number of training videos; trained on our data, it successfully learns to estimate spatial and temporal saliency. With our method, the final saliency prediction achieves good results.
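The paper's CNN architecture is not reproduced in this abstract. As a point of reference for what "background subtraction" means operationally, the following is a minimal sketch of a classical running-average background model (the kind of pixel-level baseline that methods such as SuBSENSE and the CNN described here improve upon). The function name, parameters, and thresholds are illustrative assumptions, not the paper's method.

```python
import numpy as np

def background_subtract(frames, alpha=0.05, thresh=30.0):
    """Classical running-average background subtraction (illustrative only).

    frames: iterable of 2-D grayscale arrays (H, W).
    alpha:  background adaptation rate (assumed value).
    thresh: per-pixel foreground threshold (assumed value).
    Returns a list of boolean foreground masks, one per frame.
    """
    bg = None
    masks = []
    for frame in frames:
        f = frame.astype(np.float64)
        if bg is None:
            # Initialise the background model with the first frame.
            bg = f.copy()
            masks.append(np.zeros(f.shape, dtype=bool))
            continue
        # Pixels that deviate strongly from the background are foreground.
        mask = np.abs(f - bg) > thresh
        # Slowly adapt the background toward the current frame.
        bg = (1 - alpha) * bg + alpha * f
        masks.append(mask)
    return masks
```

Such hand-tuned models require choosing `alpha` and `thresh` per scene, which is exactly the parameter-adjustment burden the abstract says the end-to-end CNN removes by learning from data.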