A novel hybrid model for video salient object detection
Jinping Cai, Sheng Lin
2020 International Conference on Computer Engineering and Intelligent Control (ICCEIC), November 2020
DOI: 10.1109/ICCEIC51584.2020.00059
Abstract
At present, few video salient object detection models can simulate attention behavior in dynamic scenes. Moreover, owing to the scarcity of video salient object detection datasets and interference from camera motion, existing models fail to capture the overall shape and precise boundaries of targets. Hence, we propose a new hybrid model, called NHM, which connects an attention feedback network with a pyramid dilated convolution module to obtain rich spatial saliency information, and then uses a saliency-shift-aware ConvLSTM module to learn temporal saliency information. Instead of feeding the attention feedback network's output directly into the pyramid dilated convolution module, we extract feature maps at different scales from five decoder blocks and pass them to the pyramid dilated convolution module, making better use of multi-scale features. Furthermore, a new hybrid loss function incorporating a boundary-enhanced loss is proposed to obtain fine boundaries. Experimental results show that the proposed model performs on par with or better than state-of-the-art models.
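The abstract does not give the exact form of the hybrid loss, but the idea of combining a region-level loss with a boundary-enhanced term can be sketched as below. This is a minimal NumPy illustration under stated assumptions: we assume the region term is per-pixel binary cross-entropy, approximate ground-truth boundaries with the gradient magnitude of the mask, and re-apply cross-entropy on boundary pixels with a weighting factor `lam`. None of these choices (nor the names `hybrid_loss`, `boundary_map`, `lam`) come from the paper; they are hypothetical stand-ins for the boundary-enhanced loss it describes.

```python
import numpy as np

def bce_loss(pred, gt, eps=1e-7):
    """Mean binary cross-entropy between a predicted saliency map and a binary mask."""
    pred = np.clip(pred, eps, 1 - eps)
    return -np.mean(gt * np.log(pred) + (1 - gt) * np.log(1 - pred))

def boundary_map(mask):
    """Approximate object boundaries as pixels with non-zero mask gradient (assumed form)."""
    gy, gx = np.gradient(mask.astype(float))
    return (np.hypot(gx, gy) > 0).astype(float)

def hybrid_loss(pred, gt, lam=1.0):
    """Region BCE plus a boundary-enhanced BCE term restricted to boundary pixels.

    `lam` weights the boundary term; both the restriction and the weight are
    assumptions, not the paper's published formulation.
    """
    region = bce_loss(pred, gt)
    b = boundary_map(gt)
    boundary = bce_loss(pred[b > 0], gt[b > 0]) if b.sum() > 0 else 0.0
    return region + lam * boundary
```

In this sketch, errors near object contours are penalized twice (once in the region term, once in the boundary term), which is one common way a boundary-aware loss encourages sharper edges than plain cross-entropy alone.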