Çağlar Aytekin, Xingyang Ni, Francesco Cricri, Lixin Fan, Emre B. Aksu
{"title":"基于网格化超像素的高效内存深度显著目标分割网络","authors":"Çağlar Aytekin, Xingyang Ni, Francesco Cricri, Lixin Fan, Emre B. Aksu","doi":"10.1109/MMSP.2018.8547096","DOIUrl":null,"url":null,"abstract":"Computer vision algorithms with pixel-wise labeling tasks, such as semantic segmentation and salient object detection, have gone through a significant accuracy increase with the incorporation of deep learning. Deep segmentation methods slightly modify and fine-tune pre-trained networks that have hundreds of millions of parameters. In this work, we question the need of having such memory demanding networks for a reasonable performance in salient object segmentation. To this end, we propose a way to learn a memory-efficient network from scratch by training it only on salient object detection datasets. Our method encodes images to gridized superpixels that preserve both the object boundaries and the connectivity rules of regular pixels. This representation allows us to use convolutional neural networks that operate on regular grids. By using these encoded images, we train a memory-efficient network using only 0.048% of the number of parameters that a majority of other deep salient object detection networks have. Our method shows comparable accuracy with the state-of-the-art deep salient object detection methods and provides a much more memory-efficient alternative to them. Due to its easy deployment and small size, such a network is preferable for applications in memory limited IoT devices.","PeriodicalId":137522,"journal":{"name":"2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP)","volume":"78 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-12-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Memory-Efficient Deep Salient Object Segmentation Networks on Gridized Superpixels\",\"authors\":\"Çağlar Aytekin, Xingyang Ni, Francesco Cricri, Lixin Fan, Emre B. Aksu\",\"doi\":\"10.1109/MMSP.2018.8547096\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Computer vision algorithms with pixel-wise labeling tasks, such as semantic segmentation and salient object detection, have gone through a significant accuracy increase with the incorporation of deep learning. Deep segmentation methods slightly modify and fine-tune pre-trained networks that have hundreds of millions of parameters. In this work, we question the need of having such memory demanding networks for a reasonable performance in salient object segmentation. To this end, we propose a way to learn a memory-efficient network from scratch by training it only on salient object detection datasets. Our method encodes images to gridized superpixels that preserve both the object boundaries and the connectivity rules of regular pixels. This representation allows us to use convolutional neural networks that operate on regular grids. By using these encoded images, we train a memory-efficient network using only 0.048% of the number of parameters that a majority of other deep salient object detection networks have. Our method shows comparable accuracy with the state-of-the-art deep salient object detection methods and provides a much more memory-efficient alternative to them. Due to its easy deployment and small size, such a network is preferable for applications in memory limited IoT devices.\",\"PeriodicalId\":137522,\"journal\":{\"name\":\"2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP)\",\"volume\":\"78 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-12-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/MMSP.2018.8547096\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE 20th International Workshop on Multimedia Signal Processing (MMSP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MMSP.2018.8547096","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Memory-Efficient Deep Salient Object Segmentation Networks on Gridized Superpixels
Computer vision algorithms with pixel-wise labeling tasks, such as semantic segmentation and salient object detection, have gone through a significant accuracy increase with the incorporation of deep learning. Deep segmentation methods slightly modify and fine-tune pre-trained networks that have hundreds of millions of parameters. In this work, we question the need of having such memory demanding networks for a reasonable performance in salient object segmentation. To this end, we propose a way to learn a memory-efficient network from scratch by training it only on salient object detection datasets. Our method encodes images to gridized superpixels that preserve both the object boundaries and the connectivity rules of regular pixels. This representation allows us to use convolutional neural networks that operate on regular grids. By using these encoded images, we train a memory-efficient network using only 0.048% of the number of parameters that a majority of other deep salient object detection networks have. Our method shows comparable accuracy with the state-of-the-art deep salient object detection methods and provides a much more memory-efficient alternative to them. Due to its easy deployment and small size, such a network is preferable for applications in memory limited IoT devices.