{"title":"软注意神经图像字幕中嵌入通道激活的改进","authors":"Yanke Li","doi":"10.1145/3271553.3271592","DOIUrl":null,"url":null,"abstract":"The paper dives into the topic of image captioning with the soft attention algorithm. We first review relevant works on the captioned topic in terms of background introduction and then explains the original model in details. On top of the plain soft attention model, we propose two approaches for further improvements: SE attention model which adds an extra channel-wise activation layer, and bi-directional attention model that explores two-way attention order feasibility. We implement both methods under limited experiment conditions and in addition swap the original encoder with state-of-art structure. Quantitative results and example demonstrations show that our proposed methods have achieved better performance than baselines. In the end, some suggestions of future work on top of proposed are summarized for a purpose of completeness.","PeriodicalId":414782,"journal":{"name":"Proceedings of the 2nd International Conference on Vision, Image and Signal Processing","volume":"227 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Improvement of Embedding Channel-Wise Activation in Soft-Attention Neural Image Captioning\",\"authors\":\"Yanke Li\",\"doi\":\"10.1145/3271553.3271592\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The paper dives into the topic of image captioning with the soft attention algorithm. We first review relevant works on the captioned topic in terms of background introduction and then explains the original model in details. On top of the plain soft attention model, we propose two approaches for further improvements: SE attention model which adds an extra channel-wise activation layer, and bi-directional attention model that explores two-way attention order feasibility. We implement both methods under limited experiment conditions and in addition swap the original encoder with state-of-art structure. Quantitative results and example demonstrations show that our proposed methods have achieved better performance than baselines. 
In the end, some suggestions of future work on top of proposed are summarized for a purpose of completeness.\",\"PeriodicalId\":414782,\"journal\":{\"name\":\"Proceedings of the 2nd International Conference on Vision, Image and Signal Processing\",\"volume\":\"227 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-08-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2nd International Conference on Vision, Image and Signal Processing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3271553.3271592\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2nd International Conference on Vision, Image and Signal Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3271553.3271592","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Improvement of Embedding Channel-Wise Activation in Soft-Attention Neural Image Captioning
This paper addresses image captioning with the soft-attention algorithm. We first review relevant prior work as background and then explain the original model in detail. On top of the plain soft-attention model, we propose two approaches for further improvement: an SE attention model, which adds an extra channel-wise activation layer, and a bi-directional attention model, which explores the feasibility of a two-way attention order. We implement both methods under limited experimental conditions and additionally swap the original encoder for a state-of-the-art architecture. Quantitative results and example demonstrations show that the proposed methods outperform the baselines. Finally, suggestions for future work building on the proposed methods are summarized for completeness.
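To illustrate the channel-wise activation the abstract refers to, below is a minimal PyTorch sketch of a Squeeze-and-Excitation style gate applied to CNN encoder features, as might sit in front of a soft-attention layer. The class name SEChannelGate, the reduction ratio, and the gate's exact placement in the captioning pipeline are assumptions; the abstract does not give implementation details.

```python
import torch
import torch.nn as nn


class SEChannelGate(nn.Module):
    """Squeeze-and-Excitation style channel-wise activation (Hu et al., 2018).

    Hypothetical sketch: the reduction ratio and where the gate sits
    relative to the soft-attention layer are not specified in the abstract.
    """

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),  # squeeze
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),  # excite
            nn.Sigmoid(),                                # per-channel gates in (0, 1)
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (batch, channels, height, width) CNN encoder activations
        b, c, _, _ = features.shape
        squeezed = features.mean(dim=(2, 3))        # global average pool -> (b, c)
        gates = self.fc(squeezed).view(b, c, 1, 1)  # channel-wise weights
        return features * gates                     # re-scale each channel


if __name__ == "__main__":
    gate = SEChannelGate(channels=512)
    feats = torch.randn(2, 512, 14, 14)  # e.g. encoder output for two images
    print(gate(feats).shape)             # torch.Size([2, 512, 14, 14])
```

The gated features keep the shape of the encoder output, so a gate like this can be dropped between the CNN encoder and the spatial soft-attention step without changing the rest of the pipeline.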