{"title":"Improvement of Embedding Channel-Wise Activation in Soft-Attention Neural Image Captioning","authors":"Yanke Li","doi":"10.1145/3271553.3271592","DOIUrl":null,"url":null,"abstract":"The paper dives into the topic of image captioning with the soft attention algorithm. We first review relevant works on the captioned topic in terms of background introduction and then explains the original model in details. On top of the plain soft attention model, we propose two approaches for further improvements: SE attention model which adds an extra channel-wise activation layer, and bi-directional attention model that explores two-way attention order feasibility. We implement both methods under limited experiment conditions and in addition swap the original encoder with state-of-art structure. Quantitative results and example demonstrations show that our proposed methods have achieved better performance than baselines. In the end, some suggestions of future work on top of proposed are summarized for a purpose of completeness.","PeriodicalId":414782,"journal":{"name":"Proceedings of the 2nd International Conference on Vision, Image and Signal Processing","volume":"227 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2nd International Conference on Vision, Image and Signal Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3271553.3271592","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1
Abstract
This paper addresses image captioning with the soft-attention algorithm. We first review relevant work on the topic as background and then explain the original model in detail. On top of the plain soft-attention model, we propose two approaches for further improvement: an SE attention model, which adds an extra channel-wise activation layer, and a bi-directional attention model, which explores the feasibility of a two-way attention order. We implement both methods under limited experimental conditions and additionally swap the original encoder for a state-of-the-art structure. Quantitative results and example demonstrations show that our proposed methods achieve better performance than the baselines. Finally, we summarize suggestions for future work building on the proposed methods.
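To make the channel-wise activation idea concrete, the sketch below shows a squeeze-and-excitation style gating block applied to CNN encoder features before soft attention. It is a minimal illustration, not the paper's exact architecture: the class name `SEBlock`, the reduction ratio, and the 512×14×14 feature-map shape are assumptions for the example.

```python
# Minimal sketch (assumed, not the authors' exact design) of a squeeze-and-excitation
# channel-wise activation block that re-weights encoder feature channels.
import torch
import torch.nn as nn


class SEBlock(nn.Module):
    """Gates each encoder feature channel before soft attention is applied."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # squeeze: global spatial average per channel
        self.fc = nn.Sequential(             # excitation: two-layer bottleneck MLP
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                     # per-channel gates in (0, 1)
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (batch, channels, height, width) CNN encoder output
        b, c, _, _ = features.shape
        gates = self.fc(self.pool(features).view(b, c)).view(b, c, 1, 1)
        return features * gates               # channel-wise re-weighting


if __name__ == "__main__":
    # Example: gate a hypothetical (2, 512, 14, 14) encoder feature map
    # before it is fed to the soft-attention decoder.
    feats = torch.randn(2, 512, 14, 14)
    se = SEBlock(512)
    print(se(feats).shape)  # torch.Size([2, 512, 14, 14])
```

The gated features keep the same spatial layout expected by soft attention; only the relative weight of each channel changes, which is the intuition behind inserting the extra channel-wise activation layer.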