{"title":"A Research on Image Captioning by Different Encoder Networks","authors":"Jieh-Ren Chang, Tsung-Ta Ling, Ting-Chun Li","doi":"10.1109/IS3C50286.2020.00025","DOIUrl":null,"url":null,"abstract":"Many current research issues of image captioning focus on modifying the CNN (Convolutional Neural Network) or RNN (Recurrent Neural Network), while supplementing the attention mechanism to enhance the long-term memory ability of the RNN. However, the relationship with input data and CNN model could be another important point. This paper defines the image complexity to enhance model's accuracy. After analyzing the data set, some criteria of the image complexity are defined according to the image grayscale entropy and the two-dimensional entropy for image Captioning. In this paper, a new model is setup to compare with the other model. Although the result is better than the other model by a revised bilingual evaluation understudy (R-BLEU) evaluation index which is a new calculation formula to evaluate image captioning performance.","PeriodicalId":143430,"journal":{"name":"2020 International Symposium on Computer, Consumer and Control (IS3C)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 International Symposium on Computer, Consumer and Control (IS3C)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IS3C50286.2020.00025","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Much current research on image captioning focuses on modifying the CNN (Convolutional Neural Network) or RNN (Recurrent Neural Network), often supplementing them with an attention mechanism to strengthen the RNN's long-term memory. However, the relationship between the input data and the CNN model can be another important factor. This paper defines image complexity to improve the model's accuracy. After analyzing the data set, criteria for image complexity are defined for image captioning based on the image grayscale entropy and the two-dimensional entropy. A new model is then set up and compared against a baseline model, and it achieves better results under a revised bilingual evaluation understudy (R-BLEU) index, a new calculation formula for evaluating image captioning performance.
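The abstract does not give the exact formulas or thresholds used to classify image complexity, but the two quantities it names are standard. Below is a minimal Python sketch of the usual definitions: grayscale entropy computed from the 8-bit intensity histogram, and two-dimensional entropy computed from the joint distribution of each pixel's gray level and the mean gray level of its local neighbourhood. The neighbourhood size of 3 and the function names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from scipy.ndimage import uniform_filter


def grayscale_entropy(gray):
    """Shannon entropy (bits) of the 8-bit grayscale histogram of `gray`."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()
    p = p[p > 0]                      # ignore empty bins
    return -np.sum(p * np.log2(p))


def two_dimensional_entropy(gray, size=3):
    """Entropy (bits) of the joint (pixel value, neighbourhood mean) distribution.

    `size` is the side length of the local averaging window; 3 is an
    assumed default, not a value taken from the paper.
    """
    mean = uniform_filter(gray.astype(np.float64), size=size)
    mean = np.clip(np.rint(mean), 0, 255).astype(np.int64)
    pairs = gray.astype(np.int64) * 256 + mean   # encode (value, mean) pairs
    hist = np.bincount(pairs.ravel(), minlength=256 * 256).astype(np.float64)
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))


if __name__ == "__main__":
    # Example on a synthetic 8-bit grayscale image.
    img = (np.random.rand(128, 128) * 256).astype(np.uint8)
    print("grayscale entropy:", grayscale_entropy(img))
    print("2-D entropy:", two_dimensional_entropy(img))
```

In this framing, higher entropy values would indicate a more "complex" image; how the paper maps these values onto discrete complexity criteria for routing images to different encoder networks is not specified in the abstract.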