Pre-trained CNNs as Feature-Extraction Modules for Image Captioning

Q4 Computer Science
Muhammad Abdelhadie Al-Malla, Assef Jafar, Nada Ghneim
Journal: Electronic Letters on Computer Vision and Image Analysis (Journal Article)
Publication date: 2022-05-10
DOI: 10.5565/rev/elcvia.1436 (https://doi.org/10.5565/rev/elcvia.1436)

Abstract

In this work, we present a thorough experimental study of feature extraction using Convolutional Neural Networks (CNNs) for the task of image captioning in the context of deep learning. We perform a set of 72 experiments on 12 image classification CNNs pre-trained on the ImageNet [29] dataset. The features are extracted from the last layer after removing the fully connected layer and are fed into the captioning model. We use a unified captioning model with a fixed vocabulary size across all the experiments to study the effect of changing the CNN feature extractor on image captioning quality. The scores are calculated using the standard metrics in image captioning. We find a strong relationship between the model structure and the image captioning dataset, and show that VGG models yield the lowest quality for image captioning feature extraction among the tested CNNs. Finally, we recommend a set of pre-trained CNNs for each of the image captioning evaluation metrics one may want to optimise, and show the connection between our results and previous works. To our knowledge, this work is the most comprehensive comparison of feature extractors for image captioning.
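The extraction step described in the abstract (drop the fully connected layer, then read the output of the last remaining layer) can be illustrated with a deliberately tiny, framework-free sketch. This is not the authors' implementation: the three toy layers, their names, and the helpers `strip_head`, `run`, and `extract_features` are all hypothetical stand-ins for a real pre-trained CNN, chosen only to show the idea of removing the classification head.

```python
from typing import Callable, List, Tuple

Vector = List[float]
Layer = Tuple[str, Callable[[Vector], Vector]]


def relu(x: Vector) -> Vector:
    """Element-wise rectifier."""
    return [max(0.0, v) for v in x]


def pool_pairs(x: Vector) -> Vector:
    """Toy 'pooling': average each adjacent pair of values."""
    return [(x[i] + x[i + 1]) / 2.0 for i in range(0, len(x) - 1, 2)]


def fully_connected(x: Vector) -> Vector:
    """Toy classification head: collapses the features into two class scores."""
    total = sum(x)
    return [total, -total]


# Hypothetical stand-in for a pre-trained classifier (assumption, not a real CNN).
TOY_CLASSIFIER: List[Layer] = [
    ("relu", relu),
    ("pool", pool_pairs),
    ("fc", fully_connected),
]


def strip_head(layers: List[Layer], head_prefix: str = "fc") -> List[Layer]:
    """Drop trailing layers whose name marks them as the fully connected head."""
    kept = list(layers)
    while kept and kept[-1][0].startswith(head_prefix):
        kept.pop()
    return kept


def run(layers: List[Layer], x: Vector) -> Vector:
    """Apply the layers in order."""
    for _, fn in layers:
        x = fn(x)
    return x


def extract_features(x: Vector) -> Vector:
    """Features are the output of the last layer left after the head is removed."""
    return run(strip_head(TOY_CLASSIFIER), x)


if __name__ == "__main__":
    image = [1.0, -2.0, 3.0, 5.0]
    print("class scores:", run(TOY_CLASSIFIER, image))
    print("features:    ", extract_features(image))
```

In a real setting the same pattern would be applied to a framework model: the classifier is loaded with its pre-trained weights, its final fully connected layer is removed, and the remaining network's output is what gets passed to the captioning model.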
Source journal
Electronic Letters on Computer Vision and Image Analysis (Computer Science: Computer Vision and Pattern Recognition)
CiteScore: 2.50
Self-citation rate: 0.00%
Articles published: 19
Review time: 12 weeks