Automatic Indonesian Image Caption Generation using CNN-LSTM Model and FEEH-ID Dataset

2019 IEEE International Conference on Computational Intelligence and Virtual Environments for Measurement Systems and Applications (CIVEMSA). Pub Date: 2019-06-01. DOI: 10.1109/CIVEMSA45640.2019.9071632

E. Mulyanto, Esther Irawati Setiawan, E. M. Yuniarno, M. Purnomo


Citations: 5

Abstract

Image captioning remains a challenging problem in computer vision research. This paper extends research on automatic image caption generation to the Indonesian language: descriptions in Indonesian sentences are generated for unlabeled images. The dataset used is FEEH-ID, the first Indonesian image captioning dataset. This research is important because no corpus for image captioning in Indonesian was previously available. The paper compares experimental results on the FEEH-ID dataset with results on English, Chinese, and Japanese datasets using CNN and LSTM models. On the test set, the proposed model achieves promising scores of 50.0 for BLEU-1 and 23.9 for BLEU-3, which are above the average of the BLEU results on the other-language datasets. The model that merges CNN and LSTM performs reasonably well on the FEEH-ID dataset. The experimental results are expected to improve with a larger dataset.
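The BLEU-1 and BLEU-3 figures above can be read with the following minimal sketch of cumulative BLEU-n, using clipped n-gram precision, uniform weights, and a brevity penalty. It is an illustration only, not the authors' evaluation code: the abstract does not say which BLEU implementation, tokenization, or smoothing was used, and the function names and the Indonesian example sentences are hypothetical. The function returns a value in [0, 1], whereas the abstract reports scores on a 0 to 100 scale.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count every n-gram (as a tuple) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of one candidate against its references."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    # For each n-gram, the largest count seen in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def bleu(candidate, references, max_n):
    """Sentence-level cumulative BLEU-max_n, uniform weights, no smoothing."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # without smoothing, one zero precision zeroes the score
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c = len(candidate)
    # Reference length closest to the candidate length (shorter one on ties).
    r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_mean)


# Hypothetical caption pair, not taken from FEEH-ID.
candidate = "seorang pria sedang bermain gitar".split()
reference = "seorang pria bermain gitar di taman".split()
print("BLEU-1:", bleu(candidate, [reference], 1))
print("BLEU-3:", bleu(candidate, [reference], 3))
```

Because each higher order adds a stricter n-gram match requirement, BLEU-3 is typically well below BLEU-1 for the same captions, which is consistent with the gap between the two reported scores.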

