Image-Based Storytelling Using Deep Learning

Proceedings of the 5th International Conference on Control and Computer Vision. Pub Date: 2022-08-19. DOI: 10.1145/3561613.3561641

Yulin Zhu, Wei Yan


Abstract

A story describing a journey can be generated automatically from a group of digital photographs. Most existing methods, such as image captioning, focus on describing the specific content of a single image and therefore do not capture the correlation between images or their spatiotemporal relationships. To this end, in this paper we propose a novel storytelling architecture based on computer vision. It makes use of visual object detection in digital images. By combining the changes in the spatiotemporal domain and filling in a predetermined template, we automatically generate a text-based travel diary. Compared with conventional image captioning, our aim is to effectively connect the digital images with their background information. The contributions of this paper are: (1) innovative use of preset templates to generate travel diaries from photographs, (2) associating the content and context of the images as an event, (3) augmenting the images to expand the dataset, and (4) shortening the training time of deep learning models.
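The template-filling idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Photo` record, the three template strings, and the helper names below are all hypothetical, and the object labels are assumed to come from some upstream object detector that is not shown. The sketch only shows how a change of place or time between consecutive photographs could select a different sentence template.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Photo:
    """Hypothetical record: detector labels plus photo metadata."""
    taken_at: datetime
    place: str
    objects: List[str]  # labels assumed to come from an object detector


# Hypothetical templates; the abstract does not give the actual ones.
FIRST = "At {time} we arrived at {place} and saw {objects}."
SAME_PLACE = "At {time}, still at {place}, we saw {objects}."
NEW_PLACE = "After {gap}, we moved on to {place}, where we saw {objects}."


def join_labels(labels: List[str]) -> str:
    """Turn detector labels into a readable phrase, dropping duplicates."""
    uniq = list(dict.fromkeys(labels))  # de-duplicate, keep first-seen order
    if not uniq:
        return "nothing in particular"
    if len(uniq) == 1:
        return uniq[0]
    return ", ".join(uniq[:-1]) + " and " + uniq[-1]


def format_gap(delta: timedelta) -> str:
    """Describe the time elapsed between two photographs."""
    minutes = int(delta.total_seconds() // 60)
    if minutes < 60:
        return f"{minutes} minutes"
    hours = minutes // 60
    return f"{hours} hour" + ("" if hours == 1 else "s")


def build_diary(photos: List[Photo]) -> str:
    """Order photos by time and fill one template per photo."""
    ordered = sorted(photos, key=lambda p: p.taken_at)
    sentences = []
    prev = None
    for photo in ordered:
        fields = {
            "time": photo.taken_at.strftime("%H:%M"),
            "place": photo.place,
            "objects": join_labels(photo.objects),
        }
        if prev is None:
            template = FIRST
        elif photo.place == prev.place:
            template = SAME_PLACE
        else:
            # A change of place selects the transition template.
            template = NEW_PLACE
            fields["gap"] = format_gap(photo.taken_at - prev.taken_at)
        sentences.append(template.format(**fields))
        prev = photo
    return " ".join(sentences)
```

Calling `build_diary` on a list of `Photo` records returns a single diary string. A fixed set of templates keeps the output predictable, at the cost of less varied wording than a learned caption generator would produce.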

