Image-Based Storytelling Using Deep Learning

Yulin Zhu, Wei Yan

Proceedings of the 5th International Conference on Control and Computer Vision
Published: 2022-08-19
DOI: 10.1145/3561613.3561641
https://doi.org/10.1145/3561613.3561641

Abstract

A story describing a journey can be generated automatically from a group of digital photographs. Most existing methods, such as image captioning, describe the specific content of a single image and therefore lack correlation between images and ignore their spatiotemporal relationships. To this end, we propose a novel storytelling architecture based on computer vision. It makes use of visual object detection in digital images. By combining changes in the spatiotemporal domain and filling in a predetermined template, we automatically generate a text-based travel diary. Compared with conventional image captioning, our aim is to effectively connect digital images with their background information. The contributions of this paper are: (1) innovative use of preset templates to generate travel diaries from photographs, associating content and context of the images as an event, (3) augmenting the images to expand the dataset, (4) shortening training time of deep learning models.
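The template-filling step described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `Photo` fields, the three sentence templates, and the rule that picks a template by comparing each photo's place with the previous one. Object labels are supplied by hand here; in the described architecture they would come from a visual object detector.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class Photo:
    """One photograph. All fields are hypothetical, not from the paper."""
    taken_at: datetime   # capture time, e.g. read from EXIF metadata
    place: str           # place name, e.g. resolved from GPS tags
    objects: List[str]   # labels an object detector might return


# Hypothetical sentence templates; the paper's actual templates are not given.
FIRST = "On {date} at {time}, we arrived at {place} and saw {objects}."
SAME_PLACE = "At {time}, still at {place}, we saw {objects}."
NEW_PLACE = "At {time}, we moved on to {place}, where we saw {objects}."


def join_labels(labels):
    """Turn ['a', 'b', 'c'] into 'a, b and c'."""
    if not labels:
        return "nothing in particular"
    if len(labels) == 1:
        return labels[0]
    return ", ".join(labels[:-1]) + " and " + labels[-1]


def write_diary(photos):
    """Fill one template per photo, chosen by whether the place changed."""
    ordered = sorted(photos, key=lambda p: p.taken_at)  # temporal order
    sentences = []
    previous = None
    for photo in ordered:
        if previous is None:
            template = FIRST
        elif photo.place == previous.place:
            template = SAME_PLACE
        else:
            template = NEW_PLACE
        # str.format ignores keyword arguments a template does not use.
        sentences.append(template.format(
            date=photo.taken_at.strftime("%Y-%m-%d"),
            time=photo.taken_at.strftime("%H:%M"),
            place=photo.place,
            objects=join_labels(photo.objects),
        ))
        previous = photo
    return " ".join(sentences)
```

Calling `write_diary` on a list of `Photo` records returns a single diary string with one sentence per photograph, ordered by capture time. A fixed template keeps the output predictable, at the cost of less varied wording than a learned caption generator would give.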