Sketched Visual Narratives for Image and Video Search

J. Collomosse
{"title":"Sketched Visual Narratives for Image and Video Search","authors":"J. Collomosse","doi":"10.1145/3103010.3103024","DOIUrl":null,"url":null,"abstract":"The internet is transforming into a visual medium; over 80% of the internet is forecast to be visual content by 2018, and most of this content will be consumed on mobile devices featuring a touch-screen as their primary interface. Gestural interaction, such as sketch, presents an intuitive way to interact with these devices. Imagine a Google image search in which you specify your query by sketching the desired image with your finger, rather than (or in addition to) describing it with text words. Sketch offers an orthogonal perspective on visual search - enabling concise specification of appearance (via sketch) in addition to semantics (via text). In this talk, John Collomosse will present a summary of his group's work on the use of free-hand sketches for the visual search and manipulation of images and video. He will begin by describing a scalable system for sketch based search of multi-million image databases, based upon their Gradient Field HOG (GF-HOG) descriptor. He will then describe how deep learning can be used to enhance performance of the retrieval. Imagine a product catalogue in which you sketched, say an engineering part, rather than using a text or serial numbers to find it? John will then describe how scalable search of video can be similarly achieved, through the depiction of sketched visual narratives that depict not only objects but also their motion (dynamics) as a constraint to find relevant video clips. The work presented in this talk has been supported by the EPSRC and AHRC between 2012-2016.","PeriodicalId":200469,"journal":{"name":"Proceedings of the 2017 ACM Symposium on Document Engineering","volume":"32 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2017 ACM Symposium on Document Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3103010.3103024","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

The internet is transforming into a visual medium; over 80% of internet traffic is forecast to be visual content by 2018, and most of this content will be consumed on mobile devices featuring a touch-screen as their primary interface. Gestural interaction, such as sketch, presents an intuitive way to interact with these devices. Imagine a Google image search in which you specify your query by sketching the desired image with your finger, rather than (or in addition to) describing it with text keywords. Sketch offers an orthogonal perspective on visual search, enabling concise specification of appearance (via sketch) in addition to semantics (via text). In this talk, John Collomosse will present a summary of his group's work on the use of free-hand sketches for the visual search and manipulation of images and video. He will begin by describing a scalable system for sketch-based search of multi-million-image databases, built upon the group's Gradient Field HOG (GF-HOG) descriptor. He will then describe how deep learning can be used to enhance retrieval performance. Imagine a product catalogue in which you sketch, say, an engineering part, rather than using text or a serial number to find it. John will then describe how scalable search of video can be achieved in a similar way, through sketched visual narratives that depict not only objects but also their motion (dynamics) as a constraint for finding relevant video clips. The work presented in this talk was supported by the EPSRC and AHRC between 2012 and 2016.
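
To make the GF-HOG idea concrete, below is a minimal, illustrative sketch of a GF-HOG-style retrieval pipeline in Python, assuming numpy, OpenCV, and scikit-learn are available. The helper names (`diffuse`, `gfhog_descriptors`, `bovw_histogram`) and the parameter choices are assumptions made here for illustration, not the authors' reference implementation: the published descriptor densifies the sparse edge gradient field via Poisson-style interpolation before histogramming, which the simple Jacobi relaxation below only approximates.

```python
import cv2
import numpy as np
from sklearn.cluster import MiniBatchKMeans

N_BINS, CELL = 9, 8  # orientation bins and cell size (illustrative choices)

def diffuse(field, mask, iters=200):
    """Densify a sparse orientation field by Jacobi relaxation (a crude
    stand-in for the Poisson interpolation used to build the dense
    gradient field in GF-HOG)."""
    f = field.copy()
    for _ in range(iters):
        avg = 0.25 * (np.roll(f, 1, 0) + np.roll(f, -1, 0) +
                      np.roll(f, 1, 1) + np.roll(f, -1, 1))
        f = np.where(mask, field, avg)  # keep known edge values fixed
    return f

def gfhog_descriptors(edge_map):
    """Histogram-of-orientations descriptors over the densified field."""
    img = edge_map.astype(np.float32)
    gx = cv2.Sobel(img, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(img, cv2.CV_32F, 0, 1)
    theta = np.arctan2(gy, gx)                 # orientations at edge pixels
    mask = edge_map > 0
    dense = diffuse(np.where(mask, theta, 0.0), mask)
    h, w = dense.shape
    descs = []
    for y in range(0, h - CELL, CELL):         # one histogram per cell
        for x in range(0, w - CELL, CELL):
            cell = dense[y:y + CELL, x:x + CELL]
            hist, _ = np.histogram(cell, bins=N_BINS, range=(-np.pi, np.pi))
            descs.append(hist.astype(np.float32))
    return np.array(descs)

def bovw_histogram(descs, codebook):
    """Quantise descriptors against a learned codebook; comparing the
    normalised word histograms is what lets the search scale to
    multi-million image databases."""
    words = codebook.predict(descs)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(np.float32)
    return hist / (np.linalg.norm(hist) + 1e-8)

# Usage sketch (assuming `edge_maps` is a list of binary sketch/edge images):
# all_descs = np.vstack([gfhog_descriptors(e) for e in edge_maps])
# codebook = MiniBatchKMeans(n_clusters=3000).fit(all_descs)
# index = [bovw_histogram(gfhog_descriptors(e), codebook) for e in edge_maps]
```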
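The video search described in the talk uses sketched motion as an additional constraint. As a toy illustration only, the snippet below ranks clips by comparing a sketched trajectory against pre-computed object tracks (lists of (x, y) centroids per clip); the arc-length resampling and normalisation scheme is an assumption made here for clarity, not the talk's actual space-time matching method.

```python
import numpy as np

def resample(path, n=32):
    """Resample a polyline to n evenly spaced points by arc length."""
    path = np.asarray(path, dtype=np.float32)
    seg = np.linalg.norm(np.diff(path, axis=0), axis=1)
    t = np.concatenate([[0.0], np.cumsum(seg)])
    ts = np.linspace(0.0, t[-1], n)
    return np.stack([np.interp(ts, t, path[:, i]) for i in (0, 1)], axis=1)

def normalise(path):
    """Remove translation and scale so only the shape of the motion matters."""
    p = path - path.mean(axis=0)
    return p / (np.linalg.norm(p) + 1e-8)

def motion_cost(sketch_path, track):
    """Mean squared distance between the normalised, resampled paths."""
    a, b = normalise(resample(sketch_path)), normalise(resample(track))
    return float(np.mean(np.sum((a - b) ** 2, axis=1)))

# Rank clips by how well any of their object tracks matches the sketched path:
# scores = {clip: min(motion_cost(query_path, t) for t in tracks)
#           for clip, tracks in clip_tracks.items()}
```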