{"title":"草图视觉叙事图像和视频搜索","authors":"J. Collomosse","doi":"10.1145/3103010.3103024","DOIUrl":null,"url":null,"abstract":"The internet is transforming into a visual medium; over 80% of the internet is forecast to be visual content by 2018, and most of this content will be consumed on mobile devices featuring a touch-screen as their primary interface. Gestural interaction, such as sketch, presents an intuitive way to interact with these devices. Imagine a Google image search in which you specify your query by sketching the desired image with your finger, rather than (or in addition to) describing it with text words. Sketch offers an orthogonal perspective on visual search - enabling concise specification of appearance (via sketch) in addition to semantics (via text). In this talk, John Collomosse will present a summary of his group's work on the use of free-hand sketches for the visual search and manipulation of images and video. He will begin by describing a scalable system for sketch based search of multi-million image databases, based upon their Gradient Field HOG (GF-HOG) descriptor. He will then describe how deep learning can be used to enhance performance of the retrieval. Imagine a product catalogue in which you sketched, say an engineering part, rather than using a text or serial numbers to find it? John will then describe how scalable search of video can be similarly achieved, through the depiction of sketched visual narratives that depict not only objects but also their motion (dynamics) as a constraint to find relevant video clips. The work presented in this talk has been supported by the EPSRC and AHRC between 2012-2016.","PeriodicalId":200469,"journal":{"name":"Proceedings of the 2017 ACM Symposium on Document Engineering","volume":"32 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Sketched Visual Narratives for Image and Video Search\",\"authors\":\"J. Collomosse\",\"doi\":\"10.1145/3103010.3103024\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The internet is transforming into a visual medium; over 80% of the internet is forecast to be visual content by 2018, and most of this content will be consumed on mobile devices featuring a touch-screen as their primary interface. Gestural interaction, such as sketch, presents an intuitive way to interact with these devices. Imagine a Google image search in which you specify your query by sketching the desired image with your finger, rather than (or in addition to) describing it with text words. Sketch offers an orthogonal perspective on visual search - enabling concise specification of appearance (via sketch) in addition to semantics (via text). In this talk, John Collomosse will present a summary of his group's work on the use of free-hand sketches for the visual search and manipulation of images and video. He will begin by describing a scalable system for sketch based search of multi-million image databases, based upon their Gradient Field HOG (GF-HOG) descriptor. He will then describe how deep learning can be used to enhance performance of the retrieval. Imagine a product catalogue in which you sketched, say an engineering part, rather than using a text or serial numbers to find it? 
John will then describe how scalable search of video can be similarly achieved, through the depiction of sketched visual narratives that depict not only objects but also their motion (dynamics) as a constraint to find relevant video clips. The work presented in this talk has been supported by the EPSRC and AHRC between 2012-2016.\",\"PeriodicalId\":200469,\"journal\":{\"name\":\"Proceedings of the 2017 ACM Symposium on Document Engineering\",\"volume\":\"32 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-08-31\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2017 ACM Symposium on Document Engineering\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3103010.3103024\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2017 ACM Symposium on Document Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3103010.3103024","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Sketched Visual Narratives for Image and Video Search
The internet is transforming into a visual medium: over 80% of internet traffic is forecast to be visual content by 2018, and most of this content will be consumed on mobile devices whose primary interface is a touch-screen. Gestural interaction, such as sketching, offers an intuitive way to interact with these devices. Imagine a Google image search in which you specify your query by sketching the desired image with your finger, rather than (or in addition to) describing it with keywords. Sketch offers an orthogonal perspective on visual search, enabling concise specification of appearance (via sketch) in addition to semantics (via text). In this talk, John Collomosse will summarise his group's work on the use of free-hand sketches for the visual search and manipulation of images and video. He will begin by describing a scalable system for sketch-based search of multi-million-image databases, built upon the group's Gradient Field HOG (GF-HOG) descriptor. He will then describe how deep learning can be used to improve retrieval performance. Imagine a product catalogue in which you find, say, an engineering part by sketching it, rather than by searching for text or serial numbers. John will then describe how scalable video search can be achieved in a similar way, through sketched visual narratives that depict not only objects but also their motion (dynamics) as a constraint for finding relevant video clips. The work presented in this talk was supported by the EPSRC and AHRC between 2012 and 2016.
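The abstract names the GF-HOG descriptor without detailing it. As background, the sketch below illustrates the general recipe behind gradient-field HOG pipelines: extract edges, diffuse their sparse orientations into a dense field, compute HOG-like histograms over that field (so a line sketch and a photograph yield comparable descriptors), and match via a bag-of-visual-words index. It is a minimal, hedged illustration, not the authors' implementation; the function names, parameter values, and the simple Jacobi diffusion used here are assumptions.

```python
"""Illustrative GF-HOG-style pipeline (an assumption-laden sketch,
not Collomosse et al.'s implementation)."""
import numpy as np
import cv2


def dense_orientation_field(gray, iters=300):
    """Diffuse sparse edge orientations into a dense field by Jacobi
    relaxation of the Laplace equation, holding edge pixels fixed.

    gray: uint8 grayscale image.
    Note: diffusing raw angles ignores angular wrap-around; this is an
    illustrative simplification.
    """
    edges = cv2.Canny(gray, 100, 200)
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    theta = np.arctan2(gy, gx)          # orientation at every pixel...
    mask = edges > 0                    # ...but trusted only on edges
    field = np.where(mask, theta, 0.0).astype(np.float32)
    for _ in range(iters):
        # Average of 4-neighbours (periodic borders for simplicity).
        avg = (np.roll(field, 1, 0) + np.roll(field, -1, 0) +
               np.roll(field, 1, 1) + np.roll(field, -1, 1)) / 4.0
        field = np.where(mask, theta, avg).astype(np.float32)
    return field


def gfhog(gray, cell=8, bins=9):
    """HOG-like orientation histograms over the dense field, one
    normalised histogram per cell."""
    field = dense_orientation_field(gray)
    h, w = field.shape
    descs = []
    for y in range(0, h - cell + 1, cell):
        for x in range(0, w - cell + 1, cell):
            patch = field[y:y + cell, x:x + cell]
            hist, _ = np.histogram(patch, bins=bins, range=(-np.pi, np.pi))
            descs.append(hist / (np.linalg.norm(hist) + 1e-6))
    return np.asarray(descs, dtype=np.float32)


def bovw_histogram(descs, codebook):
    """Quantise cell descriptors against a k-means codebook and return
    a normalised codeword-frequency histogram for indexing/matching."""
    d = np.linalg.norm(descs[:, None, :] - codebook[None, :, :], axis=2)
    words = d.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(np.float32)
    return hist / (hist.sum() + 1e-6)
```

Both the query sketch and every database image pass through the same pipeline; retrieval then reduces to nearest-neighbour search over the bag-of-words histograms, which is what makes the approach scale to multi-million-image collections.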
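The abstract also describes motion as a retrieval constraint for video but gives no mechanism. One plausible illustration (an assumption, not the talk's method) is to compare a sketched query stroke against tracked object trajectories using dynamic time warping (DTW) and rank clips by the resulting distance:

```python
"""Hypothetical motion-constrained ranking: DTW between a sketched
stroke and per-clip object trajectories. Helper names are invented
for illustration."""
import numpy as np


def dtw_distance(a, b):
    """Classic O(len(a)*len(b)) DTW between two 2-D point sequences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]


def normalise(path):
    """Translate and scale a path into a unit box so only its shape
    (not its position or size) affects the match."""
    p = np.asarray(path, dtype=np.float32)
    p = p - p.min(axis=0)
    return p / (p.max() + 1e-6)


def rank_clips(query_stroke, clip_trajectories):
    """Return clip indices sorted by DTW distance between the sketched
    stroke and each clip's tracked object trajectory."""
    q = normalise(query_stroke)
    dists = [dtw_distance(q, normalise(t)) for t in clip_trajectories]
    return np.argsort(dists)
```

In a full system this motion score would be combined with an appearance match (e.g. the GF-HOG-style descriptor above) so that a narrative sketch constrains both what an object looks like and how it moves.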