Proceedings of the 1st Workshop on User-centric Narrative Summarization of Long Videos: Latest Publications

Learning, Understanding and Interaction in Videos
Manmohan Chandraker
DOI: 10.1145/3552463.3555837 | Published: 2022-10-10
Abstract: Advances in mobile phone camera technologies and internet connectivity have made video one of the most intuitive ways to communicate and share experiences. Millions of cameras deployed in our homes, offices, and public spaces record video for purposes ranging from safety and assistance to entertainment and many others. This talk describes some of our recent progress in learning from, understanding, and interacting with such digital media. It will introduce methods in unsupervised and self-supervised representation learning that allow video solutions to be deployed efficiently with minimal data curation. It will discuss how physical priors and human knowledge are leveraged to extract insights from videos, ranging from three-dimensional scene properties to language-based descriptions. It will also illustrate how these insights allow us to augment or interact with digital media with unprecedented photorealism and ease.
Citations: 0

Panel Discussion: Emerging Topics on Video Summarization
Mohan S. Kankanhalli, Jianquan Liu, Yongkang Wong, Karen Stephen
DOI: 10.1145/3552463.3558051 | Published: 2022-10-10
Abstract: With video capture devices becoming widely popular, the amount of video data generated per day has increased rapidly over the past few years. Browsing through hours of video to retrieve useful information is a tedious and tiring task. Video summarization technology has played a crucial role in addressing this issue and is a well-researched topic in the multimedia community. This panel aims to bring together researchers with relevant backgrounds to discuss emerging topics in video summarization, including recent developments, future directions, challenges, solutions, potential applications, and other open problems.
Citations: 0

Video Summarization in the Deep Learning Era: Current Landscape and Future Directions
I. Patras
DOI: 10.1145/3552463.3554166 | Published: 2022-10-10
Abstract: In this talk we will provide an overview of the field of video summarization, with a focus on the developments, trends, and open challenges in the era of deep learning and big data. After a brief introduction to the problem, we will provide a broad taxonomy of the work in the area and of recent trends from multiple perspectives, including types of methodologies and architectures, supervision signals, and modalities. We will then present current datasets and evaluation protocols, highlighting their limitations and the challenges they face. Finally, we will close by giving our perspective on the challenges in the field and on interesting future directions.
Citations: 0

Narrative Dataset: Towards Goal-Driven Narrative Generation
Karen Stephen, Rishabh Sheoran, Satoshi Yamazaki
DOI: 10.1145/3552463.3557021 | Published: 2022-10-10
Abstract: In this paper, we propose a new dataset, the Narrative dataset, a work in progress towards generating video and text narratives of complex daily events from long videos captured by multiple cameras. Because most existing datasets are collected from publicly available videos such as YouTube videos, there is no dataset targeting the task of narrative summarization of complex videos that contain multiple narratives. Hence, we create story plots and conduct video shoots with hired actors to produce complex video sets in which three to four narratives occur in each video. In the story plot, a narrative is composed of multiple events corresponding to video clips of key human activities. On top of the shot video sets and the story plot, the Narrative dataset contains dense annotations of actors, objects, and their relationships for each frame as the facts of the narratives. The dataset therefore richly captures the holistic and hierarchical structure of facts, events, and narratives. Moreover, the Narrative Graph, a collection of scene graphs of narrative events with their causal relationships, is introduced to bridge the gap between the collection of facts and the generation of summary sentences for a narrative. Beyond related subtasks such as scene graph generation, the Narrative dataset potentially provides challenging subtasks for bridging human event clips to narratives.
Citations: 1

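The Narrative Graph described in this abstract is, at its core, a set of per-event scene graphs joined by causal edges. As a purely illustrative aid (not the authors' released code; the class and field names below are hypothetical assumptions), a minimal in-memory representation in Python could look like this:

```python
# Hypothetical sketch of a "narrative graph": per-event scene-graph triples
# (actor, predicate, object) plus causal edges between events.
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class SceneGraphTriple:
    subject: str      # e.g. "actor_1"
    predicate: str    # e.g. "picks_up"
    obj: str          # e.g. "suitcase"

@dataclass
class NarrativeEvent:
    event_id: str
    clip_range: Tuple[int, int]                     # (start_frame, end_frame) in the source video
    triples: List[SceneGraphTriple] = field(default_factory=list)

@dataclass
class NarrativeGraph:
    events: Dict[str, NarrativeEvent] = field(default_factory=dict)
    causal_edges: List[Tuple[str, str]] = field(default_factory=list)  # (cause_id, effect_id)

    def add_event(self, event: NarrativeEvent) -> None:
        self.events[event.event_id] = event

    def link(self, cause_id: str, effect_id: str) -> None:
        # Record that one event causally leads to another.
        self.causal_edges.append((cause_id, effect_id))

# Toy usage: two events joined by a causal edge.
g = NarrativeGraph()
g.add_event(NarrativeEvent("e1", (0, 120), [SceneGraphTriple("actor_1", "picks_up", "suitcase")]))
g.add_event(NarrativeEvent("e2", (121, 300), [SceneGraphTriple("actor_1", "exits", "room")]))
g.link("e1", "e2")
```

The actual dataset annotations are, per the abstract, frame-level and much denser; this sketch only conveys the facts-to-events-to-narrative hierarchy.
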
Contrastive Representation Learning for Expression Recognition from Masked Face Images
Fanxing Luo, Long Zhao, Yu Wang, Jien Kato
DOI: 10.1145/3552463.3557020 | Published: 2022-10-10
Abstract: With the worldwide spread of COVID-19, people are trying different ways to prevent the spread of the virus. One of the most common and effective is wearing a face mask. Most people wear a face mask when they go out, which makes facial expression recognition harder. Thus, improving the performance of facial expression recognition models on masked faces is becoming an important issue. However, there is no public dataset of facial expressions with masks. We therefore built two datasets: a real-world masked facial expression database (VIP-DB) and a man-made masked facial expression database (M-RAF-DB). To reduce the influence of masks, we utilize contrastive representation learning and propose a two-branch network. We study the influence of contrastive learning on our two datasets. Results show that contrastive representation learning improves the performance of expression recognition from masked face images.
Citations: 0

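For readers unfamiliar with the contrastive objective this abstract refers to, below is a minimal, generic sketch of an NT-Xent (normalized temperature-scaled cross-entropy) loss in PyTorch. This is the standard SimCLR-style formulation, not the paper's two-branch network or its exact training setup; the function name, temperature, and batch layout are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.5) -> torch.Tensor:
    """NT-Xent loss over a batch of paired embeddings z1, z2 of shape (N, D),
    where z1[i] and z2[i] are two views (e.g. masked/unmasked crops) of the same face."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)      # (2N, D), unit-norm rows
    sim = z @ z.t() / temperature                            # (2N, 2N) scaled cosine similarities
    # Exclude self-similarity from the softmax candidates.
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float("-inf"))
    # The positive for sample i is its counterpart at i + N (and vice versa).
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)

# Toy usage with random embeddings.
loss = nt_xent_loss(torch.randn(8, 128), torch.randn(8, 128))
print(loss.item())
```

Pulling the two views of the same face together while pushing apart other identities/expressions is what lets the encoder learn mask-robust features in this style of approach.
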
Soccer Game Summarization using Audio Commentary, Metadata, and Captions
Sushant Gautam, Cise Midoglu, Saeed Shafiee Sabet, Dinesh Baniya Kshatri, P. Halvorsen
DOI: 10.1145/3552463.3557019 | Published: 2022-10-10
Abstract: Soccer is one of the most popular sports globally, and the amount of soccer-related content worldwide, including video footage, audio commentary, team/player statistics, scores, and rankings, is enormous and growing rapidly. Consequently, the generation of multimodal summaries is of tremendous interest to broadcasters and fans alike, as a large share of the audience prefers to follow only the main highlights of a game. However, annotating important events and producing summaries often requires expensive equipment and a great deal of tedious, cumbersome manual labour. In this context, recent developments in Artificial Intelligence (AI) have shown great potential. The goal of this work is to create an automated soccer game summarization pipeline using AI. In particular, our focus is on the generation of complete game summaries in continuous text format with length constraints, based on raw game multimedia as well as readily available game metadata and captions where applicable, using Natural Language Processing (NLP) tools along with heuristics. We curate and extend a number of soccer datasets, implement an end-to-end pipeline for the automatic generation of text summaries, present preliminary results from a comparative analysis of various summarization methods within this pipeline using different input modalities, and discuss open challenges in the field of automated game summarization.
Citations: 2

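As a rough illustration of the length-constrained text summarization stage such a pipeline could include (the abstract mentions NLP tools and heuristics but does not name specific models), here is a hedged sketch using an off-the-shelf Hugging Face summarizer. The model choice and the toy commentary text are assumptions, not the authors' setup.

```python
# Minimal sketch: condense commentary/caption text into a short game summary
# with explicit length constraints. Requires the `transformers` package and
# downloads the model on first use; facebook/bart-large-cnn is an assumed
# stand-in, not necessarily what the paper's pipeline uses.
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

commentary = (
    "Kick-off under the lights. In the 23rd minute a low cross finds the striker, "
    "who slots it past the keeper for 1-0. Just before half-time the visitors "
    "equalise from a corner. A late penalty in the 88th minute decides the game 2-1."
)

# Token-length bounds stand in for the word/character budget of a game summary.
result = summarizer(commentary, max_length=60, min_length=20, do_sample=False)
print(result[0]["summary_text"])
```

In the actual system described above, the input would come from transcribed audio commentary, captions, and structured metadata rather than a hand-written snippet, with heuristics deciding what feeds the summarizer.
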
Proceedings of the 1st Workshop on User-centric Narrative Summarization of Long Videos
DOI: 10.1145/3552463
Citations: 0