2020 IEEE International Symposium on Multimedia (ISM)最新文献_第2页

Computational Method for Optimal Attack Play Consisting of Run Plays and Hand-pass Plays for Seven-a-side Rugby 七人制橄榄球中由跑动和手传组成的最佳进攻战术的计算方法

2020 IEEE International Symposium on Multimedia (ISM) Pub Date : 2020-12-01 DOI: 10.1109/ISM.2020.00031

Kotaro Yashiro, Yohei Nakada

引用次数: 3

An Effective Rotational Invariant Key-point Detector for Image Matching 一种有效的图像匹配旋转不变性关键点检测器

2020 IEEE International Symposium on Multimedia (ISM) Pub Date : 2020-12-01 DOI: 10.1109/ISM.2020.00043

Thanh Hong-Phuoc, L. Guan

引用次数: 0

Real-time Spatio-Temporal Action Localization in 360 Videos 360视频中的实时时空动作定位

2020 IEEE International Symposium on Multimedia (ISM) Pub Date : 2020-12-01 DOI: 10.1109/ISM.2020.00018

Bo Chen, A. Ali-Eldin, P. Shenoy, K. Nahrstedt

引用次数: 1

Multimodal Classification of Emotions in Latin Music 拉丁音乐中情感的多模态分类

2020 IEEE International Symposium on Multimedia (ISM) Pub Date : 2020-12-01 DOI: 10.1109/ISM.2020.00038

L. G. Catharin, Rafael P. Ribeiro, C. Silla, Yandre M. G. Costa, V. D. Feltrim

{"title":"Multimodal Classification of Emotions in Latin Music","authors":"L. G. Catharin, Rafael P. Ribeiro, C. Silla, Yandre M. G. Costa, V. D. Feltrim","doi":"10.1109/ISM.2020.00038","DOIUrl":"https://doi.org/10.1109/ISM.2020.00038","url":null,"abstract":"In this study we classified the songs of the Latin Music Mood Database (LMMD) according to their emotion using two approaches: single-step classification, which consists of classifying the songs by emotion, valence, arousal and quadrant; and multistep classification, which consists of using the predictions of the best valence and arousal classifiers to classify quadrants and the best valence, arousal and quadrant predictions as features to classify emotions. Our hypothesis is that breaking the emotion classification in smaller problems would reduce complexity and improve results. Our best single-step emotion and valence classifiers used multimodal sets of features extracted from lyrics and audio. Our best arousal classifier used features extracted from lyrics and SMOTE to mitigate the dataset imbalance. The proposed multistep emotion classifier, which uses the predictions of a multistep quadrant classifier, improved the single-step classifier performance, reaching 0.605 of mean f-measure. These results show that using valence, arousal, and consequently, quadrant information can improve the prediction of specific emotions.","PeriodicalId":120972,"journal":{"name":"2020 IEEE International Symposium on Multimedia (ISM)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130585477","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

SEAWARE: Semantic Aware View Prediction System for 360-degree Video Streaming SEAWARE: 360度视频流语义感知视图预测系统

2020 IEEE International Symposium on Multimedia (ISM) Pub Date : 2020-12-01 DOI: 10.1109/ISM.2020.00016

Jounsup Park, Mingyuan Wu, Kuan-Ying Lee, Bo Chen, K. Nahrstedt, M. Zink, R. Sitaraman

{"title":"SEAWARE: Semantic Aware View Prediction System for 360-degree Video Streaming","authors":"Jounsup Park, Mingyuan Wu, Kuan-Ying Lee, Bo Chen, K. Nahrstedt, M. Zink, R. Sitaraman","doi":"10.1109/ISM.2020.00016","DOIUrl":"https://doi.org/10.1109/ISM.2020.00016","url":null,"abstract":"Future view prediction for a 360-degree video streaming system is important to save the network bandwidth and improve the Quality of Experience (QoE). Historical view data of a single viewer and multiple viewers have been used for future view prediction. Video semantic information is also useful to predict the viewer's future behavior. However, extracting video semantic information requires powerful computing hardware and large memory space to perform deep learning-based video analysis. It is not a desirable condition for most of client devices, such as small mobile devices or Head Mounted Display (HMD). Therefore, we develop an approach where video semantic analysis is executed on the media server, and the analysis results are shared with clients via the Semantic Flow Descriptor (SFD) and View-Object State Machine (VOSM). SFD and VOSM become new descriptive additions of the Media Presentation Description (MPD) and Spatial Relation Description (SRD) to support 360-degree video streaming. Using the semantic-based approach, we design the Semantic-Aware View Prediction System (SEAWARE) to improve the overall view prediction performance. The evaluation results of 360-degree videos and real HMD view traces show that the SEAWARE system improves the view prediction performance and streams high-quality video with limited network bandwidth.","PeriodicalId":120972,"journal":{"name":"2020 IEEE International Symposium on Multimedia (ISM)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131166589","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 12

Deriving Strategies for the Evaluation of Spaced Repetition Learning in Mobile Learning Applications from Learning Analytics 基于学习分析的移动学习应用中间隔重复学习的评估策略

2020 IEEE International Symposium on Multimedia (ISM) Pub Date : 2020-12-01 DOI: 10.1109/ISM.2020.00049

Florian Schimanke, R. Mertens

引用次数: 1

Two types of flows admission control method for maximizing all user satisfaction considering seek-bar operation 考虑搜索栏操作的两种流量允许控制方法，以最大化所有用户满意度

2020 IEEE International Symposium on Multimedia (ISM) Pub Date : 2020-12-01 DOI: 10.1109/ISM.2020.00048

Keisuke Ode, S. Miyata

引用次数: 0

Structured Pruning of LSTMs via Eigenanalysis and Geometric Median for Mobile Multimedia and Deep Learning Applications 基于特征分析和几何中值的lstm结构化剪枝在移动多媒体和深度学习中的应用

2020 IEEE International Symposium on Multimedia (ISM) Pub Date : 2020-12-01 DOI: 10.1109/ISM.2020.00028

Nikolaos Gkalelis, V. Mezaris

引用次数: 3

SumBot: Summarize Videos Like a Human SumBot:像人类一样总结视频

2020 IEEE International Symposium on Multimedia (ISM) Pub Date : 2020-12-01 DOI: 10.1109/ISM.2020.00044

Hongxiang Gu, Stefano Petrangeli, Viswanathan Swaminathan

{"title":"SumBot: Summarize Videos Like a Human","authors":"Hongxiang Gu, Stefano Petrangeli, Viswanathan Swaminathan","doi":"10.1109/ISM.2020.00044","DOIUrl":"https://doi.org/10.1109/ISM.2020.00044","url":null,"abstract":"Video currently accounts for 70% of all internet traffic and this number is expected to continue to grow. Each minute, more than 500 hours worth of videos are uploaded on YouTube. Generating engaging short videos out of the raw captured content is often a time-consuming and cumbersome activity for content creators. Existing ML- based video summarization and highlight generation approaches often neglect the fact that many summarization tasks require specific domain knowledge of the video content, and that human editors often follow a semistructured template when creating the summary (e.g. to create the highlights for a sport event). We therefore address in this paper the challenge of creating domain-specific summaries, by actively leveraging this editorial template. Particularly, we present an Inverse Reinforcement Learning (IRL)-based framework that can automatically learn the hidden structure or template followed by a human expert when generating a video summary for a specific domain. Particularly, we propose to formulate the video summarization task as a Markov Decision Process, where each state is a combination of the features of the video shots added to the summary, and the possible actions are to include/remove a shot from the summary or leave it as is. Using a set of domain-specific human-generated video highlights as examples, we employ a Maximum Entropy IRL algorithm to learn the implicit reward function governing the summary generation process. The learned reward function is then used to train an RL-agent that can produce video summaries for a specific domain, closely resembling what a human expert would create. Learning from expert demonstrations allows our approach to be applicable to any domain or editorial styles. To demonstrate the superior performance of our approach, we employ it to the task of soccer games highlight generation and show that it outperforms other state-of-the-art methods, both quantitatively and qualitatively.","PeriodicalId":120972,"journal":{"name":"2020 IEEE International Symposium on Multimedia (ISM)","volume":"330 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122831691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 1

2020 IEEE International Symposium on Multimedia ISM 2020 2020 IEEE多媒体ISM国际研讨会

2020 IEEE International Symposium on Multimedia (ISM) Pub Date : 2020-12-01 DOI: 10.1109/ism.2020.00001

引用次数: 0