Latest Publications: 2015 IEEE International Symposium on Multimedia (ISM)

Personalized Indexing of Attention in Lectures -- Requirements and Concept
2015 IEEE International Symposium on Multimedia (ISM) Pub Date: 2015-12-01 DOI: 10.1109/ISM.2015.44
Sebastian Pospiech, N. Birnbaum, L. Knipping, R. Mertens
Abstract: Web lectures can be employed in a variety of didactic scenarios, ranging from an add-on for a live lecture to stand-alone learning content. In all of these scenarios, though less so in the stand-alone one, indexing and navigation are crucial for real-world usability. As a consequence, many approaches have been devised, such as slide-based indexing, transcript-based indexing, collaborative manual indexing, and individual or social indexing based on viewing behavior. The approach proposed in this paper takes individual indexing based on viewing behavior two steps further in that it (a) indexes the recording at production time in the lecture hall and (b) actively analyzes the students' attention focus instead of passively recording viewing time as done in conventional footprinting. Tracking student attention during the lecture requires recording and analyzing the students' behaviour in parallel to the lecture, as well as synchronizing both data streams. This paper discusses the architecture required for personalized attention-based indexing, possible problems, and strategies to tackle them.
Citations: 0
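As an illustration of the indexing step the abstract describes, here is a minimal sketch that turns a per-student attention signal, assumed already synchronized to the recording's clock, into navigable index intervals. The threshold, data layout, and function name are invented for illustration, not taken from the paper.

```python
def attention_index(samples, threshold=0.6):
    """samples: list of (timestamp_seconds, attention_score in [0, 1]),
    assumed synchronized to the lecture recording's clock.
    Returns (start, end) intervals of sustained attention."""
    intervals = []
    start = None
    prev_t = None
    for t, score in samples:
        if score >= threshold and start is None:
            start = t                          # attention rises: open an interval
        elif score < threshold and start is not None:
            intervals.append((start, prev_t))  # attention drops: close it
            start = None
        prev_t = t
    if start is not None:                      # close a trailing open interval
        intervals.append((start, prev_t))
    return intervals

# Example: a student attentive from 10 s to 20 s and again from 50 s on.
samples = [(0, 0.2), (10, 0.8), (20, 0.9), (30, 0.3), (50, 0.7), (60, 0.8)]
print(attention_index(samples))  # [(10, 20), (50, 60)]
```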
Employing Sensors and Services Fusion to Detect and Assess Driving Events
2015 IEEE International Symposium on Multimedia (ISM) Pub Date: 2015-12-01 DOI: 10.1109/ISM.2015.121
Seyed Vahid Hosseinioun, Hussein Al Osman, Abdulmotaleb El Saddik
Abstract: With the remarkable increase in the use of sensors in our daily lives, various methods have been devised to detect events in a driving environment using smartphones, as they provide two main advantages: they eliminate the need for dedicated hardware in vehicles, and they are widely accessible. Since rewarding safe driving is an important issue for insurance companies, some companies are implementing Usage-Based Insurance (UBI) as opposed to traditional history-based plans. The collection of driving events, such as acceleration and turning, is a prerequisite for the adoption of such plans. Mobile phone sensors are capable of detecting whether a car is accelerating or braking, while through service fusion we can detect other events such as speeding or instances of severe weather. We propose a new and robust hybrid classification algorithm that detects acceleration-based events with an F1-score of 0.9304 and turn events with an F1-score of 0.9038. We further propose a method for measuring a driving performance index using the detected events.
Citations: 13
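A minimal sketch of the detection-and-evaluation idea, not the paper's hybrid classifier: a simple threshold detector over a smartphone's longitudinal acceleration trace, scored with the F1 measure the authors report. The threshold and the toy trace are assumptions.

```python
def detect_events(accel, threshold=2.5):
    """accel: per-sample longitudinal acceleration in m/s^2.
    Returns a 0/1 label per sample (1 = acceleration/braking event)."""
    return [1 if abs(a) >= threshold else 0 for a in accel]

def f1_score(truth, pred):
    tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

accel = [0.3, 0.5, 3.1, 3.4, 0.2, -2.8, -0.4]   # synthetic trace
truth = [0, 0, 1, 1, 0, 1, 0]
print(f1_score(truth, detect_events(accel)))     # 1.0 on this toy trace
```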
Exploring the Complementarity of Audio-Visual Structural Regularities for the Classification of Videos into TV-Program Collections
2015 IEEE International Symposium on Multimedia (ISM) Pub Date: 2015-12-01 DOI: 10.1109/ISM.2015.133
G. Sargent, P. Hanna, H. Nicolas, F. Bimbot
Abstract: This article proposes to analyze the structural regularities of the audio and video streams of TV programs and to explore their potential for the classification of videos into program collections. Our approach is based on the spectral analysis of distance matrices representing the short- and long-term dependencies within the audio and visual modalities of a video. We propose to compare two videos by their respective spectral features. We assess the benefits brought by the two modalities to performance in the context of K-nearest-neighbor classification, and we test our approach in the context of an unsupervised clustering algorithm. These evaluations are performed on two datasets of French and Italian TV programs.
Citations: 3
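A hedged sketch of the core representation: build a self-distance matrix over per-frame features, take its eigenvalue spectrum as a structural signature, and compare two videos by signature distance. The feature choice, normalization, and truncation to k eigenvalues are assumptions, not the paper's exact construction.

```python
import numpy as np

def spectral_signature(features, k=10):
    """features: (n_frames, d) array of per-frame descriptors.
    Returns the k largest eigenvalue magnitudes of the self-distance matrix."""
    diff = features[:, None, :] - features[None, :, :]
    dist = np.linalg.norm(diff, axis=2)       # symmetric self-distance matrix
    eigvals = np.linalg.eigvalsh(dist)        # real spectrum of symmetric matrix
    sig = np.sort(np.abs(eigvals))[::-1][:k]
    return sig / (sig[0] + 1e-12)             # scale-invariant signature

# Videos of different lengths still yield comparable fixed-size signatures.
rng = np.random.default_rng(0)
video_a = rng.normal(size=(120, 8))
video_b = rng.normal(size=(150, 8))
d = np.linalg.norm(spectral_signature(video_a) - spectral_signature(video_b))
print(d)   # smaller distance = more similar temporal structure
```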
A Novel Two Pass Rate Control Scheme for Variable Bit Rate Video Streaming
2015 IEEE International Symposium on Multimedia (ISM) Pub Date: 2015-12-01 DOI: 10.1109/ISM.2015.32
M. VenkataPhaniKumar, K. C. R. C. Varma, S. Mahapatra
Abstract: In this paper, a novel two-pass rate control scheme is proposed to achieve consistent visual quality for variable bit rate (VBR) video streaming. The rate-distortion (RD) characteristics of each frame are used to establish a frame complexity model, which is later used along with statistics collected in the first pass to derive an optimal quantization parameter for encoding the frame in the second pass. The experimental results demonstrate that the proposed rate control scheme significantly outperforms the existing rate control mechanism in the Joint Model (JM) reference software in terms of Peak Signal to Noise Ratio (PSNR) and consistent perceptual visual quality while achieving the target bit rate. Further, the proposed scheme is validated through implementation on a miniature test-bed.
Citations: 2
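An illustrative sketch of the generic two-pass idea, not the paper's RD-based complexity model: the first pass measures per-frame complexity (here, bits consumed at a fixed QP), and the second pass nudges each frame's QP so that complex frames receive more bits. The logarithmic mapping and strength parameter are invented.

```python
import math

def second_pass_qp(first_pass_bits, base_qp=30, strength=3.0):
    """first_pass_bits: bits each frame consumed in the first pass at a
    fixed QP, used as a per-frame complexity measure.
    Returns per-frame QPs: more complex frames get a lower (finer) QP."""
    mean_bits = sum(first_pass_bits) / len(first_pass_bits)
    qps = []
    for bits in first_pass_bits:
        ratio = math.log2(bits / mean_bits)   # >0 for above-average complexity
        qp = round(base_qp - strength * ratio)
        qps.append(max(0, min(51, qp)))       # clamp to the H.264 QP range
    return qps

print(second_pass_qp([12000, 30000, 8000, 26000]))   # [32, 28, 34, 29]
```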
Improvement of Image and Video Matting with Multiple Reliability Maps
2015 IEEE International Symposium on Multimedia (ISM) Pub Date: 2015-12-01 DOI: 10.1109/ISM.2015.28
Takahiro Hayashi, Masato Ishimori, N. Ishii, K. Abe
Abstract: In this paper, we propose a framework for extending existing matting methods to achieve more reliable alpha estimation. The key idea of the framework is the integration of multiple alpha maps based on their reliabilities. In the proposed framework, the given input image is converted into multiple grayscale images having various luminance appearances. Alpha maps are then generated for these grayscale images using an existing matting method. At the same time, reliability maps (single-channel images visualizing the reliabilities of the estimated alpha values) are generated. Finally, by combining the alpha values with the highest reliabilities in each local region, one reliable alpha map is produced. The experimental results show that reliable alpha estimation can be achieved with the proposed framework.
Citations: 0
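A minimal numpy sketch of the integration step: given several candidate alpha maps and their per-pixel reliability maps (however they were produced), keep the alpha value with the highest reliability at each pixel. The paper combines maps per local region; pixel-wise selection here is a simplification, and the inputs are synthetic stand-ins.

```python
import numpy as np

def fuse_alpha(alpha_maps, reliability_maps):
    """alpha_maps, reliability_maps: arrays of shape (n_maps, H, W).
    Returns an (H, W) alpha map chosen per pixel by maximum reliability."""
    best = np.argmax(reliability_maps, axis=0)            # (H, W) map indices
    return np.take_along_axis(alpha_maps, best[None], axis=0)[0]

rng = np.random.default_rng(1)
alphas = rng.uniform(size=(3, 4, 4))     # three candidate alpha maps
relias = rng.uniform(size=(3, 4, 4))     # matching reliability maps
print(fuse_alpha(alphas, relias).shape)  # (4, 4)
```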
Joint Video and Sparse 3D Transform-Domain Collaborative Filtering for Time-of-Flight Depth Maps
2015 IEEE International Symposium on Multimedia (ISM) Pub Date: 2015-12-01 DOI: 10.1109/ISM.2015.112
T. Hach, Tamara Seybold, H. Böttcher
Abstract: This paper proposes a novel strategy for depth video denoising in RGBD camera systems. Today's depth map sequences obtained by state-of-the-art Time-of-Flight sensors suffer from high temporal noise, so all high-level RGB video renderings based on the accompanying depth map's 3D geometry, such as augmented reality applications, will show severe temporal flickering artifacts. We approach this limitation by decoupling depth map upscaling from the temporal denoising step, so that denoising is performed on raw pixels with uncorrelated pixel-wise noise distributions. Our denoising methodology utilizes joint sparse 3D transform-domain collaborative filtering, in which we extract RGB texture information to yield a more stable and accurate, highly sparse 3D depth block representation for the subsequent shrinkage operation. We show the effectiveness of our method on real RGBD camera data and on a publicly available synthetic dataset. The evaluation reveals that our method is superior to state-of-the-art methods, delivering improved flicker-free depth video streams for future applications that are especially sensitive to temporal noise and arbitrary depth artifacts.
Citations: 1
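A heavily simplified sketch of the guidance idea: match patches in the less noisy RGB frame, then aggregate the co-located depth patches of the best matches. The paper applies a sparse 3D transform and shrinkage to the matched stack; plain averaging below is a stand-in to keep the example short, and all sizes and strides are arbitrary.

```python
import numpy as np

def denoise_depth_patch(rgb, depth, y, x, size=8, search=16, n_best=8):
    """Estimate a denoised depth patch at (y, x) using RGB-guided matching."""
    ref = rgb[y:y+size, x:x+size]
    candidates = []
    for dy in range(-search, search + 1, 4):
        for dx in range(-search, search + 1, 4):
            yy, xx = y + dy, x + dx
            if 0 <= yy <= rgb.shape[0]-size and 0 <= xx <= rgb.shape[1]-size:
                cand = rgb[yy:yy+size, xx:xx+size]
                ssd = np.sum((ref - cand) ** 2)   # RGB-based patch similarity
                candidates.append((ssd, yy, xx))
    candidates.sort(key=lambda c: c[0])           # best RGB matches first
    stack = [depth[yy:yy+size, xx:xx+size] for _, yy, xx in candidates[:n_best]]
    return np.mean(stack, axis=0)                 # collaborative estimate

rng = np.random.default_rng(2)
rgb = rng.uniform(size=(64, 64))                  # grayscale stand-in for RGB
depth = rng.uniform(size=(64, 64)) + 0.1 * rng.normal(size=(64, 64))
print(denoise_depth_patch(rgb, depth, 24, 24).shape)   # (8, 8)
```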
Efficient Multi-training Framework of Image Deep Learning on GPU Cluster
2015 IEEE International Symposium on Multimedia (ISM) Pub Date: 2015-12-01 DOI: 10.1109/ISM.2015.119
Chun-Fu Chen, G. Lee, Yinglong Xia, Wan-Yi Sabrina Lin, T. Suzumura, Ching-Yung Lin
Abstract: In this paper, we develop a pipelining schema for image deep learning on a GPU cluster to distribute the heavy workload of the training procedure. Because a priori knowledge of a suitable deep neural network structure is usually limited, it is often necessary to train multiple models to obtain a good one. Adopting parallel and distributed computing is therefore an obvious path forward, but the mileage varies depending on how amenable a deep network is to parallelization and on the availability of rapid prototyping capabilities with a low cost of entry. In this work, we propose a framework that organizes the training procedures of multiple deep learning models into a pipeline on a GPU cluster, where each stage is handled by a particular GPU with a partition of the training dataset. Instead of frequently migrating data among the disks, CPUs, and GPUs, our framework only moves partially trained models, reducing bandwidth consumption and leveraging the full computation capability of the cluster. We deploy the proposed framework on popular image recognition tasks using deep learning, and the experiments show that the proposed method reduces overall training time by up to dozens of hours compared to the baseline method.
Citations: 7
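A toy simulation of the scheduling idea: S pipeline stages (one GPU and one resident data partition each) and M models that rotate through the stages, so partitions stay put and only models move. The rotation rule and the "training" step are placeholders, not the paper's implementation.

```python
def pipeline_training(n_models=4, n_stages=4, epochs=2):
    """Simulate which model visits which stage (GPU + data partition)."""
    models = [f"model{m}" for m in range(n_models)]
    log = []
    for epoch in range(epochs):
        for step in range(n_stages):
            for stage in range(n_stages):
                # each stage trains its resident partition on the visiting model
                model = models[(step + stage) % n_models]
                log.append((epoch, stage, model, f"partition{stage}"))
    return log

for entry in pipeline_training(n_models=2, n_stages=2, epochs=1):
    print(entry)
# Every model visits every partition once per epoch without moving any data.
```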
Interactive Crowd Content Generation and Analysis Using Trajectory-Level Behavior Learning
2015 IEEE International Symposium on Multimedia (ISM) Pub Date: 2015-12-01 DOI: 10.1109/ISM.2015.89
Sujeong Kim, Aniket Bera, Dinesh Manocha
Abstract: We present an interactive approach for analyzing crowd videos and generating content for multimedia applications. Our formulation combines online tracking algorithms from computer vision, non-linear pedestrian motion models from computer graphics, and machine learning techniques to automatically compute the trajectory-level pedestrian behaviors for each agent in the video. These learned behaviors are used to detect anomalous behaviors, perform crowd replication, augment crowd videos with virtual agents, and segment the motion of pedestrians. We demonstrate the performance of these tasks on indoor and outdoor crowd video benchmarks consisting of tens of human agents; moreover, our algorithm takes less than a tenth of a second per frame on a multi-core PC. The overall approach can handle dense and heterogeneous crowd behaviors and is useful for real-time crowd scene analysis applications.
Citations: 15
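A hedged sketch of one downstream task, anomaly detection: fit a simple Gaussian model over per-agent trajectory features (e.g., mean speed and turning rate) and flag agents far from the crowd norm by Mahalanobis distance. The features and threshold are illustrative, not the paper's learned motion-model parameters.

```python
import numpy as np

def flag_anomalies(features, threshold=3.0):
    """features: (n_agents, d) trajectory descriptors.
    Returns indices of agents whose Mahalanobis distance exceeds threshold."""
    mu = features.mean(axis=0)
    cov = np.cov(features, rowvar=False) + 1e-6 * np.eye(features.shape[1])
    inv = np.linalg.inv(cov)
    centered = features - mu
    d2 = np.einsum("ij,jk,ik->i", centered, inv, centered)  # squared distances
    return np.where(np.sqrt(d2) > threshold)[0]

rng = np.random.default_rng(3)
crowd = rng.normal(loc=[1.2, 0.1], scale=0.1, size=(50, 2))  # normal walkers
crowd[7] = [4.0, 1.5]                     # one agent running and turning hard
print(flag_anomalies(crowd))              # likely [7]
```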
Distortion Estimation Using Structural Similarity for Video Transmission over Wireless Networks
2015 IEEE International Symposium on Multimedia (ISM) Pub Date: 2015-12-01 DOI: 10.1109/ISM.2015.88
Arun Sankisa, A. Katsaggelos, P. Pahalawatta
Abstract: Efficient streaming of video over wireless networks requires real-time assessment of distortion due to packet loss, especially because predictive coding at the encoder can cause inter-frame propagation of errors and impact the overall quality of the transmitted video. This paper presents an algorithm for evaluating the expected receiver-side distortion at the source by utilizing encoder information, transmission channel characteristics, and error concealment. Specifically, distinct video transmission units, Groups of Blocks (GOBs), are iteratively built at the source by taking into account macroblock coding modes and motion-compensated error concealment for three different combinations of packet loss. The distortion of these units is then calculated using the structural similarity (SSIM) metric, and the results are stochastically combined to derive the overall expected distortion. The proposed model provides a more accurate estimate of distortion that closely models quality as perceived by the human visual system. When incorporated into a content-aware utility function, preliminary experimental results show improved packet ordering and scheduling efficiency and improved overall video quality at the receiver.
Citations: 2
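A worked sketch of the expectation step: with a set of per-unit loss scenarios, the expected distortion is the probability-weighted sum of 1 - SSIM over the corresponding reconstructions. The scenario probabilities and SSIM values below are invented; the paper derives its three packet-loss combinations from the channel model.

```python
def expected_distortion(scenarios):
    """scenarios: list of (probability, ssim_of_reconstruction) pairs."""
    assert abs(sum(p for p, _ in scenarios) - 1.0) < 1e-9
    return sum(p * (1.0 - ssim) for p, ssim in scenarios)

scenarios = [
    (0.90, 0.98),   # packet received: near-perfect reconstruction
    (0.08, 0.90),   # lost, concealed from the previous frame
    (0.02, 0.70),   # lost, concealment fails / error propagates
]
print(expected_distortion(scenarios))   # 0.032
```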
Automatic Video Content Summarization Using Geospatial Mosaics of Aerial Imagery
2015 IEEE International Symposium on Multimedia (ISM) Pub Date: 2015-12-01 DOI: 10.1109/ISM.2015.124
R. Viguier, Chung-Ching Lin, H. Aliakbarpour, F. Bunyak, Sharath Pankanti, G. Seetharaman, K. Palaniappan
Abstract: It is estimated that less than five percent of videos are currently analyzed to any degree. In addition to petabyte-sized multimedia archives, continuing innovations in optics, imaging sensors, camera arrays, (aerial) platforms, and storage technologies indicate that, for the foreseeable future, existing and new applications will continue to generate enormous volumes of video imagery. Contextual video summarization and activity maps offer one innovative direction for tackling this Big Data problem in computer vision. The goal of this work is to develop semi-automatic exploitation algorithms and tools that increase utility, dissemination, and usage potential by providing quick dynamic-overview geospatial mosaics and motion maps. We present a framework for summarizing (multiple) video streams from unmanned aerial vehicles (UAVs) or drones, which have very different characteristics from the structured commercial and consumer videos analyzed in the past. Using the geospatial metadata of the video combined with fast low-level image-based algorithms, the proposed method first generates mini-mosaics that can then be combined into geo-referenced meta-mosaic imagery. These geospatial maps enable rapid assessment of hours-long videos with arbitrary spatial coverage from multiple sensors by generating quick-look imagery, composed of multiple mini-mosaics, that summarizes spatiotemporal dynamics such as coverage, dwell time, and activity. The overall summarization pipeline was tested on several DARPA Video and Image Retrieval and Analysis Tool (VIRAT) datasets. We evaluate the effectiveness of the proposed video summarization framework using metrics such as compression and hours of viewing time.
Citations: 15
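A sketch of one mosaicking step: given per-frame homographies into a common geospatial plane (from metadata or image registration), project each frame's corners to size the mini-mosaic canvas. The homographies below are synthetic, and the pixel warping and blending are omitted.

```python
import numpy as np

def mosaic_bounds(homographies, width, height):
    """Compute the mosaic canvas extent covered by all projected frames."""
    corners = np.array([[0, 0, 1], [width, 0, 1],
                        [width, height, 1], [0, height, 1]], dtype=float).T
    pts = []
    for H in homographies:
        proj = H @ corners                 # project corners to the mosaic plane
        proj = proj[:2] / proj[2]          # perspective divide
        pts.append(proj.T)
    pts = np.vstack(pts)
    return pts.min(axis=0), pts.max(axis=0)   # (min_xy, max_xy) canvas extent

# Two frames: identity, and a second frame shifted 500 px east.
H0 = np.eye(3)
H1 = np.array([[1.0, 0.0, 500.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
lo, hi = mosaic_bounds([H0, H1], width=1280, height=720)
print(lo, hi)    # [0. 0.] [1780. 720.]
```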