Latest Publications: 2015 IEEE International Symposium on Multimedia (ISM)

Personalized Indexing of Attention in Lectures -- Requirements and Concept
2015 IEEE International Symposium on Multimedia (ISM) Pub Date: 2015-12-01 DOI: 10.1109/ISM.2015.44
Sebastian Pospiech, N. Birnbaum, L. Knipping, R. Mertens
Abstract: Web lectures can be employed in a variety of didactic scenarios, ranging from an add-on for a live lecture to stand-alone learning content. In all of these scenarios, though less so in the stand-alone one, indexing and navigation are crucial for real-world usability. As a consequence, many approaches have been devised, such as slide-based indexing, transcript-based indexing, collaborative manual indexing, and individual or social indexing based on viewing behavior. The approach proposed in this paper takes individual indexing based on viewing behavior two steps further in that it (a) indexes the recording at production time in the lecture hall and (b) actively analyzes the students' attention focus instead of passively recording viewing time as done in conventional footprinting. Tracking student attention during the lecture requires recording and analyzing the students' behaviour in parallel to the lecture, as well as synchronizing both data streams. This paper discusses the architecture required for personalized attention-based indexing, possible problems, and strategies to tackle them.
Citations: 0
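As an illustration of the indexing step the abstract describes, here is a minimal sketch that turns a per-student attention signal, assumed already synchronized to the recording's clock, into navigable index intervals. The threshold, data layout, and function name are invented for illustration, not taken from the paper.

```python
def attention_index(samples, threshold=0.6):
    """samples: list of (timestamp_seconds, attention_score in [0, 1]),
    assumed synchronized to the lecture recording's clock.
    Returns (start, end) intervals of sustained attention."""
    intervals = []
    start = None
    prev_t = None
    for t, score in samples:
        if score >= threshold and start is None:
            start = t                          # attention rises: open an interval
        elif score < threshold and start is not None:
            intervals.append((start, prev_t))  # attention drops: close it
            start = None
        prev_t = t
    if start is not None:                      # close a trailing open interval
        intervals.append((start, prev_t))
    return intervals

# Example: a student attentive from 10 s to 20 s and again from 50 s on.
samples = [(0, 0.2), (10, 0.8), (20, 0.9), (30, 0.3), (50, 0.7), (60, 0.8)]
print(attention_index(samples))  # [(10, 20), (50, 60)]
```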
Employing Sensors and Services Fusion to Detect and Assess Driving Events
2015 IEEE International Symposium on Multimedia (ISM) Pub Date: 2015-12-01 DOI: 10.1109/ISM.2015.121
Seyed Vahid Hosseinioun, Hussein Al Osman, Abdulmotaleb El Saddik
Abstract: With the remarkable increase in the use of sensors in our daily lives, various methods have been devised to detect events in a driving environment using smartphones, as they provide two main advantages: they eliminate the need for dedicated hardware in vehicles, and they are widely accessible. Since rewarding safe driving is an important issue for insurance companies, some companies are implementing Usage-Based Insurance (UBI) as opposed to traditional history-based plans. The collection of driving events, such as acceleration and turning, is a prerequisite for the adoption of such plans. Mobile phone sensors are capable of detecting whether a car is accelerating or braking, while through service fusion we can detect other events such as speeding or instances of severe weather. We propose a new and robust hybrid classification algorithm that detects acceleration-based events with an F1-score of 0.9304 and turn events with an F1-score of 0.9038. We further propose a method for measuring a driving performance index using the detected events.
Citations: 13
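A minimal sketch of the detection-and-evaluation idea, not the paper's hybrid classifier: a simple threshold detector over a smartphone's longitudinal acceleration trace, scored with the F1 measure the authors report. The threshold and the toy trace are assumptions.

```python
def detect_events(accel, threshold=2.5):
    """accel: per-sample longitudinal acceleration in m/s^2.
    Returns a 0/1 label per sample (1 = acceleration/braking event)."""
    return [1 if abs(a) >= threshold else 0 for a in accel]

def f1_score(truth, pred):
    tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

accel = [0.3, 0.5, 3.1, 3.4, 0.2, -2.8, -0.4]   # synthetic trace
truth = [0, 0, 1, 1, 0, 1, 0]
print(f1_score(truth, detect_events(accel)))     # 1.0 on this toy trace
```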
Exploring the Complementarity of Audio-Visual Structural Regularities for the Classification of Videos into TV-Program Collections
2015 IEEE International Symposium on Multimedia (ISM) Pub Date: 2015-12-01 DOI: 10.1109/ISM.2015.133
G. Sargent, P. Hanna, H. Nicolas, F. Bimbot
Abstract: This article proposes to analyze the structural regularities of the audio and video streams of TV programs and to explore their potential for the classification of videos into program collections. Our approach is based on the spectral analysis of distance matrices representing the short- and long-term dependencies within the audio and visual modalities of a video. We propose to compare two videos by their respective spectral features. We assess the benefits brought by the two modalities to performance in the context of K-nearest-neighbor classification, and we test our approach in the context of an unsupervised clustering algorithm. These evaluations are performed on two datasets of French and Italian TV programs.
Citations: 3
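A hedged sketch of the core representation: build a self-distance matrix over per-frame features, take its eigenvalue spectrum as a structural signature, and compare two videos by signature distance. The feature choice, normalization, and truncation to k eigenvalues are assumptions, not the paper's exact construction.

```python
import numpy as np

def spectral_signature(features, k=10):
    """features: (n_frames, d) array of per-frame descriptors.
    Returns the k largest eigenvalue magnitudes of the self-distance matrix."""
    diff = features[:, None, :] - features[None, :, :]
    dist = np.linalg.norm(diff, axis=2)       # symmetric self-distance matrix
    eigvals = np.linalg.eigvalsh(dist)        # real spectrum of symmetric matrix
    sig = np.sort(np.abs(eigvals))[::-1][:k]
    return sig / (sig[0] + 1e-12)             # scale-invariant signature

# Videos of different lengths still yield comparable fixed-size signatures.
rng = np.random.default_rng(0)
video_a = rng.normal(size=(120, 8))
video_b = rng.normal(size=(150, 8))
d = np.linalg.norm(spectral_signature(video_a) - spectral_signature(video_b))
print(d)   # smaller distance = more similar temporal structure
```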
A Novel Two Pass Rate Control Scheme for Variable Bit Rate Video Streaming
2015 IEEE International Symposium on Multimedia (ISM) Pub Date: 2015-12-01 DOI: 10.1109/ISM.2015.32
M. VenkataPhaniKumar, K. C. R. C. Varma, S. Mahapatra
Abstract: In this paper, a novel two-pass rate control scheme is proposed to achieve consistent visual quality for variable bit rate (VBR) video streaming. The rate-distortion (RD) characteristics of each frame are used to establish a frame complexity model, which is later used along with statistics collected in the first pass to derive an optimal quantization parameter for encoding the frame in the second pass. The experimental results demonstrate that the proposed rate control scheme significantly outperforms the existing rate control mechanism in the Joint Model (JM) reference software in terms of Peak Signal to Noise Ratio (PSNR) and consistent perceptual visual quality while achieving the target bit rate. Further, the proposed scheme is validated through implementation on a miniature test-bed.
Citations: 2
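An illustrative sketch of the generic two-pass idea, not the paper's RD-based complexity model: the first pass measures per-frame complexity (here, bits consumed at a fixed QP), and the second pass nudges each frame's QP so that complex frames receive more bits. The logarithmic mapping and strength parameter are invented.

```python
import math

def second_pass_qp(first_pass_bits, base_qp=30, strength=3.0):
    """first_pass_bits: bits each frame consumed in the first pass at a
    fixed QP, used as a per-frame complexity measure.
    Returns per-frame QPs: more complex frames get a lower (finer) QP."""
    mean_bits = sum(first_pass_bits) / len(first_pass_bits)
    qps = []
    for bits in first_pass_bits:
        ratio = math.log2(bits / mean_bits)   # >0 for above-average complexity
        qp = round(base_qp - strength * ratio)
        qps.append(max(0, min(51, qp)))       # clamp to the H.264 QP range
    return qps

print(second_pass_qp([12000, 30000, 8000, 26000]))   # [32, 28, 34, 29]
```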
Improvement of Image and Video Matting with Multiple Reliability Maps
2015 IEEE International Symposium on Multimedia (ISM) Pub Date: 2015-12-01 DOI: 10.1109/ISM.2015.28
Takahiro Hayashi, Masato Ishimori, N. Ishii, K. Abe
Abstract: In this paper, we propose a framework for extending existing matting methods to achieve more reliable alpha estimation. The key idea of the framework is the integration of multiple alpha maps based on their reliabilities. In the proposed framework, the given input image is converted into multiple grayscale images having various luminance appearances. Alpha maps are then generated for these grayscale images using an existing matting method. At the same time, reliability maps (single-channel images visualizing the reliabilities of the estimated alpha values) are generated. Finally, by combining the alpha values with the highest reliabilities in each local region, one reliable alpha map is produced. The experimental results show that reliable alpha estimation can be achieved with the proposed framework.
Citations: 0
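A minimal numpy sketch of the integration step: given several candidate alpha maps and their per-pixel reliability maps (however they were produced), keep the alpha value with the highest reliability at each pixel. The paper combines maps per local region; pixel-wise selection here is a simplification, and the inputs are synthetic stand-ins.

```python
import numpy as np

def fuse_alpha(alpha_maps, reliability_maps):
    """alpha_maps, reliability_maps: arrays of shape (n_maps, H, W).
    Returns an (H, W) alpha map chosen per pixel by maximum reliability."""
    best = np.argmax(reliability_maps, axis=0)            # (H, W) map indices
    return np.take_along_axis(alpha_maps, best[None], axis=0)[0]

rng = np.random.default_rng(1)
alphas = rng.uniform(size=(3, 4, 4))     # three candidate alpha maps
relias = rng.uniform(size=(3, 4, 4))     # matching reliability maps
print(fuse_alpha(alphas, relias).shape)  # (4, 4)
```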
Joint Video and Sparse 3D Transform-Domain Collaborative Filtering for Time-of-Flight Depth Maps
2015 IEEE International Symposium on Multimedia (ISM) Pub Date: 2015-12-01 DOI: 10.1109/ISM.2015.112
T. Hach, Tamara Seybold, H. Böttcher
Abstract: This paper proposes a novel strategy for depth video denoising in RGBD camera systems. Today's depth map sequences obtained by state-of-the-art Time-of-Flight sensors suffer from high temporal noise, so all high-level RGB video renderings based on the accompanying depth map's 3D geometry, such as augmented reality applications, will show severe temporal flickering artifacts. We approach this limitation by decoupling depth map upscaling from the temporal denoising step, so that denoising is performed on raw pixels with uncorrelated pixel-wise noise distributions. Our denoising methodology utilizes joint sparse 3D transform-domain collaborative filtering, in which we extract RGB texture information to yield a more stable and accurate, highly sparse 3D depth block representation for the subsequent shrinkage operation. We show the effectiveness of our method on real RGBD camera data and on a publicly available synthetic dataset. The evaluation reveals that our method is superior to state-of-the-art methods, delivering improved flicker-free depth video streams for future applications that are especially sensitive to temporal noise and arbitrary depth artifacts.
Citations: 1
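A heavily simplified sketch of the guidance idea: match patches in the less noisy RGB frame, then aggregate the co-located depth patches of the best matches. The paper applies a sparse 3D transform and shrinkage to the matched stack; plain averaging below is a stand-in to keep the example short, and all sizes and strides are arbitrary.

```python
import numpy as np

def denoise_depth_patch(rgb, depth, y, x, size=8, search=16, n_best=8):
    """Estimate a denoised depth patch at (y, x) using RGB-guided matching."""
    ref = rgb[y:y+size, x:x+size]
    candidates = []
    for dy in range(-search, search + 1, 4):
        for dx in range(-search, search + 1, 4):
            yy, xx = y + dy, x + dx
            if 0 <= yy <= rgb.shape[0]-size and 0 <= xx <= rgb.shape[1]-size:
                cand = rgb[yy:yy+size, xx:xx+size]
                ssd = np.sum((ref - cand) ** 2)   # RGB-based patch similarity
                candidates.append((ssd, yy, xx))
    candidates.sort(key=lambda c: c[0])           # best RGB matches first
    stack = [depth[yy:yy+size, xx:xx+size] for _, yy, xx in candidates[:n_best]]
    return np.mean(stack, axis=0)                 # collaborative estimate

rng = np.random.default_rng(2)
rgb = rng.uniform(size=(64, 64))                  # grayscale stand-in for RGB
depth = rng.uniform(size=(64, 64)) + 0.1 * rng.normal(size=(64, 64))
print(denoise_depth_patch(rgb, depth, 24, 24).shape)   # (8, 8)
```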
Efficient Multi-training Framework of Image Deep Learning on GPU Cluster
2015 IEEE International Symposium on Multimedia (ISM) Pub Date: 2015-12-01 DOI: 10.1109/ISM.2015.119
Chun-Fu Chen, G. Lee, Yinglong Xia, Wan-Yi Sabrina Lin, T. Suzumura, Ching-Yung Lin
Abstract: In this paper, we develop a pipelining schema for image deep learning on a GPU cluster to distribute the heavy workload of the training procedure. Because a priori knowledge of a suitable deep neural network structure is usually limited, it is often necessary to train multiple models to obtain a good one. Adopting parallel and distributed computing is therefore an obvious path forward, but the mileage varies depending on how amenable a deep network is to parallelization and on the availability of rapid prototyping capabilities with a low cost of entry. In this work, we propose a framework that organizes the training procedures of multiple deep learning models into a pipeline on a GPU cluster, where each stage is handled by a particular GPU with a partition of the training dataset. Instead of frequently migrating data among the disks, CPUs, and GPUs, our framework only moves partially trained models, reducing bandwidth consumption and leveraging the full computation capability of the cluster. We deploy the proposed framework on popular image recognition tasks using deep learning, and the experiments show that the proposed method reduces overall training time by up to dozens of hours compared to the baseline method.
Citations: 7
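A toy simulation of the scheduling idea: S pipeline stages (one GPU and one resident data partition each) and M models that rotate through the stages, so partitions stay put and only models move. The rotation rule and the "training" step are placeholders, not the paper's implementation.

```python
def pipeline_training(n_models=4, n_stages=4, epochs=2):
    """Simulate which model visits which stage (GPU + data partition)."""
    models = [f"model{m}" for m in range(n_models)]
    log = []
    for epoch in range(epochs):
        for step in range(n_stages):
            for stage in range(n_stages):
                # each stage trains its resident partition on the visiting model
                model = models[(step + stage) % n_models]
                log.append((epoch, stage, model, f"partition{stage}"))
    return log

for entry in pipeline_training(n_models=2, n_stages=2, epochs=1):
    print(entry)
# Every model visits every partition once per epoch without moving any data.
```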
Interactive Crowd Content Generation and Analysis Using Trajectory-Level Behavior Learning
2015 IEEE International Symposium on Multimedia (ISM) Pub Date: 2015-12-01 DOI: 10.1109/ISM.2015.89
Sujeong Kim, Aniket Bera, Dinesh Manocha
Abstract: We present an interactive approach for analyzing crowd videos and generating content for multimedia applications. Our formulation combines online tracking algorithms from computer vision, non-linear pedestrian motion models from computer graphics, and machine learning techniques to automatically compute the trajectory-level pedestrian behaviors for each agent in the video. These learned behaviors are used to detect anomalous behaviors, perform crowd replication, augment crowd videos with virtual agents, and segment the motion of pedestrians. We demonstrate the performance of these tasks on indoor and outdoor crowd video benchmarks consisting of tens of human agents; moreover, our algorithm takes less than a tenth of a second per frame on a multi-core PC. The overall approach can handle dense and heterogeneous crowd behaviors and is useful for real-time crowd scene analysis applications.
Citations: 15
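A hedged sketch of one downstream task, anomaly detection: fit a simple Gaussian model over per-agent trajectory features (e.g., mean speed and turning rate) and flag agents far from the crowd norm by Mahalanobis distance. The features and threshold are illustrative, not the paper's learned motion-model parameters.

```python
import numpy as np

def flag_anomalies(features, threshold=3.0):
    """features: (n_agents, d) trajectory descriptors.
    Returns indices of agents whose Mahalanobis distance exceeds threshold."""
    mu = features.mean(axis=0)
    cov = np.cov(features, rowvar=False) + 1e-6 * np.eye(features.shape[1])
    inv = np.linalg.inv(cov)
    centered = features - mu
    d2 = np.einsum("ij,jk,ik->i", centered, inv, centered)  # squared distances
    return np.where(np.sqrt(d2) > threshold)[0]

rng = np.random.default_rng(3)
crowd = rng.normal(loc=[1.2, 0.1], scale=0.1, size=(50, 2))  # normal walkers
crowd[7] = [4.0, 1.5]                     # one agent running and turning hard
print(flag_anomalies(crowd))              # likely [7]
```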
Distortion Estimation Using Structural Similarity for Video Transmission over Wireless Networks
2015 IEEE International Symposium on Multimedia (ISM) Pub Date: 2015-12-01 DOI: 10.1109/ISM.2015.88
Arun Sankisa, A. Katsaggelos, P. Pahalawatta
Abstract: Efficient streaming of video over wireless networks requires real-time assessment of distortion due to packet loss, especially because predictive coding at the encoder can cause inter-frame propagation of errors and impact the overall quality of the transmitted video. This paper presents an algorithm for evaluating the expected receiver-side distortion at the source by utilizing encoder information, transmission channel characteristics, and error concealment. Specifically, distinct video transmission units, Groups of Blocks (GOBs), are iteratively built at the source by taking into account macroblock coding modes and motion-compensated error concealment for three different combinations of packet loss. The distortion of these units is then calculated using the structural similarity (SSIM) metric, and the results are stochastically combined to derive the overall expected distortion. The proposed model provides a more accurate estimate of distortion that closely models quality as perceived by the human visual system. When incorporated into a content-aware utility function, preliminary experimental results show improved packet ordering and scheduling efficiency and improved overall video quality at the receiver.
Citations: 2
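A worked sketch of the expectation step: with a set of per-unit loss scenarios, the expected distortion is the probability-weighted sum of 1 - SSIM over the corresponding reconstructions. The scenario probabilities and SSIM values below are invented; the paper derives its three packet-loss combinations from the channel model.

```python
def expected_distortion(scenarios):
    """scenarios: list of (probability, ssim_of_reconstruction) pairs."""
    assert abs(sum(p for p, _ in scenarios) - 1.0) < 1e-9
    return sum(p * (1.0 - ssim) for p, ssim in scenarios)

scenarios = [
    (0.90, 0.98),   # packet received: near-perfect reconstruction
    (0.08, 0.90),   # lost, concealed from the previous frame
    (0.02, 0.70),   # lost, concealment fails / error propagates
]
print(expected_distortion(scenarios))   # 0.032
```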
Automatic Video Content Summarization Using Geospatial Mosaics of Aerial Imagery
2015 IEEE International Symposium on Multimedia (ISM) Pub Date: 2015-12-01 DOI: 10.1109/ISM.2015.124
R. Viguier, Chung-Ching Lin, H. Aliakbarpour, F. Bunyak, Sharath Pankanti, G. Seetharaman, K. Palaniappan
Abstract: It is estimated that less than five percent of videos are currently analyzed to any degree. In addition to petabyte-sized multimedia archives, continuing innovations in optics, imaging sensors, camera arrays, (aerial) platforms, and storage technologies indicate that, for the foreseeable future, existing and new applications will continue to generate enormous volumes of video imagery. Contextual video summarization and activity maps offer one innovative direction for tackling this Big Data problem in computer vision. The goal of this work is to develop semi-automatic exploitation algorithms and tools that increase utility, dissemination, and usage potential by providing quick dynamic-overview geospatial mosaics and motion maps. We present a framework for summarizing (multiple) video streams from unmanned aerial vehicles (UAVs) or drones, which have very different characteristics from the structured commercial and consumer videos analyzed in the past. Using the geospatial metadata of the video combined with fast low-level image-based algorithms, the proposed method first generates mini-mosaics that can then be combined into geo-referenced meta-mosaic imagery. These geospatial maps enable rapid assessment of hours-long videos with arbitrary spatial coverage from multiple sensors by generating quick-look imagery, composed of multiple mini-mosaics, that summarizes spatiotemporal dynamics such as coverage, dwell time, and activity. The overall summarization pipeline was tested on several DARPA Video and Image Retrieval and Analysis Tool (VIRAT) datasets. We evaluate the effectiveness of the proposed video summarization framework using metrics such as compression and hours of viewing time.
Citations: 15
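A sketch of one mosaicking step: given per-frame homographies into a common geospatial plane (from metadata or image registration), project each frame's corners to size the mini-mosaic canvas. The homographies below are synthetic, and the pixel warping and blending are omitted.

```python
import numpy as np

def mosaic_bounds(homographies, width, height):
    """Compute the mosaic canvas extent covered by all projected frames."""
    corners = np.array([[0, 0, 1], [width, 0, 1],
                        [width, height, 1], [0, height, 1]], dtype=float).T
    pts = []
    for H in homographies:
        proj = H @ corners                 # project corners to the mosaic plane
        proj = proj[:2] / proj[2]          # perspective divide
        pts.append(proj.T)
    pts = np.vstack(pts)
    return pts.min(axis=0), pts.max(axis=0)   # (min_xy, max_xy) canvas extent

# Two frames: identity, and a second frame shifted 500 px east.
H0 = np.eye(3)
H1 = np.array([[1.0, 0.0, 500.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
lo, hi = mosaic_bounds([H0, H1], width=1280, height=720)
print(lo, hi)    # [0. 0.] [1780. 720.]
```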