2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB): Latest Papers

Pitch Marking Using the Fundamental Signal for Speech Modifications via TDPSOLA
F. Ykhlef, L. Bendaouia
{"title":"Pitch Marking Using the Fundamental Signal for Speech Modifications via TDPSOLA","authors":"F. Ykhlef, L. Bendaouia","doi":"10.1109/ISM.2013.28","DOIUrl":"https://doi.org/10.1109/ISM.2013.28","url":null,"abstract":"The quality of synthetic speech offered by pitch and duration modifications via Time Domain Pitch Synchronous Overlap Add method (TD-PSOLA) relies on an accurate positioning of pitch marks. In this paper, we propose a new pitch marking technique of voiced regions based on the fundamental signal of the speech waveform. By using the valleys of the fundamental signal, we locate a set of precise intervals where the exact instants of pitch marks are expected to be found. The fundamental signal is composed only of the fundamental frequency (pitch) of speech. It is represented by a specific signal named \"mean based signal\" (MBS). The optimal pitch marks are found by extracting the set of global peak instants within the obtained intervals. To improve the performance of the proposed technique, we have proposed a post processing stage which allows us to correct the erroneous pitch marks that may occur due to some synchronization problems. The proposed technique is evaluated on the CMU ARCTIC database by using objective and subjective measures. The experiments demonstrate that the proposed technique allows pitch and duration modifications via TD-PSOLA with high quality.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"57 1","pages":"118-124"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80227193","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
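Editorial sketch for the pitch marking entry above: the abstract describes locating intervals between valleys of the fundamental signal and taking the global peak of the speech waveform inside each interval. The Python below is only a minimal illustration of that one step, under stated assumptions. The function names `local_minima` and `pitch_marks` are hypothetical, the fundamental signal is assumed to be given already (the paper's "mean based signal" computation is not reproduced), and the post-processing correction stage is omitted.

```python
def local_minima(signal):
    """Return indices of valleys (local minima) in a 1-D sequence."""
    return [
        i
        for i in range(1, len(signal) - 1)
        if signal[i - 1] > signal[i] <= signal[i + 1]
    ]


def pitch_marks(speech, fundamental):
    """Pick one pitch mark per interval between consecutive valleys.

    Assumption: `speech` and `fundamental` are equal-length sequences,
    and the mark is the index of the largest speech sample in each interval.
    """
    valleys = local_minima(fundamental)
    marks = []
    for start, end in zip(valleys, valleys[1:]):
        # Index of the global peak of the speech samples in [start, end).
        peak = max(range(start, end), key=lambda i: speech[i])
        marks.append(peak)
    return marks
```

A real implementation would also need voiced/unvoiced detection, which this sketch does not attempt.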
An Improvement in Media Discovery Service Using Name Spotting
Manish Goswami, Lan Yang
{"title":"An Improvement in Media Discovery Service Using Name Spotting","authors":"Manish Goswami, Lan Yang","doi":"10.1109/ISM.2013.83","DOIUrl":"https://doi.org/10.1109/ISM.2013.83","url":null,"abstract":"Digital Object Repository in the Digital Object Architecture stores a large number of audio/video media files. Lack of metadata in audio/video media files limits the media discovery service in Digital Object Architecture from searching those media files. In this paper we designed a system that uses a name spotting module to extract the names, stores the extracted names with audio/video media files, simulates the media discovery service and reports the findings related to the improvement in searching the media file.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"10 1","pages":"427-432"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90252177","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Predicting Key Recognition Difficulty in Polyphonic Audio
C. Chuan, Aleksey Charapko
{"title":"Predicting Key Recognition Difficulty in Polyphonic Audio","authors":"C. Chuan, Aleksey Charapko","doi":"10.1109/ISM.2013.82","DOIUrl":"https://doi.org/10.1109/ISM.2013.82","url":null,"abstract":"In this paper, we present statistical models to predict the difficulty of recognizing musical keys from polyphonic audio signals. Automatic audio key finding has been studied for many years, and various approaches have been proposed and reported. Reports of these methods' performance are usually based on the proposers' own data sets. Without details on the data set, i.e., how challenging the data set is, directly comparing the effectiveness of these methods is not meaningful or even possible. Thus, in this study we focus on predicting the difficulty level of key recognition as perceived by human experts. Given an audio recording, represented as the extracted acoustic features, we apply multiple linear regression and proportional odds model to predict the difficulty level of the recording, annotated by experts as an integer on a 5-point Likert scale. We use four metrics to evaluate our prediction results: root mean square error, Pearson correlation coefficient, exact accuracy, and adjacent accuracy. We also examine the difference between experts' annotations and discuss their consistency.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"3 1","pages":"421-426"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83620748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
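Editorial sketch for the key recognition entry above: the abstract names four evaluation metrics (root mean square error, Pearson correlation coefficient, exact accuracy, adjacent accuracy) for predictions on a 5-point Likert scale. The Python below shows one plausible way to compute them. The abstract does not define the metrics precisely, so two choices here are assumptions: continuous predictions are rounded to the nearest level in 1..5, and "adjacent" means within one level of the expert annotation. All function names are hypothetical.

```python
import math


def to_level(pred, low=1, high=5):
    """Round a continuous prediction to the nearest Likert level (assumed)."""
    return min(max(int(math.floor(pred + 0.5)), low), high)


def rmse(y_true, y_pred):
    """Root mean square error between annotations and raw predictions."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def exact_accuracy(y_true, y_pred):
    """Fraction of recordings whose rounded prediction equals the annotation."""
    return sum(t == to_level(p) for t, p in zip(y_true, y_pred)) / len(y_true)


def adjacent_accuracy(y_true, y_pred):
    """Fraction of recordings whose rounded prediction is within one level."""
    return sum(abs(t - to_level(p)) <= 1 for t, p in zip(y_true, y_pred)) / len(y_true)
```

Adjacent accuracy is always at least as large as exact accuracy, which makes it a useful tolerance measure when neighbouring difficulty levels are hard to separate.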
Accurate Detection of Moving Objects in Traffic Video Streams over Limited Bandwidth Networks
Bo-Hao Chen, Shih-Chia Huang
{"title":"Accurate Detection of Moving Objects in Traffic Video Streams over Limited Bandwidth Networks","authors":"Bo-Hao Chen, Shih-Chia Huang","doi":"10.1109/ISM.2013.20","DOIUrl":"https://doi.org/10.1109/ISM.2013.20","url":null,"abstract":"Automated detection of moving objects is an essential task for any intelligent transportation system. However, conventional motion detection techniques often suffer from the loss of moving objects due to bit-rate variation in video streams transmitted via wireless video communication systems. To achieve motion detection that is both reliable and accurate in video streams of variable bit-rate, this paper proposes a novel motion detection approach which is based on grey relational analysis, and which integrates a multi-quality background generation module and a moving object detection module. As our experimental results demonstrate, the proposed approach attained superior motion detection performance compared to other state-of-the-art techniques based on qualitative and quantitative evaluations. Quantitative evaluations produced F1 and Similarity accuracy scores for the proposed approach that were up to 59.96% and 55.42% higher than those of the other compared techniques, respectively.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"13 1","pages":"69-75"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85439868","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
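Editorial sketch for the motion detection entry above: the abstract reports F1 and Similarity scores without defining them. The Python below assumes the definitions commonly used for binary foreground masks, namely F1 as the harmonic mean of precision and recall, and Similarity as tp / (tp + fp + fn). Both definitions and the function name `detection_scores` are assumptions, not taken from the paper.

```python
def detection_scores(pred_mask, true_mask):
    """Return (f1, similarity) for two binary masks given as nested lists."""
    tp = fp = fn = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1  # foreground pixel detected correctly
            elif p and not t:
                fp += 1  # background pixel flagged as foreground
            elif t and not p:
                fn += 1  # foreground pixel missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    similarity = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return f1, similarity
```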
Nested Event Model for Multimedia Narratives
Ricardo Rios M. do Carmo, L. Soares, M. Casanova
{"title":"Nested Event Model for Multimedia Narratives","authors":"Ricardo Rios M. do Carmo, L. Soares, M. Casanova","doi":"10.1109/ISM.2013.26","DOIUrl":"https://doi.org/10.1109/ISM.2013.26","url":null,"abstract":"The proliferation of multimedia narratives has contributed to what is known as the \"crisis of choice\", which demands a much more active participation on the part of the user to consume multimedia content. To address this issue, a strategy is to offer users efficient search mechanisms, sometimes based on ontologies. However, one may argue that such mechanisms are often based on abstractions that do not adequately capture the essential aspects of multimedia narratives. This paper proposes a conceptual model to specify multimedia narratives that overcomes this limitation. The model is based on the notion of event and is therefore called Nested Event Model (NEMo). The paper also includes a complete example to illustrate the use of the model.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"4 1","pages":"106-113"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79975464","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Improving Computational Efficiency of 3D Point Cloud Reconstruction from Image Sequences
Chih-Hsiang Chang, N. Kehtarnavaz
{"title":"Improving Computational Efficiency of 3D Point Cloud Reconstruction from Image Sequences","authors":"Chih-Hsiang Chang, N. Kehtarnavaz","doi":"10.1109/ISM.2013.101","DOIUrl":"https://doi.org/10.1109/ISM.2013.101","url":null,"abstract":"The Levenberg-Marquardt optimization normally used in 3D point cloud reconstruction from image sequences is computationally expensive. This paper presents a two-stage camera pose estimation approach where an initial camera pose is obtained during the first stage and a refinement is performed during the second stage. This approach does not require the use of the Levenberg-Marquardt optimization and LU matrix decomposition for computing the projection matrix, thus providing a more computationally efficient 3D point cloud reconstruction as compared to the existing approaches. The results obtained using real video sequences indicate that the introduced approach generates lower re-projection errors as well as faster 3D point cloud reconstruction.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"49 6 1","pages":"510-513"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79738347","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
A Cross-Stack Predictive Control Framework for Multimedia Applications
Guangyi Cao, A. Ravindran, S. Kamalasadan, B. Joshi, A. Mukherjee
{"title":"A Cross-Stack Predictive Control Framework for Multimedia Applications","authors":"Guangyi Cao, A. Ravindran, S. Kamalasadan, B. Joshi, A. Mukherjee","doi":"10.1109/ISM.2013.77","DOIUrl":"https://doi.org/10.1109/ISM.2013.77","url":null,"abstract":"We demonstrate a novel cross-stack control theoretic approach in designing a predictive controller that can automatically track changes in the multimedia workload to maintain a desired metric of application quality while minimizing power consumption.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"31 1","pages":"403-404"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75287440","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Low Complexity Video Encoding and High Complexity Decoding for UAV Reconnaissance and Surveillance
Malavika Bhaskaranand, J. Gibson
{"title":"Low Complexity Video Encoding and High Complexity Decoding for UAV Reconnaissance and Surveillance","authors":"Malavika Bhaskaranand, J. Gibson","doi":"10.1109/ISM.2013.34","DOIUrl":"https://doi.org/10.1109/ISM.2013.34","url":null,"abstract":"Conventional video compression schemes such as H.264/AVC use a high complexity encoder with block motion estimation (ME) and a low complexity, low latency decoder. However, unmanned aerial vehicle (UAV) reconnaissance and surveillance applications require low complexity encoders but can accommodate high complexity decoders. Moreover, the video sequences in these applications often primarily have global motion due to the known movement of the UAV and camera mounts. Motivated by this scenario, we propose and investigate a low complexity encoder with global motion based frame prediction and no block ME. For fly-over videos, our encoder achieves more than a 40% bit rate savings over a H.264 encoder with ME block size restricted to 8 × 8 and at lower complexity. We also develop a high complexity decoder based on Kalman filtering along motion trajectories and show average PSNR improvements of up to 0.5 dB with respect to a classic low complexity decoder.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"24 1","pages":"163-170"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75328108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
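Editorial sketch for the UAV video coding entry above: the abstract reports decoder gains as average PSNR improvements in dB. For readers unfamiliar with the measure, the Python below computes the standard peak signal-to-noise ratio, 10 * log10(peak^2 / MSE), for two frames given as nested lists of pixel values. The function name `psnr` and the 8-bit peak value of 255 are assumptions; the paper's exact evaluation setup is not described in the abstract.

```python
import math


def psnr(reference, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    squared_error = 0.0
    count = 0
    for ref_row, dec_row in zip(reference, decoded):
        for r, d in zip(ref_row, dec_row):
            squared_error += (r - d) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical frames: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)
```

Because the scale is logarithmic, a 0.5 dB gain corresponds to roughly an 11% reduction in mean squared error.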
Efficient Super Resolution Using Edge Directed Unsharp Masking Sharpening Method
Kuo-Shiuan Peng, F. Lin, Yi-Pai Huang, H. Shieh
{"title":"Efficient Super Resolution Using Edge Directed Unsharp Masking Sharpening Method","authors":"Kuo-Shiuan Peng, F. Lin, Yi-Pai Huang, H. Shieh","doi":"10.1109/ISM.2013.100","DOIUrl":"https://doi.org/10.1109/ISM.2013.100","url":null,"abstract":"This paper investigated the potential of the real-time implementation in single image super resolution using edge directed unsharp masking sharpening (EDUMS) method. To achieve efficient real-time implementation with unsharp masking sharpening, the resolution enhancement process needed only simple filtering operations without iterations. Also, with edge directed information as the prior of the unsharp masking sharpening method, the jaggy artifact was efficiently suppressed. Clear edge structures and vivid details of high resolution images with minimum artifacts were presented by the proposed method.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"65 1","pages":"508-509"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73364173","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
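Editorial sketch for the super resolution entry above: the abstract builds on unsharp masking, which needs only simple filtering and no iterations. The Python below illustrates plain (classic) unsharp masking, sharpened = image + amount * (image - blurred), using a 3x3 box blur with replicated borders. It is not the paper's edge directed variant (EDUMS): the edge-direction prior is omitted, and the box blur, the `amount` parameter, and the function names are assumptions chosen for illustration.

```python
def box_blur(image):
    """3x3 mean filter with border pixels replicated at the edges."""
    height, width = len(image), len(image[0])
    blurred = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), height - 1)
                    xx = min(max(x + dx, 0), width - 1)
                    total += image[yy][xx]
            blurred[y][x] = total / 9.0
    return blurred


def unsharp_mask(image, amount=1.0):
    """Add back the high-frequency residual (image minus its blur)."""
    blurred = box_blur(image)
    return [
        [pixel + amount * (pixel - blur) for pixel, blur in zip(row, blur_row)]
        for row, blur_row in zip(image, blurred)
    ]
```

Flat regions are left unchanged while values on either side of an edge are pushed apart, which is why the output is not clamped here; a display pipeline would normally clip it to the valid pixel range.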
Recognition of Action in Broadcast Basketball Videos on the Basis of Global and Local Pairwise Representation
Masaki Takahashi, M. Naemura, Mahito Fujii, J. Little
{"title":"Recognition of Action in Broadcast Basketball Videos on the Basis of Global and Local Pairwise Representation","authors":"Masaki Takahashi, M. Naemura, Mahito Fujii, J. Little","doi":"10.1109/ISM.2013.32","DOIUrl":"https://doi.org/10.1109/ISM.2013.32","url":null,"abstract":"A new feature-representation method for recognizing actions in broadcast videos, which focuses on the relationship between human actions and camera motions, is proposed. With this method, key point trajectories are extracted as motion features in spatio-temporal sub-regions called \"spatio-temporal multiscale bags\" (STMBs). Global representations and local representations from one sub-region in the STMBs are then combined to create a \"glocal pair wise representation\" (GPR). The GPR considers the co-occurrence of camera motions and human actions. Finally, two-stage SVM classifiers are trained with STMB-based GPRs, and specified human actions in video sequences are identified. It was experimentally confirmed that the proposed method can robustly detect specific human actions in broadcast basketball videos.","PeriodicalId":6311,"journal":{"name":"2013 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB)","volume":"37 1","pages":"147-154"},"PeriodicalIF":0.0,"publicationDate":"2013-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79733423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5