2002 IEEE Workshop on Multimedia Signal Processing: Latest Articles

Noise robust hands-free speech recognition using microphone array and Kalman filter as front-end system of conversational TV
2002 IEEE Workshop on Multimedia Signal Processing. Pub Date : 2002-12-09 DOI: 10.1109/MMSP.2002.1203297
M. Fujimoto, Y. Ariki
Abstract: In this paper, we investigate hands-free speech recognition as the front-end system of a conversational TV. The conversational TV is a machine conversation system that retrieves information of interest in response to queries addressed to the TV. To realize natural machine conversation in which the user need not be conscious of a microphone, hands-free speech recognition is required. In the hands-free speech recognition system, the directions of the arriving signals are estimated using a microphone array and the desired signal is enhanced by beamforming. Then, the user's utterance section is detected automatically from the continuously observed signal. Furthermore, by applying noise reduction and noise adaptation, the enhanced speech signal is recognized accurately.
Citations: 3
Video retrieval using an adaptive video indexing technique and automatic relevance feedback
2002 IEEE Workshop on Multimedia Signal Processing. Pub Date : 2002-12-09 DOI: 10.1109/MMSP.2002.1203286
P. Muneesawang, L. Guan
Abstract: This work demonstrates content-based retrieval techniques for video databases using adaptive video indexing (AVI) and a neural network model. The AVI utilizes a "template frequency model" for embedding spatial-temporal contents, which are key to characterizing the time-varying nature of video. This model can naturally be adopted to characterize video at various levels (shot, group, and story), in order to facilitate a multiple-level access video database. The AVI retrieval system achieves excellent retrieval accuracy, substantially higher than that of key-frame based video indexing (KFVI), a popular benchmark for video retrieval. Furthermore, the AVI structure can be integrated into a specialized neural network model to perform automatic relevance feedback retrieval. This offers advantages both in minimizing human-user involvement and in considerably enhancing retrieval accuracy in the context of adaptive systems.
Citations: 6
Wide baseline image registration using prior information
2002 IEEE Workshop on Multimedia Signal Processing. Pub Date : 2002-12-09 DOI: 10.1109/MMSP.2002.1203242
A. Roy-Chowdhury, R. Chellappa, T. Keaton
Abstract: Establishing correspondence between features in two images of the same scene taken from different viewing angles is a challenging problem in image processing and computer vision. However, its solution is an important step in many applications such as wide baseline stereo, 3D model alignment, and the creation of panoramic views. In this paper, we propose a technique for registration of two images of a face obtained from different viewing angles. We show that prior information about the general characteristics of a face, obtained from video sequences of different faces, can be used to design a robust correspondence algorithm. The method works by matching 2D shapes of the different features of the face. A doubly stochastic matrix, representing the probability of match between the features, is derived using the Sinkhorn normalization procedure. The final correspondence is obtained by minimizing the probability of error of a match between the entire constellations of features in the two sets, thus taking into account the global spatial configuration of the features. The method is applied to creating holistic 3D models of a face from partial representations. Although this paper focuses primarily on faces, the algorithm can also be used for other objects with small modifications.
Citations: 2
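
The Sinkhorn normalization named in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `sinkhorn_normalize`, the toy `similarity` matrix, and the iteration and tolerance settings are all assumptions made for illustration. The sketch only shows the general idea of alternately rescaling rows and columns of a strictly positive square matrix until it is approximately doubly stochastic.

```python
def sinkhorn_normalize(scores, iterations=200, tol=1e-9):
    """Alternately normalize rows and columns of a strictly positive
    square matrix until every row and column sums to about 1."""
    n = len(scores)
    m = [[float(v) for v in row] for row in scores]
    for _ in range(iterations):
        # Row step: make each row sum to 1.
        for i in range(n):
            row_sum = sum(m[i])
            m[i] = [v / row_sum for v in m[i]]
        # Column step: make each column sum to 1.
        col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
        for i in range(n):
            m[i] = [m[i][j] / col_sums[j] for j in range(n)]
        # Stop once the rows still sum to 1 after the column step.
        if all(abs(sum(row) - 1.0) < tol for row in m):
            break
    return m


# Hypothetical feature-to-feature similarity scores (illustrative values only).
similarity = [
    [0.9, 0.2, 0.1],
    [0.3, 0.8, 0.2],
    [0.1, 0.3, 0.7],
]
match_prob = sinkhorn_normalize(similarity)
for row in match_prob:
    print([round(v, 3) for v in row])
```

Each entry of `match_prob` can then be read as a soft match probability between feature i in one image and feature j in the other; how the paper turns these probabilities into a final correspondence is described only at the level of the abstract.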
Hidden Markov model for automatic transcription of MIDI signals
2002 IEEE Workshop on Multimedia Signal Processing. Pub Date : 2002-12-09 DOI: 10.1109/MMSP.2002.1203337
Haruto Takeda, N. Saito, Tomoshi Otsuki, M. Nakai, H. Shimodaira, S. Sagayama
Abstract: This paper describes a Hidden Markov Model (HMM)-based method for automatic transcription of MIDI (Musical Instrument Digital Interface) signals of performed music. The problem is formulated as recognition of a given sequence of fluctuating note durations to find the most likely intended note sequence, using modern continuous speech recognition techniques. Combining a stochastic model of deviating note durations and a stochastic grammar representing possible sequences of notes, the maximum likelihood estimate of the note sequence is found using the Viterbi algorithm. The same principle is successfully applied to a joint problem of bar line allocation, time measure recognition, and tempo estimation. Finally, durations of consecutive ηn notes are combined to form a "rhythm vector" representing tempo-free relative durations of the notes and treated in the same framework. Significant improvements compared with conventional "quantization" techniques are shown.
Citations: 27
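
As a rough illustration of the idea in the abstract above (a duration model combined with a note-sequence grammar and decoded with the Viterbi algorithm), the following Python sketch quantizes performed durations to intended note values. It is a simplified toy under stated assumptions, not the paper's model: the tempo is assumed known and fixed (`beat_sec`), deviations are assumed Gaussian with a hand-picked `sigma`, and the bigram "grammar" probabilities are invented for the example.

```python
import math


def viterbi_quantize(durations, note_values, log_init, log_trans,
                     beat_sec=0.5, sigma=0.08):
    """Return the most likely sequence of intended note values (in beats)
    for a list of performed durations (in seconds)."""

    def log_emit(observed, value):
        # Gaussian log-likelihood of the performed duration around the
        # nominal duration of the intended note value.
        z = (observed - value * beat_sec) / sigma
        return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))

    states = range(len(note_values))
    score = [log_init[s] + log_emit(durations[0], note_values[s]) for s in states]
    backpointers = []
    for observed in durations[1:]:
        previous = score
        pointers, score = [], []
        for s in states:
            # Best predecessor of state s under the transition "grammar".
            best = max(states, key=lambda p: previous[p] + log_trans[p][s])
            pointers.append(best)
            score.append(previous[best] + log_trans[best][s]
                         + log_emit(observed, note_values[s]))
        backpointers.append(pointers)

    # Backtrack from the best final state.
    state = max(states, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [note_values[s] for s in path]


# Hypothetical setup: eighth, quarter, and half notes measured in beats.
note_values = [0.5, 1.0, 2.0]
log_init = [math.log(1.0 / 3.0)] * 3
# Invented bigram probabilities standing in for a rhythm grammar.
log_trans = [[math.log(p) for p in row] for row in (
    [0.6, 0.3, 0.1],
    [0.3, 0.5, 0.2],
    [0.2, 0.5, 0.3],
)]
performed = [0.52, 0.48, 0.26, 0.24, 1.03]  # seconds, at 0.5 s per beat
transcribed = viterbi_quantize(performed, note_values, log_init, log_trans)
print(transcribed)
```

With these toy numbers the durations are unambiguous, so the decoded sequence matches simple rounding; the grammar term matters when a duration falls between two note values.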
Musical query-by-description as a multiclass learning problem
2002 IEEE Workshop on Multimedia Signal Processing. Pub Date : 2002-12-09 DOI: 10.1109/MMSP.2002.1203270
B. Whitman, R. Rifkin
Abstract: We present the query-by-description (QBD) component of "Kandem", a time-aware music retrieval system. The QBD system we describe learns a relation between descriptive text concerning a musical artist and their actual acoustic output, making such queries as "Play me something loud with an electronic beat" possible by merely analyzing the audio content of a database. We show a novel machine learning technique based on regularized least-squares classification (RLSC) that can quickly and efficiently learn the non-linear relation between descriptive language and audio features by treating the problem as a large number of possible output classes linked to the same set of input features. We show how the RLSC training can easily eliminate irrelevant labels.
Citations: 51
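
The point made in the abstract above, that many output labels can share one set of input features, can be sketched with kernel regularized least-squares classification: the regularized kernel matrix is built once and the same linear system is solved for every label column. The Python below is a minimal sketch under assumptions, not the system described in the paper: the Gaussian kernel, the values of `lam` and `sigma`, the two-dimensional toy features, and the label names "loud" and "electronic" are all invented for illustration.

```python
import math


def gaussian_kernel(x, z, sigma=1.0):
    d2 = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def solve(a, b):
    """Solve A X = B (B may have several columns) by Gauss-Jordan
    elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def train_rlsc(xs, labels, lam=0.01, sigma=1.0):
    """Solve (K + lam * n * I) C = Y once for all label columns of Y."""
    n = len(xs)
    k = [[gaussian_kernel(xi, xj, sigma) for xj in xs] for xi in xs]
    for i in range(n):
        k[i][i] += lam * n
    return solve(k, labels)


def predict(xs, coeffs, x, sigma=1.0):
    """Return one real-valued score per label; the sign is the decision."""
    kx = [gaussian_kernel(xi, x, sigma) for xi in xs]
    return [sum(kx[i] * coeffs[i][t] for i in range(len(xs)))
            for t in range(len(coeffs[0]))]


# Toy audio features and two hypothetical description labels (+1 / -1):
# column 0 = "loud", column 1 = "electronic".
features = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
labels = [[-1.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [1.0, 1.0]]
coeffs = train_rlsc(features, labels)
scores = predict(features, coeffs, (0.9, 0.1))
print(scores)
```

Adding another descriptive term only adds a column to `labels`; the kernel matrix and the elimination work are shared across all columns, which is the efficiency argument the abstract alludes to.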
Eyeball Video Communications Platform
2002 IEEE Workshop on Multimedia Signal Processing. Pub Date : 2002-12-09 DOI: 10.1109/MMSP.2002.1203329
J. Vass, Shahadat Khan
Abstract: Eyeball Video Communications Platform (VCP) provides a comprehensive solution for video communications, instant messaging, remote collaboration and application development. Eyeball VCP supports one-to-one and many-to-many video communications and collaboration utilizing peer-to-peer data transport without employing any reflector service. This structure is not only cost effective but also provides minimal delay. Eyeball VCP is based on two key technologies: Eyeball Any-Bandwidth Technology and Eyeball Any-Firewall Technology. Eyeball Any-Bandwidth Technology guarantees the best possible audio-video quality for broadband, narrowband and wireless connections. Eyeball Any-Firewall Technology ensures that media can pass through both corporate and personal firewalls with minimal configuration without compromising security. Eyeball VCP is targeted at the following markets: application developers, Internet and communications service providers, and medium and large enterprises. Eyeball VCP won several industry awards, including the Best of Show Award in Internet World, Product of the Year from the Communications ASP Magazine, and The Editor's Choice Award from the Internet Telephony Magazine.
Citations: 0
Video quality objective metric using data hiding
2002 IEEE Workshop on Multimedia Signal Processing. Pub Date : 2002-12-09 DOI: 10.1109/MMSP.2002.1203346
Mylène C. Q. Farias, S. Mitra, M. Carli
Abstract: In this paper, a no-reference objective video quality metric is proposed. The quality metric is obtained by means of a non-conventional use of a data hiding technique. Test data are embedded in an MPEG-2 video; the basic assumption is that the embedded data undergo the same degradation as the host video. To analyze the performance of the system, the results obtained using this metric were compared with perceived mean annoyance values. The annoyance values were obtained through a psychophysical experiment, which measured the threshold and mean annoyance values of compressed videos.
Citations: 13
A new switch scheduling algorithm to improve QoS in the multimedia router
2002 IEEE Workshop on Multimedia Signal Processing. Pub Date : 2002-12-09 DOI: 10.1109/MMSP.2002.1203324
María Blanca Caminero, C. Carrión, F. Quiles, J. Duato, S. Yalamanchili
Abstract: The multimedia router (MMR) is aimed at providing QoS to multimedia flows, which coexist with conventional best-effort traffic, by means of a single-chip, compact router designed for cluster and local area environments. As the router is based on a multiplexed crossbar, hardware-efficient link and switch scheduling algorithms are needed. Their goal is to achieve high utilization while guaranteeing the QoS needed by the multimedia connections. This work presents a novel switch scheduling algorithm, the candidate conflict arbiter (CCA), that can be efficiently implemented in the MMR. Simulation results show that this proposal outperforms previous algorithms in terms of maximum throughput achieved while still providing QoS to the multimedia flows.
Citations: 4
Combining stereo and visual hull information for on-line reconstruction and rendering of dynamic scenes
2002 IEEE Workshop on Multimedia Signal Processing. Pub Date : 2002-12-09 DOI: 10.1109/MMSP.2002.1203235
Ming Li, H. Schirmacher, M. Magnor, H. Seidel
Abstract: In this paper, we present a novel system which combines depth-from-stereo and visual hull reconstruction for acquiring dynamic real-world scenes at interactive rates. First, we use the silhouettes from multiple views to construct a polyhedral visual hull, which is then used to limit the disparity range during depth-from-stereo computation. The restricted search range improves both the speed and the quality of the stereo reconstruction. In return, stereo information can compensate for some of the limitations of the visual hull method, such as the inability to reconstruct surface details and concave regions. Our system achieves a reconstruction frame rate of 4 fps.
Citations: 58
An edge and texture preserving algorithm for video error concealment
2002 IEEE Workshop on Multimedia Signal Processing. Pub Date : 2002-12-09 DOI: 10.1109/MMSP.2002.1203263
S. Belfiore, Marco Grangetto, E. Magli, G. Olmo
Abstract: We present a novel error concealment algorithm for block-based video transmission over error-prone networks. We develop a spatial error concealment technique, which combines edge-preserving interpolation with texture analysis and synthesis, providing a reconstruction of lost macroblocks optimized for visual perception. In particular, the algorithm recovers image edges by coarse-to-fine MAP estimation with a Markov random field prior, and replenishes lost textured areas with a texture synthesized from neighboring macroblocks. Experimental results show that texture synthesis improves the visual quality of the reconstructed area compared with other state-of-the-art spatial concealment techniques.
Citations: 3