2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03): Latest Papers

A missing feature approach to instrument identification in polyphonic music
J. Eggink, Guy J. Brown
{"title":"A missing feature approach to instrument identification in polyphonic music","authors":"J. Eggink, Guy J. Brown","doi":"10.1109/ICASSP.2003.1200029","DOIUrl":"https://doi.org/10.1109/ICASSP.2003.1200029","url":null,"abstract":"Gaussian mixture model (GMM) classifiers have been shown to give good instrument recognition performance for monophonic music played by a single instrument. However, many applications (such as automatic music transcription) require instrument identification from polyphonic, multi-instrumental recordings. We address this problem by incorporating ideas from missing feature theory into a GMM classifier. Specifically, frequency regions that are dominated by energy from an interfering tone are marked as unreliable and excluded from the classification process. This approach has been evaluated on random two-tone chords and an excerpt from a commercially available compact disc, with promising results.","PeriodicalId":104473,"journal":{"name":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).","volume":"112 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115182297","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 60
Language modeling and transcription of the TED corpus lectures
Erwin Leeuwis, Marcello Federico, M. Cettolo
{"title":"Language modeling and transcription of the TED corpus lectures","authors":"Erwin Leeuwis, Marcello Federico, M. Cettolo","doi":"10.1109/ICASSP.2003.1198760","DOIUrl":"https://doi.org/10.1109/ICASSP.2003.1198760","url":null,"abstract":"Transcribing lectures is a challenging task, both in acoustic and in language modeling. In this work, we present our first results on the automatic transcription of lectures from the TED corpus, recently released by ELRA and LDC. In particular, we concentrated our effort on language modeling. Baseline acoustic and language models were developed using respectively 8 hours of TED transcripts and various types of texts: conference proceedings, lecture transcripts, and conversational speech transcripts. Then, adaptation of the language model to single speakers was investigated by exploiting different kinds of information: automatic transcripts of the talk, the title of the talk, the abstract and, finally, the paper. In the last case, a 39.2% WER was achieved.","PeriodicalId":104473,"journal":{"name":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).","volume":"251 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116069769","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 70
Differential learning and random walk model
Seungjin Choi
{"title":"Differential learning and random walk model","authors":"Seungjin Choi","doi":"10.1109/ICASSP.2003.1202468","DOIUrl":"https://doi.org/10.1109/ICASSP.2003.1202468","url":null,"abstract":"This paper presents a learning algorithm for differential decorrelation, the goal of which is to find a linear transform that minimizes the concurrent change of associated output nodes. First the algorithm is derived from the minimization of the objective function which measures the differential correlation. Then we show that the differential decorrelation learning algorithm can also be derived in the framework of maximum likelihood estimation of a linear generative model with assuming a random walk model for latent variables. Algorithm derivation and local stability analysis are given with a simple numerical example.","PeriodicalId":104473,"journal":{"name":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116127660","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
A note on state estimation as a convex optimization problem
Thomas Bo Schön, F. Gustafsson, A. Hansson
{"title":"A note on state estimation as a convex optimization problem","authors":"Thomas Bo Schön, F. Gustafsson, A. Hansson","doi":"10.1109/ICASSP.2003.1201618","DOIUrl":"https://doi.org/10.1109/ICASSP.2003.1201618","url":null,"abstract":"The Kalman filter computes the maximum a posteriori (MAP) estimate of the states for linear state space models with Gaussian noise. We interpret the Kalman filter as the solution to a convex optimization problem, and show that we can generalize the MAP state estimator to any noise with a log-concave density function and any combination of linear equality and convex inequality constraints on the states. We illustrate the principle on a hidden Markov model, where the state vector contains probabilities that are positive and sum to one.","PeriodicalId":104473,"journal":{"name":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116128008","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 25
Motion field discontinuity classification for tensor-based optical flow estimation
Hai-Yun Wang, K. Ma
{"title":"Motion field discontinuity classification for tensor-based optical flow estimation","authors":"Hai-Yun Wang, K. Ma","doi":"10.1109/ICASSP.2003.1199562","DOIUrl":"https://doi.org/10.1109/ICASSP.2003.1199562","url":null,"abstract":"A much more accurate classification scheme is proposed for structure tensor-based optical flow estimation to address the difficulties of interpreting motion field discontinuities. The key novelties of this approach are: (1) a scale-adaptive spatio-temporal filter; (2) a weighted structure tensor; (3) confidence measurements. Multiple motions of moving objects are matched by utilizing a spatio-temporal Gaussian filter with adaptive scale selection, which is steered by the condition number. To capture the neighborhood structure of local discontinuities, weighting the structure tensors is attempted. A new normalization function is exploited to facilitate accurate thresholding for confidence measurements. Experimental results demonstrate that these three novelties together effectively contribute much improved performance on motion field discontinuity classification compared with that of existing methods.","PeriodicalId":104473,"journal":{"name":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122357513","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Channel equalization and the Bayes point machine
E. Harrington, Jyrki Kivinen, R. C. Williamson
{"title":"Channel equalization and the Bayes point machine","authors":"E. Harrington, Jyrki Kivinen, R. C. Williamson","doi":"10.1109/ICASSP.2003.1202687","DOIUrl":"https://doi.org/10.1109/ICASSP.2003.1202687","url":null,"abstract":"Equalizers trained with a large margin have an ability to better handle noise in unseen data and drift in the target solution. We present a method of approximating the Bayes optimal strategy which provides a large margin equalizer, the Bayes point equalizer. The method we use to estimate the Bayes point is to average N equalizers that are run on independently chosen subsets of the data. To better estimate the Bayes point we investigated two methods to create diversity amongst the N equalizers. We show experimentally that the Bayes point equalizer for appropriately large step sizes offers improvement on LMS and LMA in the presence of channel noise and training sequence errors. This allows for shorter training sequences albeit with higher computational requirements.","PeriodicalId":104473,"journal":{"name":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122820791","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Transient signal detection using overcomplete wavelet transform and high-order statistics
C. Ioana, A. Quinquis
{"title":"Transient signal detection using overcomplete wavelet transform and high-order statistics","authors":"C. Ioana, A. Quinquis","doi":"10.1109/ICASSP.2003.1201715","DOIUrl":"https://doi.org/10.1109/ICASSP.2003.1201715","url":null,"abstract":"We consider the problem of transient signal detection, followed by a virtual characterization stage. There are two main difficulties which appear in this field. The first one is due to the noise which acts in a real environment. Secondly, when we are interested in signal characterization, it is important to provide more complete information about its time-frequency behavior. Consequently, we propose an adaptive time-frequency method based on the overcomplete wavelet transform concept, in which case an irregular sampling procedure is involved. This procedure uses a method based on the fourth order moment, applied for each sub-band, in order to establish the optimal weight for each sample. The results obtained for real data prove the capability of the proposed approach to detect a transient signal accurately, compared with some classical methods (spectrogram or standard wavelet transform, for example).","PeriodicalId":104473,"journal":{"name":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122937380","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
Blind identification of MIMO systems by a system to HOS based inverse filter relationship
Chong-Yung Chi, Ching-Yung Chen, Chii-Horng Chen
{"title":"Blind identification of MIMO systems by a system to HOS based inverse filter relationship","authors":"Chong-Yung Chi, Ching-Yung Chen, Chii-Horng Chen","doi":"10.1109/ICASSP.2003.1202635","DOIUrl":"https://doi.org/10.1109/ICASSP.2003.1202635","url":null,"abstract":"Higher-order statistics based inverse filter criteria (HOS-IFC) proposed by Tugnait (1997) and Chi et al. (2002) have been widely used for blind identification and deconvolution of multiple-input multiple-output (MIMO) linear time-invariant systems with a set of nonGaussian measurements. Based on a relationship, that holds true for finite signal-to-noise ratio, between the optimum inverse filter associated with the HOS-IFC and the unknown MIMO system, an iterative FFT-based blind system identification (BSI) algorithm for MIMO systems is proposed in this paper, for which common subchannel zeros are allowed and the system order information is never needed, and meanwhile its performance is superior to the performance of Tugnait's HOS-IFC approach. Some simulation results are presented to support the efficacy of the proposed BSI algorithm.","PeriodicalId":104473,"journal":{"name":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114271480","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Stereoscopic panoramic video generation using centro-circular projection technique
C. Weerasinghe, W. Li, P. Ogunbona
{"title":"Stereoscopic panoramic video generation using centro-circular projection technique","authors":"C. Weerasinghe, W. Li, P. Ogunbona","doi":"10.1109/ICASSP.2003.1199514","DOIUrl":"https://doi.org/10.1109/ICASSP.2003.1199514","url":null,"abstract":"The paper presents a method of stereoscopic panoramic video generation including techniques for panorama projection, stitching and calibration for various depth planes. The methods described can be used on video sequences captured by an arrangement of multiple pairs of cameras or multiple stereoscopic cameras mounted on a regular polygonal shaped camera rig. Algorithms can also be used in combination or separately, for generating both stereoscopic and monoscopic video and still panoramas.","PeriodicalId":104473,"journal":{"name":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114413108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Noise variance in signal denoising
S. Beheshti, M. Dahleh
{"title":"Noise variance in signal denoising","authors":"S. Beheshti, M. Dahleh","doi":"10.1109/ICASSP.2003.1201649","DOIUrl":"https://doi.org/10.1109/ICASSP.2003.1201649","url":null,"abstract":"In the thresholding method of denoising the optimum threshold is obtained as a function of additive noise variance. In practical problems, where the variance of the noise is unknown, the first step is to estimate the noise variance. The estimated noise variance is then implemented in calculation of the optimum threshold. The current available methods of variance estimation are heuristic. Here, we provide a new method for estimation of the additive noise variance. The method is derived from a new denoising method which is proposed in Beheshti et al. (2002). Unlike thresholding approaches the denoising method in Beheshti is based on comparison of subspaces of the basis. It compares a defined description length (DL) of the noisy data in the subspaces. We show how the estimation of the noise variance and the denoising process can be done simultaneously.","PeriodicalId":104473,"journal":{"name":"2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114454661","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 14