[Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing: Latest Papers

Design of signal-subspace cost functionals for parameter estimation
W. Xu, M. Kaveh
{"title":"Design of signal-subspace cost functionals for parameter estimation","authors":"W. Xu, M. Kaveh","doi":"10.1109/ICASSP.1992.226597","DOIUrl":"https://doi.org/10.1109/ICASSP.1992.226597","url":null,"abstract":"A probabilistic approach to the quantification of the resolving ability of a general class of MUSIC type estimators (m-estimators) is presented. Based on a resolution-maximizing criterion of optimality, a cost functional is designed for a specific parametric subclass of m-estimators. An effective data-adaptive value for the parametric class is also presented and the results are generalized to a broader nonparametric subclass.","PeriodicalId":163713,"journal":{"name":"[Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125874109","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Signal reconstruction from windowed Fourier phase
J. Weng
{"title":"Signal reconstruction from windowed Fourier phase","authors":"J. Weng","doi":"10.1109/ICASSP.1992.226465","DOIUrl":"https://doi.org/10.1109/ICASSP.1992.226465","url":null,"abstract":"A variety of information relating to phase has been increasingly widely used for stereo matching as well as motion image matching. In particular, the windowed Fourier phase (WFP) has several properties that are important for representing the structure of the signal. It is established that a series of signals, either continuous or discrete, is determined up to a multiplicative constant by its WFP at any frequency. An algorithm is developed to reconstruct the signals from the WFP.","PeriodicalId":163713,"journal":{"name":"[Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125939597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Handwritten word recognition using HMM with adaptive length Viterbi algorithm
Ying He, M.-Y. Chen, A. Kundu
{"title":"Handwritten word recognition using HMM with adaptive length Viterbi algorithm","authors":"Ying He, M.-Y. Chen, A. Kundu","doi":"10.1109/ICASSP.1992.226253","DOIUrl":"https://doi.org/10.1109/ICASSP.1992.226253","url":null,"abstract":"The authors have developed a handwritten word recognition scheme based on a single contextual, discrete symbol probability hidden Markov model (HMM) incorporated with an adaptive length Viterbi algorithm. This work attempts to extend the earlier HMM scheme for naturally segmented word recognition to cursive and nonsegmented word recognition. The algorithm presegments the script into characters and/or fractions of characters, dynamically selects the correct segmentation points, determines the word length, and recognizes the word according to the maximum path probability. The HMM is on top of, but independent of, script segmentation and character recognition techniques, and therefore leaves room for further improvement. The experiments have shown promising results and directions for further improvement.","PeriodicalId":163713,"journal":{"name":"[Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124805842","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 16
A first study on neural net based generation of prosodic and spectral information for Mandarin text-to-speech
Sin-Horng Chen, Shaw-Hwa Hwang, Chun-Yu Tsai
{"title":"A first study on neural net based generation of prosodic and spectral information for Mandarin text-to-speech","authors":"Sin-Horng Chen, Shaw-Hwa Hwang, Chun-Yu Tsai","doi":"10.1109/ICASSP.1992.226124","DOIUrl":"https://doi.org/10.1109/ICASSP.1992.226124","url":null,"abstract":"A neural-network-based approach to generating prosodic and spectral information of syllables for Mandarin text-to-speech synthesis is studied. Some contextual features are first extracted from a given input text by text analysis and taken as input signals for synthesis. Then, six multilayer perceptrons are employed to generate pause duration, syllable duration, and pitch mean and shape of one- and two-syllable synthesis units, several reproduction templates of proper size are first generated for each synthesis unit of syllable approach. The objective is to generate spectral patterns of the syllable that can be directly concatenated to synthesize natural speech without further modification. The validity of this novel approach was examined by simulation using a database of sentential utterances recorded from TV news, reported by a single female announcer. Experimental results confirmed that this is a promising approach for Mandarin text-to-speech synthesis.","PeriodicalId":163713,"journal":{"name":"[Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129716245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 16
Robust automatic time alignment of orthographic transcriptions with unconstrained speech
B. Wheatley, G. Doddington, Charles T. Hemphill, J. Godfrey, E. Holliman, Jane McDaniel, Drew Fisher
{"title":"Robust automatic time alignment of orthographic transcriptions with unconstrained speech","authors":"B. Wheatley, G. Doddington, Charles T. Hemphill, J. Godfrey, E. Holliman, Jane McDaniel, Drew Fisher","doi":"10.1109/ICASSP.1992.225853","DOIUrl":"https://doi.org/10.1109/ICASSP.1992.225853","url":null,"abstract":"A method for automatic time alignment of orthographically transcribed speech using supervised speaker-independent automatic speech recognition based on the orthographic transcription, an online dictionary, and HMM phone models is presented. This method successfully aligns transcriptions with speech in unconstrained 5 to 10 min conversations collected over long-distance telephone lines. It requires minimal manual processing and generally produces correct alignments despite the challenging nature of the data. The robustness and efficiency of the method make it a practical tool for very large speech corpora.","PeriodicalId":163713,"journal":{"name":"[Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128323537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 27
Butterfly orthogonal structure for fast transforms, filter banks and wavelets
A. Drygajlo
{"title":"Butterfly orthogonal structure for fast transforms, filter banks and wavelets","authors":"A. Drygajlo","doi":"10.1109/ICASSP.1992.226653","DOIUrl":"https://doi.org/10.1109/ICASSP.1992.226653","url":null,"abstract":"Spectral analysis/synthesis ideas that are common for orthogonal transforms, multichannel and multirate filtering, and wavelet transforms are discussed and generalized. Some recently developed unconventional applications of the butterfly orthogonal decomposition technique are reviewed and its usefulness in developing efficient multiresolution digital signal processing systems is discussed. A generalized multirate filtering structure is developed that is based on fast algorithms of orthogonal transforms and their orthogonal subtransforms. In particular the structural subband decomposition of a discrete signal in sequency and frequency spectral domains is given. A generalized butterfly tree structure with all-pass branches and arbitrary weighting constants as well as its multilevel filter application is discussed. Wavelet filter bank realizations appear as a subset of presented structures.","PeriodicalId":163713,"journal":{"name":"[Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128452542","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 16
Excitation modeling based on speech residual information
P. Lupini, V. Cuperman
{"title":"Excitation modeling based on speech residual information","authors":"P. Lupini, V. Cuperman","doi":"10.1109/ICASSP.1992.225904","DOIUrl":"https://doi.org/10.1109/ICASSP.1992.225904","url":null,"abstract":"Speech codecs based on code excited linear prediction (CELP) traditionally use an adaptive short-term filter, an adaptive codebook (long-term filter), and a fixed (stochastic) excitation codebook. The authors examined the possibility of replacing the fixed stochastic codebook by an adaptive codebook with adaptation based on the characteristics of the unquantized residual. In a typical 4-kb/s CELP codec, the authors use the spectral magnitude and phase of the unquantized residual to experimentally estimate an upper bound on the performance improvement which could be obtained by excitation codebook adaptation. The results suggest that adaptation methods based only on the spectral magnitude (including fractal-based codebooks) are unlikely to result in significant improvement. Adaptation based on the spectral phase information, on the other hand, shows a significant potential for improving CELP speech quality. The authors also present results of a preliminary test designed to investigate the effect of quantization noise on phase-based adaptation of excitation codebooks.","PeriodicalId":163713,"journal":{"name":"[Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128624659","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Image restoration using neural networks
M. Figueiredo, J. Leitão
{"title":"Image restoration using neural networks","authors":"M. Figueiredo, J. Leitão","doi":"10.1109/ICASSP.1992.226033","DOIUrl":"https://doi.org/10.1109/ICASSP.1992.226033","url":null,"abstract":"Two neural algorithms for image restoration are proposed. The image is considered degraded by linear blur and additive white Gaussian noise. Maximum a posteriori estimation and regularization theory applied to this problem lead to the same high dimension optimization problem. The developed schemes, one having a sequential updating schedule and the other being fully parallel, implement iterative minimization algorithms which are proved to converge. The robustness of these algorithms with respect to finite numerical precision is studied. Examples with real images are presented.","PeriodicalId":163713,"journal":{"name":"[Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129375871","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 23
Linear discriminant analysis for improved large vocabulary continuous speech recognition
Reinhold Häb-Umbach, H. Ney
{"title":"Linear discriminant analysis for improved large vocabulary continuous speech recognition","authors":"Reinhold Häb-Umbach, H. Ney","doi":"10.1109/ICASSP.1992.225984","DOIUrl":"https://doi.org/10.1109/ICASSP.1992.225984","url":null,"abstract":"The interaction of linear discriminant analysis (LDA) and a modeling approach using continuous Laplacian mixture density HMM is studied experimentally. The largest improvements in speech recognition could be obtained when the classes for the LDA transform were defined to be sub-phone units. On a 12000 word German recognition task with small overlap between training and test vocabulary a reduction in error rate by one-fifth was achieved compared to the case without LDA. On the development set of the DARPA RM1 task the error rate was reduced by one-third. For the DARPA speaker-dependent no-grammar case, the error rate averaged over 12 speakers was 9.9%. This was achieved with a recognizer using LDA and a set of only 47 Viterbi-trained context-independent phonemes.","PeriodicalId":163713,"journal":{"name":"[Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129533232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 384
Bidirectional motion estimation based on P frame motion vectors and area overlap
W. Lynch
{"title":"Bidirectional motion estimation based on P frame motion vectors and area overlap","authors":"W. Lynch","doi":"10.1109/ICASSP.1992.226180","DOIUrl":"https://doi.org/10.1109/ICASSP.1992.226180","url":null,"abstract":"Some proposed video compression schemes do not send frames in the order they were captured. Such schemes yield bidirectional or B frames: frames that are predicted from past and future frames. When estimating motion vectors for B frames the motion vector field referencing the future to the past frame (the P frame motion vector field) is available. The area overlap (AO) method presented estimates the motion vector fields relating a B frame to past and future frames using the P frame motion vector field. Thus B frame motion vectors need not be sent. AO's estimates are scaled versions of P frame motion vectors and therefore have finer resolution.","PeriodicalId":163713,"journal":{"name":"[Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1992-03-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129798731","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4