{"title":"Hidden Markov model speech recognition based on Kalman filtering","authors":"M. Clements, Sungjae Lim","doi":"10.1109/ICASSP.1987.1169800","DOIUrl":"https://doi.org/10.1109/ICASSP.1987.1169800","url":null,"abstract":"Traditional hidden Markov model speech recognition is generally based on a set of parameters (often LPC related) which are extracted at discrete intervals. Such an analysis necessitates use of a discrete-trial hidden Markov model in which the underlying states can only change at intervals related to the frame rate of the analysis. The exact locations of the analysis windows used can influence the front-end outputs and as a result can cause confusion between words differing in short-duration consonants. In the current study, an alternate method which does not require segmentation is proposed, and a simple version is implemented. The discrete trial hidden Markov model algorithms are adapted to this framework leading to significantly improved recognition performance.","PeriodicalId":140810,"journal":{"name":"ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"116 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1987-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121946770","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Vector quantization firmware for an acoustical front-end using the TMS32020","authors":"A. Ciaramella, G. Venuti","doi":"10.1109/ICASSP.1987.1169338","DOIUrl":"https://doi.org/10.1109/ICASSP.1987.1169338","url":null,"abstract":"Here we describe the firmware implementation of an acoustical front-end, performing the vector quantization of Discrete Cosine Transform (DCT) for a speech recognition system. This firmware runs on a single TMS32020 signal processor chip and is characterized both by a substantial real time performance and by a good accuracy.","PeriodicalId":140810,"journal":{"name":"ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"268 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1987-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116049296","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Constrained total least squares","authors":"T. Abatzoglou, J. Mendel","doi":"10.1109/ICASSP.1987.1169438","DOIUrl":"https://doi.org/10.1109/ICASSP.1987.1169438","url":null,"abstract":"The Total Least Squares (TLS) method is a generalized least square technique to solve an overdetermined system of equations Ax ≃ b. The TLS solution differs from the usual Least Square (LS) in that it tries to compensate for arbitrary noise present in both A and b. In certain problems the noise perturbations of A and b are linear functions of a common \"noise source\" vector. In this case we obtain a generalization of the TLS criterion called the Constrained Total Least Squares (CTLS) method by taking into account the linear dependence of the noise terms in A and b. If the noise columns of A and b are linearly related then the CTLS solution is obtained in terms of the largest eigenvalue and corresponding eigenvector of a certain matrix. The CTLS technique can be applied to problems like Maximum Likelihood Signal Parameter Estimation, Frequency Estimation of Sinusoids in white or colored noise by Linear Prediction and others.","PeriodicalId":140810,"journal":{"name":"ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"107 1-2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1987-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120896107","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A phonetic transcription system of Arabic text","authors":"Hany Selim, Taghrid Anbar","doi":"10.1109/ICASSP.1987.1169472","DOIUrl":"https://doi.org/10.1109/ICASSP.1987.1169472","url":null,"abstract":"Within the framework of developing an unlimited vocabulary speech synthesizer a rule based transcription system has been developed. It requires as an input a partly diacritized Arabic text. The inventory of the output sounds consists of 24 consonants and 16 vocalic allophones. Three kinds of word stress are defined; main stress for the lexeme, secondary for an eventually unstressed long vowel, and tertiary for the suffix cluster. The locations of the first and third stress types are determined, after word syllabification and a tree search of suffix(es), applying the well established stress rules in Arabic. A relatively small dictionary is used for words with exceptional pronunciation. The transcription system is coded in Pascal and consists of some 950 instructions. On average, a processing speed of 65 words/sec is achieved on an IBM-PC-AT.","PeriodicalId":140810,"journal":{"name":"ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1987-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126593564","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Polyphase filter matrix for rational sampling rate conversions","authors":"C. Hsiao","doi":"10.1109/ICASSP.1987.1169404","DOIUrl":"https://doi.org/10.1109/ICASSP.1987.1169404","url":null,"abstract":"The polyphase filter array has been used for efficient implementations of filters with integer sampling rate conversions. [1] The filter in the high sampling rate side is decomposed into its polyphase filters which can be moved to the lower sampling rate side without changing their functions. For FIR filters the computational complexity is reduced by a factor equal to the sampling rate ratio. A rational (L/M) sampling rate conversion system realized with a 1-to-L interpolator followed by an M-to-1 decimator has three sampling rates F, LF and (L/M)F involved. By using the polyphase filter array a filter operating at the sampling rate of LF can be implemented in either the input side or the output side with lower sampling rates. The polyphase filter matrix structure will operate at the sampling rate of F/M, which does not show in the above model and is lower than any one of those three rates. For FIR filters the computational complexity is reduced by a factor of LM compared to the direct realization of the integral filter or by a factor of M (or L) compared to the polyphase filter array realization while the system input-output relation is maintained.","PeriodicalId":140810,"journal":{"name":"ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1987-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121535174","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Narrow-band jammer suppression using an adaptive lattice filter","authors":"G. Saulnier, K. Yum, P. Das","doi":"10.1109/ICASSP.1987.1169398","DOIUrl":"https://doi.org/10.1109/ICASSP.1987.1169398","url":null,"abstract":"It is well known that least-mean-squared (LMS) adaptive filtering can be used to suppress narrow-band jammers in a direct sequence spread spectrum receiver. To date, much of the work in this area has concentrated on the use of transversal adaptive filter structures. This paper demonstrates the performance of an adaptive lattice filter in this application. The reflection coefficients of the filter are adjusted using a gradient descent algorithm in an effort to minimize the mean-squared error and, in the process, suppress narrow-band interference. Experimental bit-error-rate performance curves are presented for 7- and 31- chip spreading sequences with a single tone jammer.","PeriodicalId":140810,"journal":{"name":"ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1987-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128380731","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Subband/Transform coding using filter bank designs based on time domain aliasing cancellation","authors":"J. Princen, A. Johnson, A. B. Bradley","doi":"10.1109/ICASSP.1987.1169405","DOIUrl":"https://doi.org/10.1109/ICASSP.1987.1169405","url":null,"abstract":"A new, oddly stacked, critically sampled, single side-band (SSB) [7] analysis/synthesis system based on Time Domain Aliasing Cancellation (TDAC) [1],[2] is described in this paper. The specifications for the analysis and synthesis filter responses are developed and a number of designs which satisfy the reconstruction requirements are described. The application of TDAC systems to Subband/Transform coding is also discussed and the objective performance of a 32 band coder using several different window designs is presented and compared with a coder based on Frequency Domain Aliasing Cancellation (FDAC) filter banks [3]-[5].","PeriodicalId":140810,"journal":{"name":"ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1987-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133087462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Solving the semi-definite generalized eigenvalue problem with application to ESPRIT","authors":"M. Zoltowski","doi":"10.1109/ICASSP.1987.1169419","DOIUrl":"https://doi.org/10.1109/ICASSP.1987.1169419","url":null,"abstract":"Two methods are developed for computing the generalized eigenvalues and eigenvectors associated with the matrix pair {A,B}, where A and B are singular Hermitian matrices with the range space of A a subset of the range space of B. Conventional methods of solution break down for this case. An application is described based on the recently reported Estimation of Signal Parameters by Rotational Invariance Techniques (ESPRIT) algorithm. ESPRIT possesses remarkable advantages over other high-resolution direction-of-arrival estimation algorithms in terms of speed, storage, and indifference to array calibration.","PeriodicalId":140810,"journal":{"name":"ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"334 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1987-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122439499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Experimental evaluation of duration modelling techniques for automatic speech recognition","authors":"M. Russell, Anneliese E. Cook","doi":"10.1109/ICASSP.1987.1169918","DOIUrl":"https://doi.org/10.1109/ICASSP.1987.1169918","url":null,"abstract":"This paper presents an experimental evaluation of two such extensions: hidden semi-Markov models (HSMMs), and expanded state HMMs (ESHMMs). These extensions to the standard HMM (hidden Markov model) formalism permit improved duration modelling and experimental results are presented which show that they can consistently lead to improved performance. The results indicate that if sufficient training material is available, the best performance is obtained with the Fergusson model, but that with smaller training sets Poisson HSMMs or type B ESHMMs are more robust models.","PeriodicalId":140810,"journal":{"name":"ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"205 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1987-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123052398","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Training of phoneme models in a sentence recognition system","authors":"A. Noll, H. Ney","doi":"10.1109/ICASSP.1987.1169444","DOIUrl":"https://doi.org/10.1109/ICASSP.1987.1169444","url":null,"abstract":"This paper describes the training of phoneme models used in a speaker-dependent continuous-speech understanding system. Three different methods for estimating model parameters are described which are based on the standard Markov modelling approach. The first method deals with phoneme models with continuous-emission probability density functions. For the second and third method phoneme models with discrete probability-density functions and two different parameter-estimation methods are described. The test and training speech-database consists of two independent sets of spoken sentences of several speakers. The complete recognition vocabulary contains 917 words with an overlap of 51 words (e.g. articles) with the training vocabulary. Recognition results are given for the different training methods and some other experiments.","PeriodicalId":140810,"journal":{"name":"ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1987-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121166373","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}