{"title":"Nonlinear recovery of sparse signals from narrowband data","authors":"R. Gopinath","doi":"10.1109/ICASSP.1995.480462","DOIUrl":"https://doi.org/10.1109/ICASSP.1995.480462","url":null,"abstract":"This paper describes the connection between a certain signal recovery problem and the decoding of Reed-Solomon codes. It is shown that any algorithm for decoding Reed-Solomon codes (over finite fields) can be used to recover wide-band signals (over the real/complex field) from narrow-band information. It also shows that a signal with at most N/sub t/ frequency samples can be recovered from any contiguous band of 2N/sub t/ frequency samples.","PeriodicalId":300119,"journal":{"name":"1995 International Conference on Acoustics, Speech, and Signal Processing","volume":"34 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132404828","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prediction of sound pressure fields by Picard-iterative BEM based on holographic interferometry","authors":"H. Klingele, H. Steinbichler","doi":"10.1109/ICASSP.1995.480125","DOIUrl":"https://doi.org/10.1109/ICASSP.1995.480125","url":null,"abstract":"Holographic interferometry offers amplitude data with a high spatial resolution which can be used as vibration boundary condition for calculating the corresponding sound pressure field. When investigating objects with arbitrary 3D-shape this requires contour measuring, performing holographic interferometry for three axes of freedom, combining contour and vibration data into a boundary element (BE) model, and then solving the discretized Helmholtz-Kirchhoff integral equation for the surface sound pressure. The latter is done by means of the Picard-iterative boundary element method (PIBEM), which does not need matrix operations at all and such is capable of also treating large BE models arising from small bending wavelengths at high vibration frequencies. An experimental verification of this method by microphone measurements in an anechoic chamber is presented for a cylindrical object.","PeriodicalId":300119,"journal":{"name":"1995 International Conference on Acoustics, Speech, and Signal Processing","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132745190","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Noisy speech recognition using robust inversion of hidden Markov models","authors":"S. Moon, Jenq-Neng Hwang","doi":"10.1109/ICASSP.1995.479385","DOIUrl":"https://doi.org/10.1109/ICASSP.1995.479385","url":null,"abstract":"The hidden Markov model (HMM) inversion algorithm is proposed and applied to robust speech recognition for general types of mismatched conditions. The Baum-Welch HMM inversion algorithm is a dual procedure to the Baum-Welch HMM reestimation algorithm, which is the most widely used speech recognition technique. The forward training of an HMM, based on the Baum-Welch reestimation, finds the model parameters /spl lambda/ that optimize some criterion, usually maximum likelihood (ML), with given speech inputs s. On the other hand, the inversion of a HMM finds speech inputs s that optimize some criterion with given model parameters /spl lambda/. The performance of the proposed HMM inversion, in conjunction with HMM reestimation, for robust speech recognition under additive noise corruption and microphone mismatch conditions is favorably compared with other noisy speech recognition techniques, such as the projection-based first-order cepstrum normalization (FOCN) and the robust minimax (MINIMAX) classification techniques.","PeriodicalId":300119,"journal":{"name":"1995 International Conference on Acoustics, Speech, and Signal Processing","volume":"165 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133188921","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Discrete scale transform for signal analysis","authors":"E. J. Zalubas, W. J. Williams","doi":"10.1109/ICASSP.1995.479859","DOIUrl":"https://doi.org/10.1109/ICASSP.1995.479859","url":null,"abstract":"The scale transform introduced by Cohen (see IEEE Trans. Signal Processing, vo1.41, p.3275-3292, December 1993) is a special case of the Mellin transform. The scale transform has mathematical properties desirable for comparison of signals for which scale variation occurs. In addition to the scale invariance property of the Mellin transform many properties specific to the scale transform have been presented. A procedure is presented for complete implementation of the scale transformation for discrete signals. This complements discrete Mellin transforms and delineates steps whose implementation are specific to the scale transform.","PeriodicalId":300119,"journal":{"name":"1995 International Conference on Acoustics, Speech, and Signal Processing","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128844798","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Supplementary orthogonal cepstral features","authors":"K. Assaleh","doi":"10.1109/ICASSP.1995.479609","DOIUrl":"https://doi.org/10.1109/ICASSP.1995.479609","url":null,"abstract":"A new set of LP-derived features is introduced. The concept of these features is motivated by the power sum formulation of the LP cepstrum. Due to the fact that the LP model implies that the resulting poles are either real or occur in complex conjugate pairs, the power sum of the poles is equivalent to the power sum of their real components. Therefore, the LP cepstrum is associated to the power sum of the real component of the LP poles. This fact is utilized in deriving a new set of features that is associated to the imaginary components of the LP poles. The author refers to this new set of features as the sepstral coefficients. It is found that the sepstral coefficients and cepstral coefficients are relatively uncorrelated. Hence, they can be used jointly to improve the performance of pattern classification applications where cepstral features are usually used. The author presents some preliminary results on speaker identification experiments.","PeriodicalId":300119,"journal":{"name":"1995 International Conference on Acoustics, Speech, and Signal Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128888203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Some results with a trainable speech translation and understanding system","authors":"Víctor M. Jiménez, A. Castellanos, E. Vidal","doi":"10.1109/ICASSP.1995.479286","DOIUrl":"https://doi.org/10.1109/ICASSP.1995.479286","url":null,"abstract":"The problems of limited-domain spoken language translation and understanding are considered. A standard continuous speech recognizer is extended for using automatically learnt finite-state transducers as translation models. Understanding is considered as a particular case of translation where the target language is a formal language. From the different approaches compared, the best results are obtained with a fully integrated approach, in which the input language acoustic and lexical models, and (N-gram) language models of input and output languages, are embedded into the learnt transducers. Optimal search through this global network obtains the best translation for a given input acoustic signal.","PeriodicalId":300119,"journal":{"name":"1995 International Conference on Acoustics, Speech, and Signal Processing","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127849404","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Uniqueness study of measurements obtainable with an electromagnetic vector sensor","authors":"Kah-Chye Tan, K. Ho, A. Nehorai","doi":"10.1109/ICASSP.1995.479928","DOIUrl":"https://doi.org/10.1109/ICASSP.1995.479928","url":null,"abstract":"We investigate the linear dependence of the steering vectors of one electromagnetic vector sensor. We show that every 3 steering vectors with distinct DOAs are linearly independent. We also show that 4 steering vectors with distinct DOAs are linearly independent if the ellipticity angles of the signals associated with any 2 of the 4 steering vectors are distinct. We then establish that 5 steering vectors are linearly independent if exactly 2 or 3 of them correspond to circularly polarized signals with the same spin direction. Finally, we demonstrate that given any 5 steering vectors, then for any DOA there exists a steering vector which is linearly dependent on the 5 steering vectors.","PeriodicalId":300119,"journal":{"name":"1995 International Conference on Acoustics, Speech, and Signal Processing","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127863444","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the choice of wavelet filters for audio compression","authors":"P. Philippe, F. M. D. Saint-Martin, L. Mainard","doi":"10.1109/ICASSP.1995.480413","DOIUrl":"https://doi.org/10.1109/ICASSP.1995.480413","url":null,"abstract":"We address the issue of choosing an optimal wavelet packets transform for audio compression. We present a comparison method based on a perceptual approach, which provides an entropic bit-rate for \"transparent\" coding of a given audio signal. The test with different wavelets leads to the conclusion that the most significant synthesis criterion for audio compression is the so-called \"coding gain\", while frequency selectivity, regularity and orthogonality seem less relevant.","PeriodicalId":300119,"journal":{"name":"1995 International Conference on Acoustics, Speech, and Signal Processing","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133799389","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis of acoustic-phonetic variations in fluent speech using TIMIT","authors":"Don X. Sun, L. Deng","doi":"10.1109/ICASSP.1995.479399","DOIUrl":"https://doi.org/10.1109/ICASSP.1995.479399","url":null,"abstract":"We propose a hierarchically structured analysis of variance (ANOVA) method to analyze, in a quantitative manner, the contributions of various identifiable factors to the overall acoustic variability exhibited in fluent speech data of TIMIT processed in the form of mel-frequency cepstral coefficients. The results of the analysis show that the greatest acoustic variability in TIMIT data is explained by the difference among distinct phonetic labels in TIMIT, followed by the phonetic context difference given a fixed phonetic label. The variability among sequential sub-segments within each TIMIT-defined phonetic segment is found to be significantly greater than the gender, dialect region, and speaker factors. Our results serve to provide useful insights to the understanding of the roles of various components of speech recognizers in contributing to the ultimate speech recognition performance.","PeriodicalId":300119,"journal":{"name":"1995 International Conference on Acoustics, Speech, and Signal Processing","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133824018","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Co-channel speaker separation","authors":"D. Morgan, E. George, L. Lee, Stephen M. Kay","doi":"10.1109/ICASSP.1995.479822","DOIUrl":"https://doi.org/10.1109/ICASSP.1995.479822","url":null,"abstract":"This paper describes a system for the automatic separation of two-talker co-channel speech. This system is based on a frame-by-frame speaker separation algorithm that exploits a pitch estimate of the stronger talker derived from the co-channel signal. The concept underlying this approach is to recover the stronger talker's speech by enhancing harmonic frequencies and formants given a multi-resolution pitch estimate. The weaker talker's speech is obtained from the residual signal created when the harmonics and formants of the stronger talker are suppressed. A maximum likelihood speaker assignment algorithm is used to place the recovered frames from the target and interfering talkers in separate channels. The system has been tested at target-to-interferer ratios (TIRs) from -18 to 18 dB with human listening tests, and with machine-based tests employing a keyword spotting system on the Switchboard Corpus for target talkers at 6, 12, and 18 dB TIR.","PeriodicalId":300119,"journal":{"name":"1995 International Conference on Acoustics, Speech, and Signal Processing","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127896400","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}