{"title":"Modeling spectral speech transitions using temporal decomposition techniques","authors":"G. Ahlbom, F. Bimbot, G. Chollet","doi":"10.1109/ICASSP.1987.1169742","DOIUrl":"https://doi.org/10.1109/ICASSP.1987.1169742","url":null,"abstract":"Atal [1] introduced a technique for decomposing speech into phone-length temporal events in terms of overlapping and interacting articulatory gestures. This paper reports on simplifications of this technique with applications to acoustic-phonetic synthesis. Spectral evolution is represented by time-indexed trajectories in the p-dimensional space of Log-Area Ratios {y_{i} = ln((1+k_{i})/(1-k_{i}))}, where k_{i} are the reflection coefficients obtained from short-time stationary LPC analysis. The vocal tract configuration (spectral vector) associated with each interpolation function belongs to a finite set of articulatory targets (vector quantization code book). A set of speech segments (\"polysons\") has been encoded using this technique. It includes diphones, demi-syllables, and other units that are difficult to segment. Temporal decomposition using target spectra can break the complex encoding of these segments. In particular, coarticulation effects are analytically explained and modeled. It is demonstrated that these new tools provide an adequate environment in our search for better rules in acoustic speech synthesis.","PeriodicalId":140810,"journal":{"name":"ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1987-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129622445","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A speaker-stress resistant HMM isolated word recognizer","authors":"D. Paul","doi":"10.1109/ICASSP.1987.1169551","DOIUrl":"https://doi.org/10.1109/ICASSP.1987.1169551","url":null,"abstract":"Most current speech recognition systems are sensitive to variations in speaker style. The following is the result of an effort to make a Hidden Markov Model (HMM) Isolated Word Recognizer (IWR) tolerant to such speech changes caused by speaker stress. More than an order-of-magnitude reduction of the error rate was achieved for a 105-word simulated stress database, and a 0% error rate was achieved for the TI 20 isolated word database.","PeriodicalId":140810,"journal":{"name":"ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1987-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127849380","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Serial/Parallel architectures for area-efficient vector multiplication","authors":"Stewart Smith, P. Denyer","doi":"10.1109/ICASSP.1987.1169690","DOIUrl":"https://doi.org/10.1109/ICASSP.1987.1169690","url":null,"abstract":"The use of standard-part multiply/accumulators in digital signal processing is often in the computation of vector products. In the realm of custom VLSI, direct computation of vector products can result in area savings over classical multiply/accumulate methods. A methodology is presented for composition of VLSI architectures for direct vector multiplication, based on three fundamental computational elements. These are the register, the data selector, and the carry-save add-shift (CSAS) computer. The CSAS computer is a linear array of gated carry-save adders which performs shifting accumulation of partial results. Two's complement serial/parallel carry-save accumulation provides performance, while the use of symmetric-coded distributed arithmetic eliminates redundant computation to effect area savings.","PeriodicalId":140810,"journal":{"name":"ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"1082 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1987-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120878729","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Non-linear adaptive signal processor","authors":"M. Lagunas, F. Vallverdú, M. Santamaria","doi":"10.1109/ICASSP.1987.1169602","DOIUrl":"https://doi.org/10.1109/ICASSP.1987.1169602","url":null,"abstract":"This paper is a first attempt to formalize non-linear system design and to place it in context relative to similar linear processing techniques. The paper opens with a summary of the relationship between linear objectives and classical adaptive algorithms in non-linear design problems, indicating the potential of random search techniques for handling the different problems posed by non-linear objectives. Next, the similarity between probability distribution functions and power spectral density in linear processing is shown. This is supported by an example of non-linear system design. Finally, some prospective work on the problem of adaptive companding design is reported.","PeriodicalId":140810,"journal":{"name":"ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1987-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115987045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Suppression and detection of impulse type interference using adaptive median hybrid filters","authors":"A. Nieminen, P. Heinonen, Y. Neuvo","doi":"10.1109/ICASSP.1987.1169749","DOIUrl":"https://doi.org/10.1109/ICASSP.1987.1169749","url":null,"abstract":"In this paper, we introduce a new type of nonlinear filter, the Adaptive Median Hybrid (AMH) filter, for the suppression and detection of short duration interferences. In the AMH filters, adaptive filter substructures are used to estimate the current signal value from the future and past signal values. The output of the overall filter is the median of the adaptive filter outputs and the current signal value. This kind of nonlinear filter structure is shown to adapt to and preserve rapid changes in signal characteristics well, while filtering out short duration interferences. By examining the difference between the original and filtered data, interferences can be detected. We introduce two types of AMH filters, the AMH filter with separate adaptive substructures (SAMH) and the AMH filter with coupled substructures (CAMH), which have different convergence properties and implementations. We use both synthetic and real data (speech and electroencephalogram (EEG)) to show the applicability of the proposed filters.","PeriodicalId":140810,"journal":{"name":"ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"91 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1987-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130715676","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reconstructing a finite length sequence from several of its correlation lags","authors":"A. Steinhardt","doi":"10.1109/ICASSP.1987.1169415","DOIUrl":"https://doi.org/10.1109/ICASSP.1987.1169415","url":null,"abstract":"In this paper we present an algorithm which answers the following question: Given a finite number of correlation lags, what is the shortest length sequence which could have produced these correlations? This question is equivalent to asking for the minimum order moving average (all-zero) model which can match a given set of correlations. The algorithm applies to both the case of uniform correlations and missing lag correlations. The algorithm involves quadratic programming coupled with a new representation of the boundary of correlations derived from finite sequences in terms of the spectral decomposition of a certain class of banded Toeplitz matrices.","PeriodicalId":140810,"journal":{"name":"ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1987-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121951525","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A data-driven organization of the dynamic programming beam search for continuous speech recognition","authors":"H. Ney, D. Mergel, A. Noll, A. Paeseler","doi":"10.1109/ICASSP.1987.1169844","DOIUrl":"https://doi.org/10.1109/ICASSP.1987.1169844","url":null,"abstract":"This paper describes a data-driven organization of the dynamic programming beam search for large vocabulary, continuous speech recognition. This organization can be viewed as an extension of the one-pass dynamic programming algorithm for connected word recognition. In continuous speech recognition we are faced with a huge search space, and search hypotheses have to be formed at the 10-ms level. The organization of the search presented has the following characteristics. Its computational cost is proportional only to the number of hypotheses actually generated and is independent of the overall size of the potential search space. There is no limit on the number of word hypotheses; there is only a limit on the overall number of hypotheses due to memory constraints. The implementation of the search has been studied and tested on a continuous speech database comprising 20672 words.","PeriodicalId":140810,"journal":{"name":"ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"304 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1987-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123077927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bit reverse unscrambling for a radix-2^M FFT","authors":"C. Burrus","doi":"10.1109/ICASSP.1987.1169492","DOIUrl":"https://doi.org/10.1109/ICASSP.1987.1169492","url":null,"abstract":"The traditional Cooley-Tukey and the prime factor FFT algorithms either produce the output in scrambled order or the input data order must be prescrambled. Several methods for scrambling and unscrambling the DFT are presented. The new result in this paper is the observation that the radix-4, radix-8, or any radix-2^m FFT can be modified to give the output in the same bit-reversed order as the radix-2 FFT.","PeriodicalId":140810,"journal":{"name":"ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1987-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121275400","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generalized linear inversion applied to seismic data in one and two dimensions","authors":"J. Justice, S. Dougherty","doi":"10.1109/ICASSP.1987.1169379","DOIUrl":"https://doi.org/10.1109/ICASSP.1987.1169379","url":null,"abstract":"Generalized linear inversion (GLI) is a parameter estimation technique which shows great potential for use in solving inverse problems in many fields, including exploration seismology. We suggest a particular implementation of the procedure which may be used for simultaneous parameter estimation, and illustrate its use with the 1-D seismic deconvolution problem. The procedure is easily extended to the multidimensional case, and we illustrate this extension by computing depth and velocity structure in a flat-layer model using multiple offset data.","PeriodicalId":140810,"journal":{"name":"ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1987-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125191478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the automatic segmentation of speech signals","authors":"T. Svendsen, F. Soong","doi":"10.1109/ICASSP.1987.1169628","DOIUrl":"https://doi.org/10.1109/ICASSP.1987.1169628","url":null,"abstract":"For large vocabulary and continuous speech recognition, the sub-word-unit-based approach is a viable alternative to the whole-word-unit-based approach. For preparing a large inventory of subword units, automatic segmentation is preferable to manual segmentation, as it substantially reduces the work associated with the generation of templates and gives more consistent results. In this paper we discuss some methods for automatically segmenting speech into phonetic units. Three different approaches are described: one based on template matching, one based on detecting the spectral changes that occur at the boundaries between phonetic units, and one based on a constrained-clustering vector quantization approach. An evaluation of the performance of the automatic segmentation methods is given.","PeriodicalId":140810,"journal":{"name":"ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1987-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122302158","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}