{"title":"Recursive implementation of total least squares algorithm for image reconstruction from noisy, undersampled multiframes","authors":"N. Bose, H. C. Kim, H. M. Valenzuela","doi":"10.1109/ICASSP.1993.319799","DOIUrl":"https://doi.org/10.1109/ICASSP.1993.319799","url":null,"abstract":"It is shown how the total least squares recursive algorithm for the real data FIR (finite impulse response) adaptive filtering problem can be applied to reconstruct a high-resolution filtered image from undersampled, noisy multiframes, when the interframe displacements are not accurately known. This is done in the wavenumber domain after transforming the complex data problem to an equivalent real data problem, to which the algorithm developed by C.E. Davila (Proc. ICASSP 1991, pp. 1853-6) applies. The procedure developed also applies when the multiframes are degraded by linear shift-invariant blurs. All the advantages of implementation via massively parallel computational architecture apply. The performance of the algorithm is verified by computer simulations.","PeriodicalId":428449,"journal":{"name":"1993 IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127619744","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spatially-varying IIR filter banks for image coding","authors":"W. Chung, Mark J. T. Smith","doi":"10.1109/ICASSP.1993.319875","DOIUrl":"https://doi.org/10.1109/ICASSP.1993.319875","url":null,"abstract":"The application of spatially variant IIR (infinite impulse response) filter banks to subband image coding is reported. The new filter bank is based on computationally efficient recursive polyphase decompositions that dynamically change in response to the input signal. In the absence of quantization, reconstruction can be made exact. However, by proper choice of an adaptation scheme, it is shown that subband image coding based on time-varying filter banks can yield improvement over the use of conventional filter banks. Simulation results are presented.","PeriodicalId":428449,"journal":{"name":"1993 IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"227 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127791034","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The mean field theory for image motion estimation","authors":"J. Zhang, J. Hanauer","doi":"10.1109/ICASSP.1993.319781","DOIUrl":"https://doi.org/10.1109/ICASSP.1993.319781","url":null,"abstract":"It is shown how the MFT (mean field theory) can be applied to MRF (Markov random field) model-based motion estimation. Specifically, the motion is characterized by a coupled MRF including a displacement field (motion continuity), a line field (motion discontinuity), and a segmentation field (identifying uncovered areas). These fields are estimated by using the MFT. The efficacy of this approach is demonstrated on synthetic and real-world images.","PeriodicalId":428449,"journal":{"name":"1993 IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"342 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131661149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Source waveform recovery in a reverberant space by cepstrum dereverberation","authors":"M. Tohyama, R. Lyon, T. Koike","doi":"10.1109/ICASSP.1993.319079","DOIUrl":"https://doi.org/10.1109/ICASSP.1993.319079","url":null,"abstract":"The minimum-phase components of source waveforms can be recovered by blind dereverberation using the minimum-phase complex cepstrum. This recovery process is robust to changes in the transfer functions (TFs), since the all-pass components of the responses that are highly sensitive to changes in the TFs can be disregarded. Yet it is still necessary to keep all-pass information if the timing of the source event is required. Dereverberation of reverberant speech signals requires the all-pass components to keep the pitch information.","PeriodicalId":428449,"journal":{"name":"1993 IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"91 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133726403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Speech discrimination in adverse conditions using acoustic knowledge and selectively trained neural networks","authors":"Y. Anglade, D. Fohr, J. Junqua","doi":"10.1109/ICASSP.1993.319290","DOIUrl":"https://doi.org/10.1109/ICASSP.1993.319290","url":null,"abstract":"It is demonstrated that the STNN (selectively trained neural network) method improves confusable word discrimination. Tests conducted on clean and Lombard-noisy speech show that using only a small part (two frames) of the word where useful information for discrimination is located is more efficient than taking into account the whole word. Recognition scores obtained with a continuous-density HMM (hidden Markov model) are lower than those obtained with the proposed method. The present results show an increase in recognition accuracy for the tests on Lombard-noisy speech when the training is done on clean, Lombard, and Lombard-noisy speech. Furthermore, if the same noise is used for the training and the test, the STNN performance improves far more than that of the HMM. The STNN method does not need any precise detection of word boundaries. This influences the robustness of the method, especially in noisy conditions.","PeriodicalId":428449,"journal":{"name":"1993 IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"93 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115666648","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tracking nonstationarities with a wavelet transform","authors":"H. Krim, J. Pesquet, K. Drouiche","doi":"10.1109/ICASSP.1993.319101","DOIUrl":"https://doi.org/10.1109/ICASSP.1993.319101","url":null,"abstract":"Nonstationary signal parameter estimation/detection is challenging on account of the underlying stationarity assumption in most of the classical techniques. The authors present a framework for a class of nonstationary processes via a multiscale analysis. This framework gives insight into the problem, and new results are obtained on multiscale autoregressive integrated moving average (ARIMA) processes. The possibility of inducing stationarity at different resolution levels of nonstationary processes by an appropriate wavelet transform is shown. This permits use of classical estimation/detection techniques. The approach is extended to wavelet packet decompositions.","PeriodicalId":428449,"journal":{"name":"1993 IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115735854","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SP Education Laboratory","authors":"G. Wakefield, B. J. Feng, K. R. Raghavan, M. Blommer","doi":"10.1109/ICASSP.1993.319890","DOIUrl":"https://doi.org/10.1109/ICASSP.1993.319890","url":null,"abstract":"The SP (signal processing) Education Laboratory (an educational project at the University of Michigan) is a software package designed to provide a highly interactive, flexible environment within which students can explore basic concepts of signal processing. The authors summarize the basic features of this software, some of the reasoning behind certain design decisions, and experience in using this software as a teaching tool.","PeriodicalId":428449,"journal":{"name":"1993 IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124428322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Motion estimation and motion-compensated filtering of video signals","authors":"E. Dubois, J. Konrad","doi":"10.1109/ICASSP.1993.319063","DOIUrl":"https://doi.org/10.1109/ICASSP.1993.319063","url":null,"abstract":"The authors consider methods for estimating 2-D motion in time-varying images for application to motion-compensated filtering. The approach is based on the minimization of objective functions that can be interpreted as energies of suitable Gibbs-Markov random fields. A flexible class of cost functions is described that can be applied in a wide variety of specific applications, including the estimation of motion trajectories over several image frames. The issues of minimizing the cost function and applications to motion-compensated filtering are briefly addressed.","PeriodicalId":428449,"journal":{"name":"1993 IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114355199","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Inference of letter-phoneme correspondences with pre-defined consonant and vowel patterns","authors":"R. Luk, R. Damper","doi":"10.1109/ICASSP.1993.319269","DOIUrl":"https://doi.org/10.1109/ICASSP.1993.319269","url":null,"abstract":"The authors describe the automatic inferencing of letter-phoneme correspondences with predefined consonant and vowel patterns, which imply a segmentation of the word in one domain. The technique obtains the maximum likelihood (ML) alignment of the training word, and correspondences are found according to where the segmentation projects onto the ML alignment. Here, the phoneme strings were segmented depending on the number of consonant phonemes preceding or following the vowel phoneme. Sets of correspondences were evaluated according to the performance obtained when they were used for text-phonemic alignment and translation. The number of correspondences inferred was too large to evaluate using Markov statistics. Instead, hidden Markov statistics were used, where the storage demand is further reduced by a recording technique. Performance improves significantly as the number of consonants included in the pattern is increased. The performance of correspondences with predefined V.C* patterns was consistently better than with C*.V patterns.","PeriodicalId":428449,"journal":{"name":"1993 IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"120 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114905984","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"New-word addition and adaptation in a stochastic explicit-segment speech recognition system","authors":"A. Asadi, H. Leung","doi":"10.1109/ICASSP.1993.319894","DOIUrl":"https://doi.org/10.1109/ICASSP.1993.319894","url":null,"abstract":"The authors extend an automatic procedure for the addition of new words to a speech recognition system to include alternative pronunciations for the new words. They investigate methods for adaptation to new words after these are added to the system. For adaptation, the goal was the improvement of the accuracy of the system on the new words, using only a limited amount of speech data. All the experiments are performed within the stochastic explicit-segment speech recognition system. The authors evaluated 25 isolated city names from a speech corpus, CITRON, collected from real users over the telephone network. For this task, improvement in accuracy is shown from a 34% error rate, when trained on the NTIMIT database alone, to 8% after adapting to 30 tokens, on average, from each new word.","PeriodicalId":428449,"journal":{"name":"1993 IEEE International Conference on Acoustics, Speech, and Signal Processing","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115020395","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}