{"title":"The radix-r one stage FFT kernel computation","authors":"Marwan A. Jaber, D. Massicotte","doi":"10.1109/ICASSP.2008.4518427","DOIUrl":"https://doi.org/10.1109/ICASSP.2008.4518427","url":null,"abstract":"The FFT can be computed in a sequence of stages. In each stage, a butterfly operation multiplies the accessed data by a certain coefficient W^α, adds or subtracts the results, and stores or holds them for further processing. This process is repeated at each stage until the final stage, where the processed data is driven to the output. In this paper, appropriate indexing or mapping schemes between the input data and the coefficient multipliers across the different stages are used to collapse all stages into a single computation stage. The result is a reduction of communication load and arithmetic operations.","PeriodicalId":333742,"journal":{"name":"2008 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121787980","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Algebraic quantization of transform coefficients for embedded audio coding","authors":"Mickaël De Meuleneire, Hervé Taddei, Dominique Pastor","doi":"10.1109/ICASSP.2008.4518728","DOIUrl":"https://doi.org/10.1109/ICASSP.2008.4518728","url":null,"abstract":"This paper proposes a new quantization for transform coefficients based on algebraic quantization. The coefficients are represented by a few pulses multiplied by a unique amplitude. The coefficients to be transmitted are selected by optimizing an error criterion, that determines the signs, positions and amplitudes of the pulses. This simple quantization has been implemented in a wavelet-based wideband scalable coder, and has been proved to provide a perceptually better quality than SPIHT on speech signal and music.","PeriodicalId":333742,"journal":{"name":"2008 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121811750","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On-belt analysis of minerals using naturally occurring gamma radiation","authors":"W. Moran, D. Huynh, Michael Edwards, Andrew Harris, Xuezhi Wang, B. L. Scala","doi":"10.1109/ICASSP.2008.4518448","DOIUrl":"https://doi.org/10.1109/ICASSP.2008.4518448","url":null,"abstract":"We describe a method to analyze materials on a conveyor belt using natural gamma spectra collected with a BGO (bismuth germanate) gamma ray detector, which collects emissions from potassium (K), uranium (U), and thorium (Th) in the materials. A statistical model is proposed based on a Poisson process and an approximate maximum likelihood (ML) technique via the expectation-maximization (EM) algorithm is then used to estimate the amount of each of the three elements in the material. A refinement of the statistical model is used to estimate linear drift in the detector.","PeriodicalId":333742,"journal":{"name":"2008 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116864117","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint Bayesian detection of brain activated regions and local HRF estimation in functional MRI","authors":"D. Afonso, J. Sanches, M. Lauterbach","doi":"10.1109/ICASSP.2008.4517640","DOIUrl":"https://doi.org/10.1109/ICASSP.2008.4517640","url":null,"abstract":"The blood-oxygenation-level-dependent (BOLD) signal, measured with the magnetic resonance imaging (MRI), is currently used to detect the activation of brain regions with a stimulus application, e.g., visual or auditive. In a block design approach, the stimuli (called paradigm in the fMRI scope) are designed to detect activated and non activated brain regions with maximized certainty. However, corrupting noise in MRI volumes acquisition, patient motion and the normal brain activity interference makes this detection a difficult task. In this paper a new Bayesian method, called SPM-MAP, is proposed where a joint detection of brain activated regions and estimation of the underlying hemodynamic impulse response function (HRF) is proposed. Monte Carlo tests on its error probability and HRF estimation with synthetic data are performed and presented.","PeriodicalId":333742,"journal":{"name":"2008 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"115 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117140056","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Single sideband encoder for music coding in cochlear implants","authors":"K. Nie, L. Atlas, J. Rubinstein","doi":"10.1109/ICASSP.2008.4518583","DOIUrl":"https://doi.org/10.1109/ICASSP.2008.4518583","url":null,"abstract":"The restoration of melody perception is a key remaining challenge in cochlear implants. We propose a new sound coding strategy that converts an audio signal into time-varying electrically stimulating pulse trains. A sound is first split into several frequency subbands and each subband signal is coherently downward shifted to a low-frequency base band, similar to demodulation used in single sideband (SSB) radios. These resulting coherent envelope signals have Hermitian symmetric frequency spectrums and are thus real-valued. A peak detector in each subband further converts the coherent envelopes into rate-varying and interleaved pulse trains. Acoustic simulations of cochlear implants with normal hearing listeners showed significant improvement in melody recognition over the most common stimulation approach used in cochlear implants.","PeriodicalId":333742,"journal":{"name":"2008 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"65 2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120895093","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive acoustic echo cancellation in the presence of multiple nonlinearities","authors":"K. Shi, Xiaoli Ma, G. Zhou","doi":"10.1109/ICASSP.2008.4518431","DOIUrl":"https://doi.org/10.1109/ICASSP.2008.4518431","url":null,"abstract":"A Hammerstein-Wiener system consists of a linear time invariant subsystem sandwiched between two memoryless nonlinear blocks as is the case of an acoustic system with a nonlinear loudspeaker and a nonlinear microphone. We propose to model the memoryless nonlinear blocks of the Hammerstein-Wiener system using a linear combination of nonlinear basis functions, and concentrate on the task of parameter estimation for the nonlinear blocks. An adaptive algorithm is proposed using a pseudo magnitude squared coherence (PMSC) function-based criterion. The proposed method carries out nonlinearity identification without knowing the linear block in the Hammerstein-Wiener system. This is particularly useful for nonlinear acoustic echo cancellation (NAEC) applications, where dealing with the linear and nonlinear blocks together can be computationally challenging due to the long room impulse response. Numerical examples are provided to illustrate the performance of the proposed method.","PeriodicalId":333742,"journal":{"name":"2008 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"27 7","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120905497","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Application of Doppler resilient complementary waveforms to target tracking","authors":"S. Suvorova, W. Moran, S. Howard, A. Calderbank","doi":"10.1109/ICASSP.2008.4517905","DOIUrl":"https://doi.org/10.1109/ICASSP.2008.4517905","url":null,"abstract":"The use of complementary codes as a means of reducing radar range sidelobes is well-known, but lack of resilience to Doppler is often cited as a reason not to deploy them. This work describes techniques for providing Doppler resilience with an emphasis on tailoring Doppler performance to the specific aim of target tracking. The Doppler performance can be varied by suitably changing the order of transmission of multiple sets of complementary waveforms. We have developed a method that improves Doppler performance significantly by arranging the transmission of multiple copies of complementary waveforms according to the first order Reed-Muller codes. Here we demonstrate significant tracking gains in the context of accelerating targets by the use of adaptively chosen waveform sequences of this kind, compared to both a fixed sequence of similar waveforms, and an LFM waveform.","PeriodicalId":333742,"journal":{"name":"2008 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"35 6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120949767","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Language modeling for voice search: A machine translation approach","authors":"Xiao Li, Y. Ju, G. Zweig, A. Acero","doi":"10.1109/ICASSP.2008.4518759","DOIUrl":"https://doi.org/10.1109/ICASSP.2008.4518759","url":null,"abstract":"This paper presents a novel approach to language modeling for voice search based on the idea and method of statistical machine translation. We propose an n-gram based translation model that can be used for listing-to-query translation. We then leverage the query forms translated from listings to improve language modeling. The translation model is trained in an unsupervised manner using a set of transcribed voice search queries. Experiments show that the translation approach yielded drastic perplexity reductions compared with a baseline language model where no translation is applied.","PeriodicalId":333742,"journal":{"name":"2008 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"73 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121114829","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"E-inclusion technologies for the speech handicapped","authors":"Carlos Vaquero, Oscar Saz-Torralba, EDUARDO LLEIDA SOLANO, W. R. Rodríguez-Dueñas","doi":"10.1109/ICASSP.2008.4518658","DOIUrl":"https://doi.org/10.1109/ICASSP.2008.4518658","url":null,"abstract":"This paper addresses the problem that disabled people face when accessing the new systems and technologies that are available nowadays. The use of speech technologies, specially helpful for motor handicapped people, becomes unapproachable when these people also suffer speech impairments, making the gap in the society wider for them. As a way to include speech impaired people in the technological society of today, two lines of work have been carried out. On one hand, a computer-aided speech therapy software has been developed for the speech training of children with different disabilities. This tool, available for free distribution, makes use of different state-of-the-art speech technologies to train different levels of the language. As a result of this work, the software is being used currently in several centers for special education with a very encouraging feedback about the capabilities of the system. On the other hand, research on the use of automatic speech recognition (ASR) systems for the speech impaired has been carried out. This work has focused on current techniques of speaker adaptation to know how these techniques, fruitfully used in other tasks, can deal with this specific kind of speech. The use of Maximum A Posterior (MAP) obtains an improvement of 60.61% compared to the results of a baseline speaker independent model.","PeriodicalId":333742,"journal":{"name":"2008 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121215782","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Speaker and style adaptation using average voice model for style control in HMM-based speech synthesis","authors":"M. Tachibana, Shinsuke Izawa, Takashi Nose, Takao Kobayashi","doi":"10.1109/ICASSP.2008.4518689","DOIUrl":"https://doi.org/10.1109/ICASSP.2008.4518689","url":null,"abstract":"We propose a technique for synthesizing speech with desired style expressivity of an arbitrary target speaker's voice. In an MLLR-based speaker adaptation technique for multiple regression hidden semi-Markov model (MRHSMM), the quality of synthesized speech crucially depends on the initial MRHSMM trained from a certain source speaker's data and it is not always possible to synthesize natural sounding speech with a given target speaker's voice. To overcome this problem, we perform simultaneous adaptation of speaker and style from an average voice model. Experimental results show that the proposed technique provides more natural sounding speech than the conventional one with speaker adaptation only.","PeriodicalId":333742,"journal":{"name":"2008 IEEE International Conference on Acoustics, Speech and Signal Processing","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124883206","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}