{"title":"A real-time example-based single-image super-resolution algorithm via cross-scale high-frequency components self-learning","authors":"Chang Su, Li Tao","doi":"10.1109/ICASSP.2016.7471962","DOIUrl":"https://doi.org/10.1109/ICASSP.2016.7471962","url":null,"abstract":"In this paper, we propose a fast and dictionary-free example-based super-resolution (EBSR) algorithm to solve the contradiction in EBSR methods of their high performance in achieving high visual quality and their low efficiency and high costs. With a novel cross-scale high-frequency components (HFC) self-learning strategy, the missed HFC of a high-resolution (HR) image are approximated from its low-resolution counterparts. A high-quality estimation of the HR image is thus obtained by compensating the HFC to its initial guess. Simulations show that the proposed algorithm gets comparable results to the state-of-the-art EBSR but with much higher efficiency and lower costs.","PeriodicalId":6443,"journal":{"name":"2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"49 1","pages":"1676-1680"},"PeriodicalIF":0.0,"publicationDate":"2016-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81532488","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pseudo-coherence-based MVDR beamformer for speech enhancement with ad hoc microphone arrays","authors":"Vincent Mohammad Tavakoli, Jesper Rindom Jensen, Mads Graesboll Christensen, Jacob Benesty","doi":"10.1109/ICASSP.2015.7178453","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7178453","url":null,"abstract":"Speech enhancement with distributed arrays has been met with various methods. On the one hand, data independent methods require information about the position of sensors, so they are not suitable for dynamic geometries. On the other hand, Wiener-based methods cannot assure a distortionless output. This paper proposes minimum variance distortionless response filtering based on multichannel pseudo-coherence for speech enhancement with ad hoc microphone arrays. This method requires neither position information nor control of the trade-off used in the distortion weighted methods. Furthermore, certain performance criteria are derived in terms of the pseudo-coherence vector, and the method is compared with the multichannel Wiener filter. Evaluation shows the suitability of the proposed method in terms of noise reduction with minimum distortion in ad hoc scenarios.","PeriodicalId":6443,"journal":{"name":"2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"1 1","pages":"2659-2663"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81468499","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An outreach after-school program to introduce high-school students to electrical engineering","authors":"Monica F. Bugallo, Angela M. Kelly","doi":"10.1109/ICASSP.2015.7179031","DOIUrl":"https://doi.org/10.1109/ICASSP.2015.7179031","url":null,"abstract":"We report on a university-based pilot initiative to introduce students in grades 9–12 to electrical engineering practices. The after-school program consisted of two modules of four two-hour sessions and targeted students from two different local schools. They were exposed to hands-on electronic activities as well as programming practices related to image processing. The data collected from weekly surveys revealed that students found the program more challenging and engaging as the course progressed and they were motivated to pursue future engineering study. Additional schools in the region have requested the opportunity for their students to participate in the program at the university.","PeriodicalId":6443,"journal":{"name":"2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"20 1","pages":"5540-5544"},"PeriodicalIF":0.0,"publicationDate":"2015-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73801141","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The convergence guarantees of a non-convex approach for sparse recovery using regularized least squares","authors":"Laming Chen, Yuantao Gu","doi":"10.1109/ICASSP.2014.6854221","DOIUrl":"https://doi.org/10.1109/ICASSP.2014.6854221","url":null,"abstract":"Existing literatures suggest that sparsity is more likely to be induced with non-convex penalties, but the corresponding algorithms usually suffer from multiple local minima. In this paper, we introduce a class of sparsity-inducing penalties and provide the convergence guarantees of a non-convex approach for sparse recovery using regularized least squares. Theoretical analysis demonstrates that under some certain conditions, if the non-convexity of the penalty is below a threshold (which is in inverse proportion to the distance between the initialization and the sparse signal), the sparse signal can be stably recovered. Numerical simulations are implemented to verify the theoretical results in this paper and to compare the performance of this approach with other references.","PeriodicalId":6443,"journal":{"name":"2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"162 1","pages":"3350-3354"},"PeriodicalIF":0.0,"publicationDate":"2014-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73486620","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pitch adaptive training for hmm-based singing voice synthesis","authors":"K. Shirota, Kazuhiro Nakamura, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, K. Tokuda","doi":"10.1109/ICASSP.2014.6854062","DOIUrl":"https://doi.org/10.1109/ICASSP.2014.6854062","url":null,"abstract":"A statistical parametric approach to singing voice synthesis based on hidden Markov Models (HMMs) has been growing in popularity over the last few years. The spectrum, excitation, vibrato, and duration of singing voices in this approach are simultaneously modeled with context-dependent HMMs and waveforms are generated from the HMMs themselves. HMM-based singing voice synthesis systems are heavily based on the training data in performance because these systems are “corpus-based.” Therefore, HMMs corresponding to contextual factors that hardly ever appear in the training data cannot be well-trained. Pitch should especially be correctly covered since generated F0 trajectories have a great impact on the subjective quality of synthesized singing voices. We applied the method of “speaker adaptive training” (SAT) to “pitch adaptive training,” which is discussed in this paper. This technique made it possible to normalize pitch based on musical notes in the training process. The experimental results demonstrated that the proposed technique could alleviate the data sparseness problem.","PeriodicalId":6443,"journal":{"name":"2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"31 1","pages":"5377-5380"},"PeriodicalIF":0.0,"publicationDate":"2014-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86653818","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Auditory attention based mobile audio quality assessment","authors":"Yuhong Yang, Hongjiang Yu, R. Hu, Li Gao, Song Wang, Qing Zhai, Songbo Xie","doi":"10.1109/ICASSP.2014.6853825","DOIUrl":"https://doi.org/10.1109/ICASSP.2014.6853825","url":null,"abstract":"","PeriodicalId":6443,"journal":{"name":"2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"25 1","pages":"1389-1393"},"PeriodicalIF":0.0,"publicationDate":"2014-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73749375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Large deviation delay analysis of queue-aware multi-user MIMO systems with two timescale mobile-driven feedback","authors":"Junting Chen, V. Lau","doi":"10.1109/ICASSP.2013.6638620","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6638620","url":null,"abstract":"Multi-user multi-input-multi-output (MU-MIMO) systems usually require users to feedback the channel state information (CSI) for scheduling. Most of the existing literature on the reduced feedback user scheduling focused on the throughput performance and the queueing delay was usually ignored. As the delay is important for real-time applications, it is desirable to have a low feedback queue-aware user scheduling algorithm for MU-MIMO systems. This paper proposes a two timescale queue-aware user scheduling algorithm, which consists of a queue-aware mobile-driven feedback filtering stage and a SINR-based user scheduling stage. The feedback policy is obtained by solving a queue-weighted optimization problem. In addition, we evaluate the associated queueing delay performance by using the large deviation analysis. The large deviation decay rate for the proposed algorithm is shown to be much larger than the CSI-only scheduling algorithm. Numerical results demonstrate the large performance gain of the proposed algorithm over the CSI-only algorithm, while the proposed one requires only a small amount of feedback.","PeriodicalId":6443,"journal":{"name":"2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"89 1","pages":"5036-5040"},"PeriodicalIF":0.0,"publicationDate":"2013-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84232648","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A diagonalized newton algorithm for non-negative sparse coding","authors":"H. V. hamme","doi":"10.1109/ICASSP.2013.6639080","DOIUrl":"https://doi.org/10.1109/ICASSP.2013.6639080","url":null,"abstract":"Signal models where non-negative vector data are represented by a sparse linear combination of non-negative basis vectors have attracted much attention in problems including image classification, document topic modeling, sound source segregation and robust speech recognition. In this paper, an iterative algorithm based on Newton updates to minimize the Kullback-Leibler divergence between data and model is proposed. It finds the sparse activation weights of the basis vectors more efficiently than the expectation-maximization (EM) algorithm. To avoid the computational burden of a matrix inversion, a diagonal approximation is made and therefore the algorithm is called diagonal Newton Algorithm (DNA). It is several times faster than EM, especially for undercomplete problems. But DNA also performs surprisingly well on overcomplete problems.","PeriodicalId":6443,"journal":{"name":"2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"29 1","pages":"7299-7303"},"PeriodicalIF":0.0,"publicationDate":"2013-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73828322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lossy compression of sparse histogram image","authors":"M. Iwahashi, H. Kobayashi, H. Kiya","doi":"10.1109/ICASSP.2012.6288143","DOIUrl":"https://doi.org/10.1109/ICASSP.2012.6288143","url":null,"abstract":"In this paper, a lossy data compression for a sparse histogram image signal is proposed. It is extended from an existing lossless coding which is based on a lossless histogram packing and a lossless coding. We introduce a lossy mapping, which has less computational load than the rate-distortion optimized Lloyd-Max quantization, and combine it with a lossless coding. It was confirmed that the proposed method attains higher performance in the rate-distortion plane than existing methods. This is because it can utilize histogram sparseness of images, and also its inverse mapping does not magnify quantization noise.","PeriodicalId":6443,"journal":{"name":"2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"409 1","pages":"1361-1364"},"PeriodicalIF":0.0,"publicationDate":"2012-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77355736","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Effect of anti-aliasing filtering on the quality of speech from an HMM-based synthesizer","authors":"Y. Shiga","doi":"10.1109/ICASSP.2012.6288924","DOIUrl":"https://doi.org/10.1109/ICASSP.2012.6288924","url":null,"abstract":"This paper investigates how the quality of speech produced through statistical parametric synthesis is affected by anti-aliasing filtering, i.e., low-pass filtering that is applied prior to (down-) sampling prerecorded speech at a desired rate. It has empirically been known that the frequency response of such anti-aliasing filters influences the quality of speech synthesized to a considerable degree. For the purpose of understanding such influence more clearly, in this paper we examine the spectral aspects of speech involved in the processes of HMM training and synthesis. We then propose a technique of feature extraction that can avoid producing the roll-off feature of the frequency response near the Nyquist frequency, which is found to be the major cause of speech quality degradation resulting from anti-aliasing filtering. In the technique, the spectrum is first computed from speech at a sampling rate higher than the desired rate, then it is truncated so that its frequency range above the target Nyquist frequency is discarded, and finally the truncated spectrum is converted directly into the cepstrum. Listening test results show that the proposed technique enables training HMMs efficiently with a limited number of model parameters and effectively with less artifacts in the speech synthesized at a desired sampling rate.","PeriodicalId":6443,"journal":{"name":"2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"93 20 1","pages":"4525-4528"},"PeriodicalIF":0.0,"publicationDate":"2012-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83488176","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}