2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): Latest Papers

Reduced Dimension Minimum BER PSK Precoding for Constrained Transmit Signals in Massive MIMO
A. L. Swindlehurst, H. Jedda, I. Fijalkow
{"title":"Reduced Dimension Minimum BER PSK Precoding for Constrained Transmit Signals in Massive MIMO","authors":"A. L. Swindlehurst, H. Jedda, I. Fijalkow","doi":"10.1109/ICASSP.2018.8461642","DOIUrl":"https://doi.org/10.1109/ICASSP.2018.8461642","url":null,"abstract":"Recently a number of nonlinear precoding algorithms have been developed for designing a downlink transmit signal that is constrained by some nonlinearity, such as one-bit quantization, power-amplifier saturation or constant modulus. These methods use iterative search algorithms to directly design the signal that is transmitted from each antenna. Since the dimension of the search space equals the number of antennas, the computational complexity of these approaches can be high for massive MIMO scenarios. Thus, in this paper we pose the problem in a smaller dimensional space by constraining the signal prior to the nonlinearity to be the output of a linear precoder. The search is then over the vector of predistorted symbols at the input to the linear precoder, which is typically much smaller than the number of antennas. We focus on algorithms that minimize the bit error rate at the receivers, and show that performance can be obtained that is similar to algorithms that operate directly in the antenna domain.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"39 1","pages":"3584-3588"},"PeriodicalIF":0.0,"publicationDate":"2018-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89708378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
Low Complexity Joint RDO of Prediction Units Couples for HEVC Intra Coding
Maxime Bichon, J. L. Tanou, M. Ropert, W. Hamidouche, L. Morin, Lu Zhang
{"title":"Low Complexity Joint RDO of Prediction Units Couples for HEVC Intra Coding","authors":"Maxime Bichon, J. L. Tanou, M. Ropert, W. Hamidouche, L. Morin, Lu Zhang","doi":"10.1109/ICASSP.2018.8462489","DOIUrl":"https://doi.org/10.1109/ICASSP.2018.8462489","url":null,"abstract":"HEVC is the latest block-based video compression standard, outperforming H.264/AVC by 50% bitrate savings for the same perceptual quality. An HEVC encoder provides Rate-Distortion optimization coding tools for block-wise compression. Because of complexity limitations, Rate-Distortion Optimization (RDO) is usually performed independently for each block, assuming coding efficiency losses to be negligible. In this paper, we propose an acceleration solution for the Intra coding scheme named Dual-JRDO, which takes advantage of Inter-Block dependencies related to both predictive coding and CABAC. The Dual-JRDO improves Intra coding efficiency at the expense of higher computational complexity. The acceleration of the Dual-JRDO scheme includes adaptive use of the Dual-JRDO model based on source analysis, short-listing and early decisions strategies. The proposed Fast Dual-JRDO reduces the original model complexity by 89.54%, while providing tractable computation for average R-D gains of −0.45% (up to −0.82%) in the HM16.12 reference software model.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"15 1","pages":"1733-1737"},"PeriodicalIF":0.0,"publicationDate":"2018-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89820456","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Non-Native Children Speech Recognition Through Transfer Learning
M. Matassoni, R. Gretter, D. Falavigna, D. Giuliani
{"title":"Non-Native Children Speech Recognition Through Transfer Learning","authors":"M. Matassoni, R. Gretter, D. Falavigna, D. Giuliani","doi":"10.1109/ICASSP.2018.8462059","DOIUrl":"https://doi.org/10.1109/ICASSP.2018.8462059","url":null,"abstract":"This work deals with non-native children's speech and investigates both multi-task and transfer learning approaches to adapt a multi-language Deep Neural Network (DNN) to speakers, specifically children, learning a foreign language. The application scenario is characterized by young students learning English and German and reading sentences in these second-languages, as well as in their mother language. The paper analyzes and discusses techniques for training effective DNN-based acoustic models starting from children's native speech and performing adaptation with limited non-native audio material. A multi-lingual model is adopted as baseline, where a common phonetic lexicon, defined in terms of the units of the International Phonetic Alphabet (IPA), is shared across the three languages at hand (Italian, German and English); DNN adaptation methods based on transfer learning are evaluated on significant non-native evaluation sets. Results show that the resulting non-native models allow a significant improvement with respect to a mono-lingual system adapted to speakers of the target language.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"137 1","pages":"6229-6233"},"PeriodicalIF":0.0,"publicationDate":"2018-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75301711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 36
Ranking Using Transition Probabilities Learned from Multi-Attribute Data
Sigurd Løkse, R. Jenssen
{"title":"Ranking Using Transition Probabilities Learned from Multi-Attribute Data","authors":"Sigurd Løkse, R. Jenssen","doi":"10.1109/ICASSP.2018.8462132","DOIUrl":"https://doi.org/10.1109/ICASSP.2018.8462132","url":null,"abstract":"In this paper, as a novel approach, we learn Markov chain transition probabilities for ranking of multi-attribute data from the inherent structures in the data itself. The procedure is inspired by consensus clustering and exploits a suitable form of the PageRank algorithm. This is very much in the spirit of the original PageRank utilizing the hyperlink structure to learn such probabilities. As opposed to existing approaches for ranking multi-attribute data, our method is not dependent on tuning of critical user-specified parameters. Experiments show the benefits of the proposed method.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"1 1","pages":"2851-2855"},"PeriodicalIF":0.0,"publicationDate":"2018-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88983114","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Synthesis of Images by Two-Stage Generative Adversarial Networks
Qiang Huang, P. Jackson, Mark D. Plumbley, Wenwu Wang
{"title":"Synthesis of Images by Two-Stage Generative Adversarial Networks","authors":"Qiang Huang, P. Jackson, Mark D. Plumbley, Wenwu Wang","doi":"10.1109/ICASSP.2018.8461984","DOIUrl":"https://doi.org/10.1109/ICASSP.2018.8461984","url":null,"abstract":"In this paper, we propose a divide-and-conquer approach using two generative adversarial networks (GANs) to explore how a machine can draw colorful pictures (bird) using a small amount of training data. In our work, we simulate the procedure of an artist drawing a picture, where one begins with drawing objects' contours and edges and then paints them different colors. We adopt two GAN models to process basic visual features including shape, texture and color. We use the first GAN model to generate object shape, and then paint the black and white image based on the knowledge learned using the second GAN model. We run our experiments on 600 color images. The experimental results show that the use of our approach can generate good quality synthetic images, comparable to real ones.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"9 1","pages":"1593-1597"},"PeriodicalIF":0.0,"publicationDate":"2018-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77001369","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Pulse-Stream Models in Time-of-Flight Imaging
Adrien Besson, Dimitris Perdios, Y. Wiaux, J. Thiran
{"title":"Pulse-Stream Models in Time-of-Flight Imaging","authors":"Adrien Besson, Dimitris Perdios, Y. Wiaux, J. Thiran","doi":"10.1109/ICASSP.2018.8461767","DOIUrl":"https://doi.org/10.1109/ICASSP.2018.8461767","url":null,"abstract":"This paper considers the problem of reconstructing raw signals from random projections in the context of time-of-flight imaging with an array of sensors. It presents a new signal model, coined as multi-channel pulse-stream model, which exploits pulse-stream models and accounts for additional structure induced by inter-sensor dependencies. We propose a sampling theorem and a reconstruction algorithm, based on ℓ -minimization, for signals belonging to such a model. We demonstrate the benefits of the proposed approach by means of numerical simulations and on a real non-destructive-evaluation application where the peak-signal-to-noise-ratio is increased by 3 dB compared to standard compressed-sensing strategies.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"1 1","pages":"3389-3393"},"PeriodicalIF":0.0,"publicationDate":"2018-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78417803","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
EMG Acquisition and Hand Pose Classification for Bionic Hands from Randomly-Placed Sensors
Sumit A. Raurale, J. McAllister, J. M. D. Rincón
{"title":"Emg Acquisition and Hand Pose Classification for Bionic Hands from Randomly-Placed Sensors","authors":"Sumit A. Raurale, J. McAllister, J. M. D. Rincón","doi":"10.1109/ICASSP.2018.8462409","DOIUrl":"https://doi.org/10.1109/ICASSP.2018.8462409","url":null,"abstract":"This paper presents a unique real-time motion recognition system for Electromyographic (EMG) signal acquisition and classification. It is the first approach which can classify hand poses from multi-channel EMG signals gathered from randomly placed arm sensors as accurately as current placed-sensor EMG acquisition approaches. It combines time-domain feature extraction, Linear Discriminant Analysis (LDA) feature projection and Multilayer Perceptron (MLP) classification to allow nine distinct poses to be correctly identified more than 95% of the time. This is comparable to state-of-the-art placed-sensor EMG acquisition systems. Processing times of 11.70 ms also make this a viable candidate approach for real-time EMG acquisition and processing in practical prosthesis applications.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"50 1","pages":"1105-1109"},"PeriodicalIF":0.0,"publicationDate":"2018-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86457749","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 17
Statistical t+2D Subband Modelling for Crowd Counting
Deepayan Bhowmik, A. Wallace
{"title":"Statistical T+2d Subband Modelling for Crowd Counting","authors":"Deepayan Bhowmik, A. Wallace","doi":"10.1109/ICASSP.2018.8462345","DOIUrl":"https://doi.org/10.1109/ICASSP.2018.8462345","url":null,"abstract":"Counting people automatically in a crowded scenario is important to assess safety and to determine behaviour in surveillance operations. In this paper we propose a new algorithm using the statistics of the spatio-temporal wavelet subbands. A t+2D lifting based wavelet transform is exploited to generate a motion saliency map which is then used to extract novel parametric statistical texture features. We compare our approach to existing crowd counting approaches and show improvement on standard benchmark sequences, demonstrating the robustness of the extracted features.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"27 1","pages":"1533-1537"},"PeriodicalIF":0.0,"publicationDate":"2018-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78030925","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Inexact Proximal Operators for $\ell_{p}$-Quasinorm Minimization
Cian O'Brien, Mark D. Plumbley
{"title":"Inexact Proximal Operators for $\\ell_{p}$-Quasinorm Minimization","authors":"Cian O'Brien, Mark D. Plumbley","doi":"10.1109/ICASSP.2018.8462524","DOIUrl":"https://doi.org/10.1109/ICASSP.2018.8462524","url":null,"abstract":"Proximal methods are an important tool in signal processing applications, where many problems can be characterized by the minimization of an expression involving a smooth fitting term and a convex regularization term - for example the classic $\\ell_{1}$-Lasso. Such problems can be solved using the relevant proximal operator. Here we consider the use of proximal operators for the $\\ell_{p}$-quasinorm where $0 \\leq p \\leq 1$. Rather than seek a closed form solution, we develop an iterative algorithm using a Majorization-Minimization procedure which results in an inexact operator. Experiments on image denoising show that for $p \\leq 1$ the algorithm is effective in the high-noise scenario, outperforming the Lasso despite the inexactness of the proximal step.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"18 1","pages":"4724-4728"},"PeriodicalIF":0.0,"publicationDate":"2018-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81441379","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Speech Segment Clustering for Real-Time Exemplar-Based Speech Enhancement
David Nesbitt, D. Crookes, J. Ming
{"title":"Speech Segment Clustering for Real-Time Exemplar-Based Speech Enhancement","authors":"David Nesbitt, D. Crookes, J. Ming","doi":"10.1109/ICASSP.2018.8461689","DOIUrl":"https://doi.org/10.1109/ICASSP.2018.8461689","url":null,"abstract":"Exemplar-based (or Corpus-based) speech enhancement algorithms have great potential but are typically slow due to needing to search through the entire corpus. The properties of speech can be exploited to improve these algorithms. Firstly, a corpus can be clustered by a phonetic ordering into a search tree which can be used to find a best matching segment. This dramatically reduces the search space, reducing the time complexity of searching a corpus of $n$ segments from O(n) to O(log(n)). Secondly, clustering can be used to give a lossy compression of a speech corpus by replacing original segments with codewords. These techniques are shown in comparison with sequential search and non-compressed corpora using a simple speech enhancement algorithm. A combination of these techniques for a corpus of a quarter of WSJ0 results in a speedup of approximately 3000x.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"55 1","pages":"5419-5423"},"PeriodicalIF":0.0,"publicationDate":"2018-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91480521","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2