2020 International Conference on Signal Processing and Communications (SPCOM): Latest Articles

Intelligibility Improvement of Dysarthric Speech using MMSE DiscoGAN
2020 International Conference on Signal Processing and Communications (SPCOM) Pub Date: 2020-07-01 DOI: 10.1109/SPCOM50965.2020.9179511
Mirali Purohit, Maitreya Patel, Harshit Malaviya, Ankur T. Patil, Mihir Parmar, Nirmesh J. Shah, Savan Doshi, H. Patil
Abstract: Dysarthria is a manifestation of a disorder in the articulatory organs used during speech production, and it results in uneven, slow, slurred, or monotone speech, or speech with an abnormal rhythm. People with dysarthria produce less intelligible speech. Improving the intelligibility of dysarthric speech is challenging because, unlike normal speech, little data is available for dysarthric speech. Dysarthric speech and normal speech are known to differ from both speech production and speech perception perspectives. Recently, Generative Adversarial Network (GAN)-based architectures have become popular for learning such cross-domain relationships efficiently. In this paper, we propose to use Discover GAN (DiscoGAN) along with Mean Square Error (MSE) regularization (i.e., MMSE DiscoGAN) for dysarthric-to-normal speech conversion. In particular, a direct feature-based mapping technique is used to train all the models. Finally, we use Automatic Speech Recognition (ASR) to measure the Phoneme Error Rate (PER) for a particular speaker. The proposed method is compared with a baseline Deep Neural Network (DNN)-based system. Both architectures were trained and evaluated on the UA corpus. Analyzing the results, we observed that MMSE DiscoGAN outperforms the DNN by 13.16% and 9.64% for male and female speakers, respectively. Moreover, the proposed GAN-based frameworks efficiently improve the intelligibility of dysarthric speech and generate more natural-sounding speech than the DNN-based models.
Keywords: Dysarthric Speech, Voice Conversion, DNN, DiscoGAN, ASR.
Citations: 8
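The abstract above reports results as a Phoneme Error Rate (PER) but does not spell out the computation. A minimal sketch follows, assuming the usual definition of PER as the Levenshtein (edit) distance between the reference and recognized phoneme sequences, divided by the reference length. The function names and the example phoneme strings are hypothetical, and the paper's own ASR scoring pipeline may differ.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref to hyp[:j]
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phoneme_error_rate(ref, hyp):
    """PER in percent, relative to the reference length (assumed definition)."""
    if not ref:
        raise ValueError("reference sequence must be non-empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


# Hypothetical example: one substitution (p -> b) and one deletion (ch).
reference = ["s", "p", "iy", "ch"]
recognized = ["s", "b", "iy"]
print(phoneme_error_rate(reference, recognized))  # 50.0
```

A lower PER on converted speech than on the original dysarthric speech would indicate improved intelligibility as judged by the recognizer.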
Sample Complexity Lower Bounds for Compressive Sensing with Generative Models
2020 International Conference on Signal Processing and Communications (SPCOM) Pub Date: 2020-07-01 DOI: 10.1109/SPCOM50965.2020.9179594
Zhaoqiang Liu, J. Scarlett
Abstract: Recently, it has been shown that for compressive sensing, significantly fewer measurements may be required if the sparsity assumption is replaced by the assumption that the unknown vector lies near the range of a suitably chosen generative model. In particular, in (Bora et al., 2017) it was shown that roughly $O(k \log L)$ random Gaussian measurements suffice for accurate recovery when the generative model is an $L$-Lipschitz function with bounded $k$-dimensional inputs, and $O(kd \log w)$ measurements suffice when the generative model is a $k$-input ReLU network with depth $d$ and width $w$. In this paper, we establish corresponding algorithm-independent lower bounds on the sample complexity using tools from minimax statistical analysis. In accordance with the above upper bounds, our results are summarized as follows: (i) We construct an $L$-Lipschitz generative model capable of generating group-sparse signals, and show that the resulting necessary number of measurements is $\Omega(k \log L)$; (ii) Using similar ideas, we construct ReLU networks with high depth and/or high width for which the necessary number of measurements scales as $\Omega\left(kd \frac{\log w}{\log n}\right)$ (with output dimension $n$), and in some cases $\Omega(kd \log w)$. As a result, we establish that the scaling laws derived in (Bora et al., 2017) are optimal or near-optimal in the absence of further assumptions.
Citations: 1
ℓ1/ℓ0 Regularized Conjugate Gradient Based Sparse Adaptive Algorithms
2020 International Conference on Signal Processing and Communications (SPCOM) Pub Date: 2020-07-01 DOI: 10.1109/SPCOM50965.2020.9179548
R. Das
Abstract: Adaptive algorithms in general converge slowly when identifying systems with colored input. In this context, the Adaptive Conjugate Gradient (ACG) algorithm shows fast convergence for colored input. However, the ACG algorithm does not exploit system sparsity. In this paper, conjugate gradient based sparse adaptive algorithms are proposed. In particular, $\ell_1$ and $\ell_0$ norm penalties are added to the cost function of the ACG algorithm in order to attract the inactive taps to their optimum (i.e., zero) levels, and the resulting algorithms yield better steady-state performance. Simulation results show that the proposed algorithm outperforms the recently proposed $\ell_0$-Recursive Least Squares ($\ell_0$-RLS) algorithm.
Citations: 0
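The idea of attracting inactive taps to zero with an $\ell_1$ penalty can be illustrated with a much simpler filter than the one in the abstract above. The sketch below uses the well-known zero-attracting LMS update, not the paper's conjugate gradient algorithm: the ordinary LMS step is followed by a small shrinkage term `rho * sign(w)` that comes from the gradient of the $\ell_1$ norm. The step size `mu`, the attraction strength `rho`, the test system `h`, and all function names are assumptions chosen for illustration.

```python
import random


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def za_lms(x, d, num_taps, mu=0.05, rho=1e-4):
    """Zero-attracting LMS: LMS update plus an l1-penalty shrinkage term."""
    w = [0.0] * num_taps
    buf = [0.0] * num_taps  # tapped delay line, most recent sample first
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))
        e = dn - y
        # Gradient step on the squared error, then pull every tap toward zero.
        w = [wi + mu * e * bi - rho * sign(wi) for wi, bi in zip(w, buf)]
    return w


def simulate(h, num_samples=5000, seed=0):
    """Noise-free output of the FIR system h driven by white uniform input."""
    rng = random.Random(seed)
    x = [rng.uniform(-1.0, 1.0) for _ in range(num_samples)]
    buf = [0.0] * len(h)
    d = []
    for xn in x:
        buf = [xn] + buf[:-1]
        d.append(sum(hi * bi for hi, bi in zip(h, buf)))
    return x, d


# Hypothetical sparse system: only two of the eight taps are active.
h = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, -0.5, 0.0]
x, d = simulate(h)
w = za_lms(x, d, len(h))
print([round(wi, 3) for wi in w])
```

The shrinkage term introduces a small bias on the active taps in exchange for keeping the inactive taps close to zero. This sketch uses white input. The colored-input case that motivates the conjugate gradient approach is not covered here.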
Codes with Availability and different Localities
2020 International Conference on Signal Processing and Communications (SPCOM) Pub Date: 2020-07-01 DOI: 10.1109/SPCOM50965.2020.9179518
Ujwal Deep Kadiyam, Smarajit Das
Abstract: Any code symbol in a locally recoverable code (LRC) can be recovered by accessing at most $r$ other code symbols (called a recovery set). Codes in which any code symbol is protected by a local code of minimum Hamming distance at least $\delta$ are called $(r,\delta)$ codes. If each code symbol has $t$ disjoint recovery sets, then the code is called an LRC with availability. In this letter we consider a new class of codes with availability. In these codes the $l^{\text{th}}$ ($1 \leq l \leq t$) disjoint recovery set for any code symbol has locality $r_l$ and is protected by a local code of minimum Hamming distance at least $\delta_l$. We provide a new definition for availability. We derive an upper bound on the minimum Hamming distance of the new class of codes. We describe an additional property of this code: it allows all the $k$ information symbols to be accessed $t$ times simultaneously, thus providing availability $t$ for all the $k$ information symbols.
Citations: 0
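A toy example may make the notions of recovery set, availability, and differing localities in the abstract above more concrete. The sketch below is not the construction from the paper. It places six data symbols on a hypothetical 2 x 3 grid and adds one XOR parity per row and per column, so every data symbol has $t = 2$ disjoint recovery sets: a row set of size 3 and a column set of size 2. Each local code here is a single parity check, so it only repairs one erasure. All names are illustrative.

```python
from functools import reduce
from operator import xor

ROWS, COLS = 2, 3  # hypothetical grid size


def encode(data):
    """Return (row_parity, col_parity) for a ROWS x COLS grid of integers."""
    row_parity = [reduce(xor, row) for row in data]
    col_parity = [reduce(xor, (data[i][j] for i in range(ROWS)))
                  for j in range(COLS)]
    return row_parity, col_parity


def recover_from_row(data, row_parity, i, j):
    """Rebuild data[i][j] from the other symbols in row i plus the row parity.

    The recovery set has COLS - 1 data symbols and 1 parity: locality 3.
    """
    others = [data[i][c] for c in range(COLS) if c != j]
    return reduce(xor, others, row_parity[i])


def recover_from_col(data, col_parity, i, j):
    """Rebuild data[i][j] from the other symbols in column j plus the column parity.

    The recovery set has ROWS - 1 data symbols and 1 parity: locality 2.
    """
    others = [data[r][j] for r in range(ROWS) if r != i]
    return reduce(xor, others, col_parity[j])


data = [[5, 9, 12], [7, 3, 8]]
row_parity, col_parity = encode(data)
# Both recovery sets for the symbol at (0, 1) return the same value, 9.
print(recover_from_row(data, row_parity, 0, 1),
      recover_from_col(data, col_parity, 0, 1))
```

Because the row set and the column set of a symbol share no other symbol, two readers can reconstruct the same data symbol at the same time from different nodes, which is the practical meaning of availability.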
Overlapped/Non-Overlapped Speech Transition Point Detection Using Bag-of-Audio-Words
2020 International Conference on Signal Processing and Communications (SPCOM) Pub Date: 2020-07-01 DOI: 10.1109/SPCOM50965.2020.9179591
Shikha Baghel, S. Prasanna, P. Guha
Abstract: Overlapped speech refers to an audio signal that contains the speech of two or more speakers speaking simultaneously. Overlapped speech is one of the main sources of error for speaker diarization systems. This work presents an initial study to identify the transition points from overlapped to non-overlapped speech and vice versa. Characteristics of overlapped and non-overlapped speech are examined in terms of the vocal tract system, excitation source, and modulation spectrum. The Hilbert envelope (HE) of the Linear Prediction (LP) residual signal represents the excitation source characteristics of the speech signal. The Sum of Ten Largest Peaks (STLP) of the spectrum and Mel-Frequency Cepstral Coefficients (MFCCs) represent the vocal tract shape information. The modulation spectrum energy (ModSE) captures the information of the slowly varying temporal envelope of speech. A Bag-of-Audio-Words (BoAW) based approach is used to detect the transition points. News debates are one of the main sources of naturally occurring overlapped speech. Therefore, the present work is evaluated on an Indian news debate scenario. A high Identification Rate (IR) and a low Spurious Rate (SR) are observed when all the features are used simultaneously as a 16-dimensional feature (13 MFCCs, HE of LP residual, STLP, and ModSE) for the detection task.
Citations: 0
Predominant Instrument Recognition in Polyphonic Music Using GMM-DNN Framework
2020 International Conference on Signal Processing and Communications (SPCOM) Pub Date: 2020-07-01 DOI: 10.1109/SPCOM50965.2020.9179626
Roshni Ajayakumar, R. Rajan
Abstract: In this paper, predominant instrument recognition in polyphonic music is addressed using timbral descriptors in three frameworks: Gaussian mixture model (GMM), deep neural network (DNN), and hybrid GMM-DNN. Three sets of features, namely, mel-frequency cepstral coefficient (MFCC) features, modified group delay features (MODGDF), and low-level timbral features, are computed, and the experiments are conducted with each individual set and with their early integration. Performance is systematically evaluated using the IRMAS dataset. The results obtained for GMM, DNN, and GMM-DNN are 65.60%, 85.60%, and 93.20%, respectively, on timbral feature fusion. The architectural choice of a DNN using GMM-derived features in the feature fusion paradigm improved system performance. Thus, the proposed experiments demonstrate the potential of timbral descriptors and DNN-based systems in recognizing the predominant instrument in polyphonic music.
Citations: 5
Laser and Nonlinear Phase Noise Tracking and Estimation in Long Haul 400 Gbps Single Carrier PDM-16-QAM Systems using Multi-step Kalman Filter
2020 International Conference on Signal Processing and Communications (SPCOM) Pub Date: 2020-07-01 DOI: 10.1109/SPCOM50965.2020.9179555
Srishti Sharma, P. Krishnamurthy
Abstract: Coherent optical communication system performance is often limited by laser phase noise (LPN) and nonlinear phase noise (NLPN). An $n$-step Kalman filter to track and estimate LPN and NLPN is proposed for a 400 Gbps PDM-16-QAM coherent optical system. Simulations show that for a 100 kHz laser linewidth, when the sampling factor of the filter is reduced by 10, a Q-factor of 14.5 dB can be achieved over 1200 km of transmission, and $< 1$ dB degradation is observed for $n = 20$. Q-factor curves indicate that the multi-step Kalman filter (MKF) performs better than the linear step Kalman filter. MKF with $m = 10$ can successfully mitigate the LPN and NLPN for the proposed system with a maximum laser linewidth tolerance of up to 1 MHz over $40 \times 80$ km. Results show that MKF can optimally track carrier phase noise and Kerr nonlinearities, thereby validating our proposed filter design.
Citations: 1
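For readers unfamiliar with Kalman-based phase tracking, the following is a minimal sketch of a standard single-step scalar Kalman filter, not the multi-step filter proposed in the abstract above. It assumes the common model in which laser phase noise is a Wiener process (a random walk) whose per-symbol variance is $2\pi \Delta\nu T_s$ for linewidth $\Delta\nu$ and symbol period $T_s$, and it assumes a direct, unwrapped noisy phase measurement. The 50 GBaud symbol rate in the example is an assumption (400 Gbps divided by 8 bits per PDM-16-QAM symbol, ignoring overhead), and all function names are hypothetical.

```python
import math


def wiener_phase_variance(linewidth_hz, symbol_period_s):
    """Per-symbol variance of laser phase noise modelled as a Wiener process."""
    return 2.0 * math.pi * linewidth_hz * symbol_period_s


def kalman_track(measurements, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter for x[k] = x[k-1] + w[k], z[k] = x[k] + v[k].

    q is the process-noise variance, r the measurement-noise variance.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + q            # predict: a random walk only inflates uncertainty
        k = p / (p + r)      # Kalman gain
        x = x + k * (z - x)  # correct the estimate with the innovation
        p = (1.0 - k) * p    # posterior variance
        estimates.append(x)
    return estimates


# Process-noise variance for a 100 kHz linewidth at an assumed 50 GBaud.
q = wiener_phase_variance(100e3, 1.0 / 50e9)
print(q)

# Sanity check: a constant phase offset of 1 rad is tracked from x0 = 0.
estimates = kalman_track([1.0] * 200, q=1e-3, r=1e-1)
print(round(estimates[-1], 6))
```

In a real receiver the phase measurement comes from the received symbols (for example through decisions or pilots) and is subject to wrapping, which this sketch ignores.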
Online Service Policies for Content Delivery
2020 International Conference on Signal Processing and Communications (SPCOM) Pub Date: 2020-07-01 DOI: 10.1109/SPCOM50965.2020.9179501
Kota Srinivas Reddy, Arunabh Saxena, Sharayu Moharir, N. Karamchandani
Abstract: Content Delivery Networks (CDNs) are an essential component of Video on Demand (VoD) services. We consider a content delivery system comprising a central server connected to several co-located caches, each with limited storage and service capabilities. We evaluate the performance of our storage policy when coupled with online service policies as a function of cache size, file library size, and the number of caches. We show that for file libraries with Zipf popularity profiles, our storage policy, when coupled with any online service policy that allocates each request to a cache if possible, is as effective as the best offline service policy in the literature. The advantage of online service policies is that they have lower time complexity than offline service policies. We also support our theoretical results via simulations for finite-size systems.
Citations: 0
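The abstract above assumes file libraries with Zipf popularity profiles. As a small illustration of what that means, the sketch below builds a Zipf distribution in which the file of rank $i$ is requested with probability proportional to $i^{-\beta}$, and computes the fraction of requests that target the most popular files. It does not implement the storage or service policies of the paper. The library size, the exponent `beta`, and the function names are assumptions.

```python
def zipf_pmf(num_files, beta):
    """Request probabilities for files ranked 1..num_files under a Zipf law."""
    weights = [rank ** -beta for rank in range(1, num_files + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def top_mass(pmf, cache_size):
    """Fraction of requests that target the cache_size most popular files."""
    return sum(pmf[:cache_size])


# Hypothetical library of 1000 files with Zipf exponent 0.8.
pmf = zipf_pmf(1000, 0.8)
print(round(top_mass(pmf, 100), 3))  # share of requests hitting the top 10% of files
```

The heavier the skew (larger `beta`), the larger the share of requests that a small cache holding only the most popular files can absorb.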
Sliding Eigenvalue Decomposition for Non-stationary Signal Analysis
2020 International Conference on Signal Processing and Communications (SPCOM) Pub Date: 2020-07-01 DOI: 10.1109/SPCOM50965.2020.9179495
V. Singh, R. B. Pachori
Abstract: Decomposition of multi-component signals has gained popularity in time-frequency analysis (TFA) of non-stationary signals. Eigenvalue decomposition (EVD) is one such technique, which decomposes signals into mono-components. In this paper, a new approach named sliding EVD is proposed for non-stationary signal decomposition. The sliding EVD comprises short-duration EVD of signals and an unsupervised grouping of the obtained components. The proposed algorithm surpasses other EVD-based techniques by successfully decomposing signals that overlap in the frequency domain but are separated in the time-frequency domain. Hilbert spectral analysis is then applied to the decomposed mono-components to obtain the time-frequency distribution (TFD). Finally, the proposed method is compared with the Hilbert-Huang transform and is found to provide a better TFD.
Citations: 1
Sparse Framework for Personal Sound Reproduction with Differential Phase Constraint
2020 International Conference on Signal Processing and Communications (SPCOM) Pub Date: 2020-07-01 DOI: 10.1109/SPCOM50965.2020.9179508
Ajay Dagar, R. Hegde
Abstract: Sparse multi-zone sound reproduction provides multiple users with a personalized sound experience. The primary objective of these methods is to minimize the reproduction error in the zones of interest while simultaneously reducing the number of active loudspeakers. The accuracy of these methods depends on how well the reproduced sound field resembles the desired sound field. In this work, a sparse framework for personal sound reproduction using a differential phase constraint is developed. The additional constraint enhances the spatial quality of the reproduced sound field in the bright zones by preserving the phase component from the direction of interest. The formulated optimization problem is tested in a simulated environment under different scenarios. The performance of the proposed framework is evaluated objectively on the basis of the obtained acoustic contrast, reproduction error, and directivity of the reproduced sound fields.
Citations: 0
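Two of the evaluation metrics named in the abstract above, acoustic contrast and reproduction error, can be sketched under their commonly used definitions. Acoustic contrast is taken as the ratio of mean squared pressure in the bright zone to that in the dark zone, and reproduction error as the normalized squared difference between reproduced and desired pressures, both in dB. The paper may use different normalizations. The sample pressures and function names are hypothetical.

```python
import math


def mean_energy(pressures):
    """Mean squared magnitude of complex sound pressures at the control points."""
    return sum(abs(p) ** 2 for p in pressures) / len(pressures)


def acoustic_contrast_db(bright, dark):
    """Bright-zone to dark-zone energy ratio in dB (assumed definition)."""
    return 10.0 * math.log10(mean_energy(bright) / mean_energy(dark))


def reproduction_error_db(reproduced, desired):
    """Normalized squared error between reproduced and desired fields in dB."""
    num = sum(abs(p - d) ** 2 for p, d in zip(reproduced, desired))
    den = sum(abs(d) ** 2 for d in desired)
    return 10.0 * math.log10(num / den)


# Hypothetical pressures at two control points per zone.
bright = [1 + 0j, 0 + 1j]
dark = [0.1 + 0j, 0 - 0.1j]
print(round(acoustic_contrast_db(bright, dark), 3))  # 20.0
```

A higher acoustic contrast means better isolation between listeners, while a lower (more negative) reproduction error means the bright-zone field is closer to the desired one.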