2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): Latest Papers

Exploiting Convolutional Neural Networks for Phonotactic Based Dialect Identification
M. Najafian, Sameer Khurana, Suwon Shon, Ahmed Ali, James R. Glass
{"title":"Exploiting Convolutional Neural Networks for Phonotactic Based Dialect Identification","authors":"M. Najafian, Sameer Khurana, Suwon Shon, Ahmed Ali, James R. Glass","doi":"10.1109/ICASSP.2018.8461486","DOIUrl":"https://doi.org/10.1109/ICASSP.2018.8461486","url":null,"abstract":"In this paper, we investigate different approaches for Dialect Identification (DID) in Arabic broadcast speech. Dialects differ in their inventory of phonological segments. This paper proposes a new phonotactic based feature representation approach which enables discrimination among different occurrences of the same phone n-grams with different phone duration and probability statistics. To achieve further gain in accuracy we used multi-lingual phone recognizers, trained separately on Arabic, English, Czech, Hungarian and Russian languages. We use Support Vector Machines (SVMs), and Convolutional Neural Networks (CNNs) as backend classifiers throughout the study. The final system fusion results in 24.7% and 19.0% relative error rate reduction compared to that of a conventional phonotactic DID, and i-vectors with bottleneck features.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"7 1","pages":"5174-5178"},"PeriodicalIF":0.0,"publicationDate":"2018-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82750562","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 34
A STEM REU Site on the Integrated Design of Sensor Devices and Signal Processing Algorithms
A. Spanias, J. Christen
{"title":"A Stem Reu Site on the Integrated Design of Sensor Devices and Signal Processing Algorithms","authors":"A. Spanias, J. Christen","doi":"10.1109/ICASSP.2018.8462483","DOIUrl":"https://doi.org/10.1109/ICASSP.2018.8462483","url":null,"abstract":"Arizona State University (ASU) established an NSF Research Experiences for Undergraduates (REU) site to embed students in research projects related to integrated sensor and signal processing systems. The program includes both sensor hardware and algorithm/software design for a variety of applications including health monitoring. The site was funded in February 2017 and the Co-PIs recruited nine students from different universities and community colleges to spend the summer of 2017 in research laboratories at ASU. The program included structured training with modules in sensor design, signal processing, and machine learning. Cross-cutting training included research ethics, IEEE manuscript development, and building presentation skills. Nine undergraduate research projects were launched and the program went through an assessment by an independent evaluator. This paper describes the REU activities, modules, training, projects, and their assessment.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"73 1","pages":"6991-6995"},"PeriodicalIF":0.0,"publicationDate":"2018-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91121650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
An End-to-End Language-Tracking Speech Recognizer for Mixed-Language Speech
Hiroshi Seki, Shinji Watanabe, Takaaki Hori, Jonathan Le Roux, J. Hershey
{"title":"An End-to-End Language-Tracking Speech Recognizer for Mixed-Language Speech","authors":"Hiroshi Seki, Shinji Watanabe, Takaaki Hori, Jonathan Le Roux, J. Hershey","doi":"10.1109/ICASSP.2018.8462180","DOIUrl":"https://doi.org/10.1109/ICASSP.2018.8462180","url":null,"abstract":"End-to-end automatic speech recognition (ASR) can significantly reduce the burden of developing ASR systems for new languages, by eliminating the need for linguistic information such as pronunciation dictionaries. This also creates an opportunity to build a monolithic multilingual ASR system with a language-independent neural network architecture. In our previous work, we proposed a monolithic neural network architecture that can recognize multiple languages, and showed its effectiveness compared with conventional language-dependent models. However, the model is not guaranteed to properly handle switches in language within an utterance, thus lacking the flexibility to recognize mixed-language speech such as code-switching. In this paper, we extend our model to enable dynamic tracking of the language within an utterance, and propose a training procedure that takes advantage of a newly created mixed-language speech corpus. Experimental results show that the extended model outperforms both language-dependent models and our previous model without suffering from performance degradation that could be associated with language switching.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"54 1","pages":"4919-4923"},"PeriodicalIF":0.0,"publicationDate":"2018-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80791050","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 58
Parametric Approximation of Piano Sound Based on Kautz Model with Sparse Linear Prediction
Kenji Kobayashi, Daiki Takeuchi, Mio Iwamoto, K. Yatabe, Yasuhiro Oikawa
{"title":"Parametric Approximation of Piano Sound Based on Kautz Model with Sparse Linear Prediction","authors":"Kenji Kobayashi, Daiki Takeuchi, Mio Iwamoto, K. Yatabe, Yasuhiro Oikawa","doi":"10.1109/ICASSP.2018.8461547","DOIUrl":"https://doi.org/10.1109/ICASSP.2018.8461547","url":null,"abstract":"The piano is one of the most popular and attractive musical instruments that leads to a lot of research on it. To synthesize the piano sound in a computer, many modeling methods have been proposed from full physical models to approximated models. The focus of this paper is on the latter, approximating piano sound by an IIR filter. For stably estimating parameters, the Kautz model is chosen as the filter structure. Then, the selection of poles and excitation signal rises as the questions which are typical to the Kautz model that must be solved. In this paper, sparsity based construction of the Kautz model is proposed for approximating piano sound.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"39 1","pages":"626-630"},"PeriodicalIF":0.0,"publicationDate":"2018-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84635851","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Variational Deep Learning for Low-Dose Computed Tomography
Erich Kobler, Matthew Muckley, Baiyu Chen, F. Knoll, K. Hammernik, T. Pock, D. Sodickson, R. Otazo
{"title":"Variational Deep Learning for Low-Dose Computed Tomography","authors":"Erich Kobler, Matthew Muckley, Baiyu Chen, F. Knoll, K. Hammernik, T. Pock, D. Sodickson, R. Otazo","doi":"10.1109/ICASSP.2018.8462312","DOIUrl":"https://doi.org/10.1109/ICASSP.2018.8462312","url":null,"abstract":"In this work, we propose a learning-based variational network (VN) approach for reconstruction of low-dose 3D computed tomography data. We focus on two methods to decrease the radiation dose: (1) x-ray tube current reduction, which reduces the signal-to-noise ratio, and (2) x-ray beam interruption, which undersamples data and results in images with aliasing artifacts. While the learned VN denoises the current-reduced images in the first case, it reconstructs the undersampled data in the second case. Different VNs for denoising and reconstruction are trained on a single clinical 3D abdominal data set. The VNs are compared against state-of-the-art model-based denoising and sparse reconstruction techniques on a different clinical abdominal 3D data set with 4-fold dose reduction. Our results suggest that the proposed VNs enable higher radiation dose reductions and/or increase the image quality for a given dose.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"35 1","pages":"6687-6691"},"PeriodicalIF":0.0,"publicationDate":"2018-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91252897","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
Envelope Estimation by Tangentially Constrained Spline
Tsubasa Kusano, K. Yatabe, Yasuhiro Oikawa
{"title":"Envelope Estimation by Tangentially Constrained Spline","authors":"Tsubasa Kusano, K. Yatabe, Yasuhiro Oikawa","doi":"10.1109/ICASSP.2018.8462203","DOIUrl":"https://doi.org/10.1109/ICASSP.2018.8462203","url":null,"abstract":"Estimating envelope of a signal has various applications including empirical mode decomposition (EMD) in which the cubic $C^{2}$-spline based envelope estimation is generally used. While such functional approach can easily control smoothness of an estimated envelope, the so-called undershoot problem often occurs that violates the basic requirement of envelope. In this paper, a tangentially constrained spline with tangential points optimization is proposed for avoiding the undershoot problem while maintaining smoothness. It is defined as a quartic $C^{2}$-spline function constrained with first derivatives at tangential points that effectively avoids undershoot. The tangential points optimization method is proposed in combination with this spline to attain optimal smoothness of the estimated envelope.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"6 1","pages":"4374-4378"},"PeriodicalIF":0.0,"publicationDate":"2018-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83410798","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Automatic Music Transcription Leveraging Generalized Cepstral Features and Deep Learning
Yu-Te Wu, Berlin Chen, Li Su
{"title":"Automatic Music Transcription Leveraging Generalized Cepstral Features and Deep Learning","authors":"Yu-Te Wu, Berlin Chen, Li Su","doi":"10.1109/ICASSP.2018.8462079","DOIUrl":"https://doi.org/10.1109/ICASSP.2018.8462079","url":null,"abstract":"Spectral features are limited in modeling musical signals with multiple concurrent pitches due to the challenge to suppress the interference of the harmonic peaks from one pitch to another. In this paper, we show that using multiple features represented in both the frequency and time domains with deep learning modeling can reduce such interference. These features are derived systematically from conventional pitch detection functions that relate to one another through the discrete Fourier transform and a nonlinear scaling function. Neural networks modeled with these features outperform state-of-the-art methods while using less training data.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"6 1","pages":"401-405"},"PeriodicalIF":0.0,"publicationDate":"2018-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82007412","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
Joint Probabilistic Forecasts of Temperature and Solar Irradiance
Raksha Ramakrishna, A. Bernstein, E. Dall’Anese, A. Scaglione
{"title":"Joint Probabilistic Forecasts of Temperature and Solar Irradiance","authors":"Raksha Ramakrishna, A. Bernstein, E. Dall’Anese, A. Scaglione","doi":"10.1109/ICASSP.2018.8462496","DOIUrl":"https://doi.org/10.1109/ICASSP.2018.8462496","url":null,"abstract":"In this paper, a mathematical relationship between temperature and solar irradiance is established in order to reduce the sample space and provide joint probabilistic forecasts. These forecasts can then be used for the purpose of stochastic optimization in power systems. A Volterra system type of model is derived to characterize the dependence of temperature on solar irradiance. A dataset from NOAA weather station in California is used to validate the fit of the model. Using the model, probabilistic forecasts of both temperature and irradiance are provided and the performance of the forecasting technique highlights the efficacy of the proposed approach. Results are indicative of the fact that the underlying correlation between temperature and irradiance is well captured and will therefore be useful to produce future scenarios of temperature and irradiance while approximating the underlying sample space appropriately.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"90 1","pages":"3819-3823"},"PeriodicalIF":0.0,"publicationDate":"2018-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78432381","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Adaptive Bayesian Channel Gain Cartography
Donghoon Lee, Dimitris Berberidis, G. Giannakis
{"title":"Adaptive Bayesian Channel Gain Cartography","authors":"Donghoon Lee, Dimitris Berberidis, G. Giannakis","doi":"10.1109/ICASSP.2018.8461412","DOIUrl":"https://doi.org/10.1109/ICASSP.2018.8461412","url":null,"abstract":"Channel gain cartography relies on sensor measurements to construct maps providing the attenuation profile between arbitrary transmitter-receiver locations. Existing approaches capitalize on tomographic models, where shadowing is the weighted integral of a spatial loss field (SLF) depending on the propagation environment. Currently, the SLF is learned via regularization methods tailored to the propagation environment. However, the effectiveness of existing approaches remains unclear especially when the propagation environment involves heterogeneous characteristics. To cope with this, the present work considers a piecewise homogeneous SLF with a hidden Markov random field (MRF) model under the Bayesian framework. Efficient field estimators are obtained by using samples from Markov chain Monte Carlo (MCMC). Furthermore, an uncertainty sampling algorithm is developed to adaptively collect measurements. Real data tests demonstrate the capabilities of the novel approach.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"73 1","pages":"3554-3558"},"PeriodicalIF":0.0,"publicationDate":"2018-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74519068","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Modal Decomposition of Musical Instrument Sound Via Alternating Direction Method of Multipliers
Yoshiki Masuyama, Tsubasa Kusano, K. Yatabe, Yasuhiro Oikawa
{"title":"Modal Decomposition of Musical Instrument Sound Via Alternating Direction Method of Multipliers","authors":"Yoshiki Masuyama, Tsubasa Kusano, K. Yatabe, Yasuhiro Oikawa","doi":"10.1109/ICASSP.2018.8462350","DOIUrl":"https://doi.org/10.1109/ICASSP.2018.8462350","url":null,"abstract":"For a musical instrument sound containing partials, or modes, the behavior of modes around the attack time is particularly important. However, accurately decomposing it around the attack time is not an easy task, especially when the onset is sharp. This is because spectra of the modes are peaky while the sharp onsets need a broad one. In this paper, an optimization-based method of modal decomposition is proposed to achieve accurate decomposition around the attack time. The proposed method is formulated as a constrained optimization problem to enforce the perfect reconstruction property which is important for accurate decomposition. For optimization, the alternating direction method of multipliers (ADMM) is utilized, where the update of variables is calculated in closed form. The proposed method realizes accurate modal decomposition in the simulation and real piano sounds.","PeriodicalId":6638,"journal":{"name":"2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":"6 1","pages":"631-635"},"PeriodicalIF":0.0,"publicationDate":"2018-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89046970","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4