Latest Articles: Eurasip Journal on Audio Speech and Music Processing

Articulation constrained learning with application to speech emotion recognition.
IF 2.4, CAS Tier 3, Computer Science
Eurasip Journal on Audio Speech and Music Processing Pub Date : 2019-01-01 Epub Date: 2019-08-20 DOI: 10.1186/s13636-019-0157-9
Mohit Shah, Ming Tu, Visar Berisha, Chaitali Chakrabarti, Andreas Spanias
{"title":"Articulation constrained learning with application to speech emotion recognition.","authors":"Mohit Shah,&nbsp;Ming Tu,&nbsp;Visar Berisha,&nbsp;Chaitali Chakrabarti,&nbsp;Andreas Spanias","doi":"10.1186/s13636-019-0157-9","DOIUrl":"https://doi.org/10.1186/s13636-019-0157-9","url":null,"abstract":"<p><p>Speech emotion recognition methods combining articulatory information with acoustic features have been previously shown to improve recognition performance. Collection of articulatory data on a large scale may not be feasible in many scenarios, thus restricting the scope and applicability of such methods. In this paper, a discriminative learning method for emotion recognition using both articulatory and acoustic information is proposed. A traditional <i>ℓ</i> <sub>1</sub>-regularized logistic regression cost function is extended to include additional constraints that enforce the model to reconstruct articulatory data. This leads to sparse and interpretable representations jointly optimized for both tasks simultaneously. Furthermore, the model only requires articulatory features during training; only speech features are required for inference on out-of-sample data. Experiments are conducted to evaluate emotion recognition performance over vowels <i>/AA/,/AE/,/IY/,/UW/</i> and complete utterances. Incorporating articulatory information is shown to significantly improve the performance for valence-based classification. Results obtained for within-corpus and cross-corpus categorical emotion recognition indicate that the proposed method is more effective at distinguishing happiness from other emotions.</p>","PeriodicalId":49202,"journal":{"name":"Eurasip Journal on Audio Speech and Music Processing","volume":"2019 ","pages":""},"PeriodicalIF":2.4,"publicationDate":"2019-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/s13636-019-0157-9","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"37471483","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
From raw audio to a seamless mix: creating an automated DJ system for Drum and Bass
IF 2.4, CAS Tier 3, Computer Science
Eurasip Journal on Audio Speech and Music Processing Pub Date : 2018-09-24 DOI: 10.1186/s13636-018-0134-8
Len Vande Veire, Tijl De Bie
{"title":"From raw audio to a seamless mix: creating an automated DJ system for Drum and Bass","authors":"Len Vande Veire, Tijl De Bie","doi":"10.1186/s13636-018-0134-8","DOIUrl":"https://doi.org/10.1186/s13636-018-0134-8","url":null,"abstract":"","PeriodicalId":49202,"journal":{"name":"Eurasip Journal on Audio Speech and Music Processing","volume":"98 1","pages":""},"PeriodicalIF":2.4,"publicationDate":"2018-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73628910","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Biomimetic spectro-temporal features for music instrument recognition in isolated notes and solo phrases.
IF 2.4, CAS Tier 3, Computer Science
Eurasip Journal on Audio Speech and Music Processing Pub Date : 2015-01-01 DOI: 10.1186/s13636-015-0070-9
Kailash Patil, Mounya Elhilali
{"title":"Biomimetic spectro-temporal features for music instrument recognition in isolated notes and solo phrases.","authors":"Kailash Patil,&nbsp;Mounya Elhilali","doi":"10.1186/s13636-015-0070-9","DOIUrl":"https://doi.org/10.1186/s13636-015-0070-9","url":null,"abstract":"<p><p>The identity of musical instruments is reflected in the acoustic attributes of musical notes played with them. Recently, it has been argued that these characteristics of musical identity (or timbre) can be best captured through an analysis that encompasses both time and frequency domains; with a focus on the modulations or changes in the signal in the spectrotemporal space. This representation mimics the spectrotemporal receptive field (STRF) analysis believed to underlie processing in the central mammalian auditory system, particularly at the level of primary auditory cortex. How well does this STRF representation capture timbral identity of musical instruments in continuous solo recordings remains unclear. The current work investigates the applicability of the STRF feature space for instrument recognition in solo musical phrases and explores best approaches to leveraging knowledge from isolated musical notes for instrument recognition in solo recordings. The study presents an approach for parsing solo performances into their individual note constituents and adapting back-end classifiers using support vector machines to achieve a generalization of instrument recognition to off-the-shelf, commercially available solo music.</p>","PeriodicalId":49202,"journal":{"name":"Eurasip Journal on Audio Speech and Music Processing","volume":"2015 ","pages":""},"PeriodicalIF":2.4,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/s13636-015-0070-9","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"36776486","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 18
Biomimetic multi-resolution analysis for robust speaker recognition.
IF 2.4, CAS Tier 3, Computer Science
Eurasip Journal on Audio Speech and Music Processing Pub Date : 2012-01-01 Epub Date: 2012-09-07 DOI: 10.1186/1687-4722-2012-22
Sridhar Krishna Nemala, Dmitry N Zotkin, Ramani Duraiswami, Mounya Elhilali
{"title":"Biomimetic multi-resolution analysis for robust speaker recognition.","authors":"Sridhar Krishna Nemala,&nbsp;Dmitry N Zotkin,&nbsp;Ramani Duraiswami,&nbsp;Mounya Elhilali","doi":"10.1186/1687-4722-2012-22","DOIUrl":"https://doi.org/10.1186/1687-4722-2012-22","url":null,"abstract":"<p><p>Humans exhibit a remarkable ability to reliably classify sound sources in the environment even in presence of high levels of noise. In contrast, most engineering systems suffer a drastic drop in performance when speech signals are corrupted with channel or background distortions. Our brains are equipped with elaborate machinery for speech analysis and feature extraction, which hold great lessons for improving the performance of automatic speech processing systems under adverse conditions. The work presented here explores a biologically-motivated multi-resolution speaker information representation obtained by performing an intricate yet computationally-efficient analysis of the information-rich spectro-temporal attributes of the speech signal. We evaluate the proposed features in a speaker verification task performed on NIST SRE 2010 data. The biomimetic approach yields significant robustness in presence of non-stationary noise and reverberation, offering a new framework for deriving reliable features for speaker recognition and speech processing.</p>","PeriodicalId":49202,"journal":{"name":"Eurasip Journal on Audio Speech and Music Processing","volume":"2012 ","pages":""},"PeriodicalIF":2.4,"publicationDate":"2012-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/1687-4722-2012-22","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"36781151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5