Workshop on Speech, Music and Mind (SMM 2018): Latest Publications

Time-frequency spectral error for analysis of high arousal speech
Workshop on Speech, Music and Mind (SMM 2018) Pub Date: 2018-09-01 DOI: 10.21437/SMM.2018-4
P. Gangamohan, S. Gangashetty, B. Yegnanarayana
Abstract: High arousal speech is produced by speakers when they raise their loudness levels. There are deviations from neutral speech, especially in the excitation component of the speech production mechanism, in the high arousal mode. In this study, a parameter called the time-frequency spectral error (TFe) is derived using the single frequency filtering (SFF) spectrogram. It is used to characterize the high arousal regions in speech signals. The proposed parameter captures the fine temporal and spectral variations due to changes in the excitation source.
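The TFe parameter itself is specific to this paper, but the SFF spectrogram it builds on can be sketched. Below is a minimal, assumption-laden version: the canonical SFF formulation shifts each analysis frequency to π and filters with a single pole at z = −r; this sketch uses the equivalent lowpass form (shift the frequency to DC, pole at z = r near the unit circle). Function and parameter names are illustrative, not the authors'.

```python
import numpy as np

def sff_envelope(x, fs, freqs, r=0.99):
    """Single frequency filtering (SFF) envelope sketch: heterodyne the
    component at each analysis frequency down to DC, then pass it through
    a one-pole filter whose pole lies close to the unit circle."""
    n = np.arange(len(x))
    env = np.empty((len(freqs), len(x)))
    for k, f in enumerate(freqs):
        # shift the component at frequency f down to DC
        s = x * np.exp(-2j * np.pi * f * n / fs)
        y = 0j
        for i, v in enumerate(s):
            y = r * y + v          # single real pole at z = r
            env[k, i] = abs(y)
    return env  # rows: analysis frequencies, cols: time samples
```

Because the pole radius is close to 1, the filter has a very narrow bandwidth, which is what gives SFF its fine joint time-frequency resolution compared with a windowed short-time Fourier transform.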
Citations: 1
A component-based approach to study the effect of Indian music on emotions
Workshop on Speech, Music and Mind (SMM 2018) Pub Date: 2018-09-01 DOI: 10.21437/SMM.2018-7
V. Viraraghavan, A. Pal, H. Murthy, R. Aravind
Abstract: The emotional impact of Indian music on human listeners has been studied mainly with respect to ragas. Although this approach aligns with the traditional and musicological views, some studies show that raga-specific effects may not be consistent. In this paper, we propose an alternative method of study based on the components of Indian Classical Music, which may be viewed as consisting of constant-pitch notes (CPNs) providing the context, and transients, the detail. One hundred concert pieces in four ragas each in Carnatic music (CM) and Hindustani music (HM) are analyzed to show that the transients are, on average, longer than CPNs. Further, the defined scale of the raga is not always mirrored in the CPNs for CM. We also draw upon the result that CPNs and transients scale non-uniformly when the tempo of CM pieces is changed. Based on these observations and previous results on the emotional impact of the major and minor scales in Western music, we propose that the effects of CPNs and transients should be analyzed separately. We present a preliminary experiment that brings out the related challenges.
Citations: 3
Analysis of Speech Emotions in Realistic Environments
Workshop on Speech, Music and Mind (SMM 2018) Pub Date: 2018-09-01 DOI: 10.21437/smm.2018-3
B. Sarma, Rohan Kumar Das, Abhishek Dey, Risto Haukioja
Abstract: The classification of emotional speech is a challenging task, and it depends critically on the correctness of labeled data. Most of the databases used for research purposes are either acted or simulated. Annotation of such an acted database is easier because the actor exaggerates the emotions. On the other hand, emotion labeling on real-world data is very difficult due to confusion among the emotion classes. Another problem in such a scenario is class imbalance, because most of the data in a realistic environment is found to be neutral. In this study, we perform emotion labeling on realistic data in a customized manner using emotion priority and confidence level. The annotated speech corpus is then used for analysis and study. The percentage distribution of different emotion classes in the real-world data and the confusions between emotions during labeling are presented.
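The abstract does not spell out how "emotion priority and confidence level" combine during labeling, so the following is only one plausible reading, with an entirely hypothetical priority order: keep annotator labels that clear a confidence threshold, take the majority, and break ties in favor of higher-priority (rarer, non-neutral) emotions.

```python
from collections import Counter

# Hypothetical priority order: rarer, stronger emotions outrank neutral so
# that scarce non-neutral labels are not drowned out by the majority class.
PRIORITY = ["angry", "sad", "happy", "neutral"]

def resolve_label(annotations, min_confidence=0.5):
    """annotations: list of (label, confidence) pairs from annotators.
    Discard labels below the confidence threshold, take the most frequent
    remaining label, and break ties using the priority order."""
    kept = [lab for lab, conf in annotations if conf >= min_confidence]
    if not kept:
        return "neutral"          # fall back to the majority class
    counts = Counter(kept)
    best = max(counts.values())
    tied = [lab for lab, c in counts.items() if c == best]
    return min(tied, key=PRIORITY.index)
```

A scheme like this makes the class-imbalance problem visible at labeling time: with most segments neutral, any non-neutral label that survives the threshold is given the benefit of the doubt.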
Citations: 1
Emotional Speech Classifier Systems: For Sensitive Assistance to support Disabled Individuals
Workshop on Speech, Music and Mind (SMM 2018) Pub Date: 2018-09-01 DOI: 10.21437/SMM.2018-2
V. V. Raju, P. Jain, K. Gurugubelli, A. Vuppala
Abstract: This paper addresses the classification of emotionally annotated speech from mentally impaired people. The main problem encountered in the classification task is class imbalance: far more speech samples are available for neutral speech than for the other emotions. Different sampling methodologies are explored at the back-end to handle this class-imbalance problem. Mel-frequency cepstral coefficient (MFCC) features are used at the front-end, while deep neural networks (DNNs) and gradient boosted decision trees (GBDTs) are investigated as back-end classifiers. Experimental results on the EmotAsS dataset show higher classification accuracy and Unweighted Average Recall (UAR) scores than the baseline system.
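The paper explores several sampling methodologies without naming them in the abstract; the simplest member of that family is plain random oversampling, sketched below under that assumption (function names are illustrative).

```python
import random

def random_oversample(features, labels, seed=0):
    """Duplicate randomly chosen samples of each minority class until every
    class has as many samples as the largest one. This is plain random
    oversampling; other strategies (undersampling, synthetic sampling)
    rebalance the classes differently."""
    rng = random.Random(seed)
    by_class = {}
    for f, lab in zip(features, labels):
        by_class.setdefault(lab, []).append(f)
    target = max(len(v) for v in by_class.values())
    out_f, out_l = [], []
    for lab, samples in by_class.items():
        out_f.extend(samples)
        out_l.extend([lab] * len(samples))
        extra = [rng.choice(samples) for _ in range(target - len(samples))]
        out_f.extend(extra)
        out_l.extend([lab] * len(extra))
    return out_f, out_l
```

Rebalancing at the back-end like this is what makes Unweighted Average Recall the natural metric: it weights each emotion class equally regardless of how many samples it originally had.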
Citations: 1
Task-Independent EEG based Subject Identification using Auditory Stimulus
Workshop on Speech, Music and Mind (SMM 2018) Pub Date: 2018-09-01 DOI: 10.21437/SMM.2018-6
D. Vinothkumar, Mari Ganesh Kumar, Abhishek Kumar, H. Gupta, S. SaranyaM, M. Sur, H. Murthy
Abstract: Recent studies have shown that task-specific electroencephalography (EEG) can be used as a reliable biometric. This paper extends this line of work to task-independent EEG with auditory stimuli. Data collected from 40 subjects in response to various types of audio stimuli, using a 128-channel EEG system, is presented to different classifiers, namely k-nearest neighbor (k-NN), artificial neural network (ANN), and universal background model–Gaussian mixture model (UBM-GMM). It is observed that k-NN and ANN perform well when testing is performed intra-session, while the UBM-GMM framework is more robust when testing is performed inter-session. This can be attributed to the fact that the correspondence of sensor locations across sessions is only approximate. It is also observed that EEG from the parietal and temporal regions contains more subject information, although the performance using all 128 channels is marginally better.
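The GMM side of such a system can be sketched with scikit-learn. Note the simplification: a full UBM-GMM system trains one shared background model and MAP-adapts it per subject, whereas this sketch fits an independent GMM per subject and identifies a test segment by the highest average log-likelihood. Function names and sizes are illustrative.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def enroll(subject_features, n_components=2, seed=0):
    """Fit one GMM per subject on that subject's EEG feature vectors.
    (A full UBM-GMM system would MAP-adapt a shared background model
    instead; independent per-subject GMMs are a simplified stand-in.)"""
    models = {}
    for subject, feats in subject_features.items():
        gmm = GaussianMixture(n_components=n_components, random_state=seed)
        models[subject] = gmm.fit(feats)
    return models

def identify(models, feats):
    """Return the enrolled subject whose model assigns the test segment
    the highest average per-frame log-likelihood."""
    return max(models, key=lambda s: models[s].score(feats))
```

Scoring whole segments rather than single frames is what gives the likelihood-based framework its robustness to the approximate inter-session correspondence of electrode positions noted in the abstract.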
Citations: 9
Discriminating between High-Arousal and Low-Arousal Emotional States of Mind using Acoustic Analysis
Workshop on Speech, Music and Mind (SMM 2018) Pub Date: 2018-09-01 DOI: 10.21437/SMM.2018-1
Esther Ramdinmawii, V. K. Mittal
Abstract: Identification of emotions from human speech can be attempted by focusing on three aspects of emotional speech: valence, arousal, and dominance. In this paper, changes in the production characteristics of emotional speech are examined to discriminate between high-arousal and low-arousal emotions, and among emotions within each of these categories. The basic emotions anger, happiness, and fear are examined in the high-arousal category, and neutral speech and sadness in the low-arousal category. Discriminating changes are examined first in the excitation source characteristics, i.e., the instantaneous fundamental frequency (F0) derived using the zero-frequency filtering (ZFF) method. Differences observed in the spectrograms are then validated by examining changes in the combined characteristics of the source and the vocal tract filter, i.e., the strength of excitation (SoE), derived using the ZFF method, and signal energy features. Emotions within each category are distinguished by examining changes in two scarcely explored discriminating features, namely the zero-crossing rate and the ratios among spectral sub-band energies computed using the short-time Fourier transform. The effectiveness of these features in discriminating emotions is validated on two emotion databases, Berlin EMO-DB (German) and IIT-KGP-SESC (Telugu). The proposed features show highly encouraging results in discriminating these emotions. This study can be helpful towards automatic classification of emotions from speech.
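The two features highlighted above are straightforward to compute per frame. The sketch below shows standard definitions; the band edges are illustrative, not the paper's exact choice.

```python
import numpy as np

def zero_crossing_rate(frame):
    """Fraction of adjacent-sample pairs whose signs differ."""
    return float(np.mean(np.signbit(frame[:-1]) != np.signbit(frame[1:])))

def subband_energy_ratios(frame, fs, edges=(0, 500, 1000, 2000, 4000)):
    """Energy in each frequency band of the short-time Fourier magnitude
    spectrum, as a fraction of total energy. Band edges in Hz are
    illustrative placeholders for the paper's sub-band choice."""
    spec = np.abs(np.fft.rfft(frame * np.hanning(len(frame)))) ** 2
    freqs = np.fft.rfftfreq(len(frame), 1.0 / fs)
    bands = [spec[(freqs >= lo) & (freqs < hi)].sum()
             for lo, hi in zip(edges[:-1], edges[1:])]
    total = sum(bands) or 1.0
    return [b / total for b in bands]
```

Intuitively, high-arousal speech concentrates more energy in higher sub-bands and shows a higher zero-crossing rate than low-arousal speech, which is why ratios between sub-band energies can separate emotions within a category.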
Citations: 2
CNN+LSTM Architecture for Speech Emotion Recognition with Data Augmentation
Workshop on Speech, Music and Mind (SMM 2018) Pub Date: 2018-02-15 DOI: 10.21437/SMM.2018-5
Caroline Etienne, Guillaume Fidanza, Andrei Petrovskii, L. Devillers, B. Schmauch
Abstract: In this work we design a neural network for recognizing emotions in speech, using the IEMOCAP dataset. Following the latest advances in audio analysis, we use an architecture involving both convolutional layers, for extracting high-level features from raw spectrograms, and recurrent ones, for aggregating long-term dependencies. We examine the techniques of data augmentation with vocal tract length perturbation, layer-wise optimizer adjustment, and batch normalization of recurrent layers, and obtain highly competitive results of 64.5% weighted accuracy and 61.7% unweighted accuracy on four emotions.
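The convolutional-then-recurrent pattern the abstract describes can be sketched in PyTorch. All layer sizes below are illustrative, not the paper's configuration: a small convolutional front-end pools the frequency axis of the spectrogram, an LSTM runs over the time axis, and a linear layer maps the final hidden state to four emotion logits.

```python
import torch
import torch.nn as nn

class CnnLstm(nn.Module):
    """Convolutional front-end over (freq, time) spectrograms, followed by
    an LSTM over the time axis and a linear classifier. Layer sizes are
    illustrative, not those of the paper."""
    def __init__(self, n_mels=64, n_classes=4, hidden=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d((2, 1)),   # pool frequency, keep time resolution
        )
        self.lstm = nn.LSTM(16 * (n_mels // 2), hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_classes)

    def forward(self, spec):                    # spec: (batch, 1, mels, time)
        h = self.conv(spec)                     # (b, 16, mels // 2, time)
        b, c, f, t = h.shape
        h = h.permute(0, 3, 1, 2).reshape(b, t, c * f)  # time-major sequence
        h, _ = self.lstm(h)
        return self.out(h[:, -1])               # last time step -> logits
```

Pooling only along frequency in the convolutional stage is a common choice in this pattern: it keeps the time resolution intact for the recurrent layer that models long-term dependencies.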
Citations: 78