Significance of Word and Syllable Level Information for Expressive Speech Processing

K. S. Rao, S. Prasanna, T. V. Sagar
DOI: 10.1109/ICAPR.2009.47 (https://doi.org/10.1109/ICAPR.2009.47)
Published in: 2009 Seventh International Conference on Advances in Pattern Recognition
Publication date: 2009-02-04
Citations: 1

Abstract

Human beings convey crucial information through expressions (emotions) in speech, facial movements, and gestures. Expressions in speech can mostly be attributed to longer segments, i.e., to suprasegmental features, also known as prosodic features. In this paper we analyze the expressions in speech using prosodic features at the utterance level, word level, and syllable level. The emotions considered for the analysis are anger, compassion, happiness, and neutral. The prosodic features used in the analysis are duration, intonation (pitch), and energy. The analysis is performed on the SUSE (Speech Under Simulated Emotion) database. The results of the analysis are used for synthesizing the expressions in neutral speech. The synthesis experiments, moving from utterance-level to syllable-level features, showed a steady improvement in the quality of speech for the desired expressions.
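
As a rough illustration of what measuring duration, pitch, and energy over a segment can look like, here is a minimal Python sketch. It is not the authors' method: it assumes frame-level pitch and energy tracks are already available (with 0 marking unvoiced frames), that segment boundaries are known in seconds, and that the frame shift is 10 ms. The names `Segment` and `prosodic_features` are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Segment:
    """A labelled time span; it can stand for an utterance, a word, or a syllable."""
    label: str
    start: float  # seconds
    end: float    # seconds


def prosodic_features(segment, pitch_hz, energy, frame_shift=0.01):
    """Return duration, mean pitch over voiced frames, and mean energy for one segment.

    Assumptions (not from the paper): pitch_hz and energy are per-frame lists
    sampled every frame_shift seconds, and a pitch value of 0 marks an unvoiced frame.
    """
    first = int(round(segment.start / frame_shift))
    last = int(round(segment.end / frame_shift))
    voiced = [f0 for f0 in pitch_hz[first:last] if f0 > 0]
    frames = energy[first:last]
    return {
        "label": segment.label,
        "duration_s": segment.end - segment.start,
        "mean_pitch_hz": mean(voiced) if voiced else 0.0,
        "mean_energy": mean(frames) if frames else 0.0,
    }
```

The same function applies at every level of analysis: passing one segment that spans the whole utterance gives utterance-level values, while passing word or syllable boundaries gives the finer-grained values the abstract refers to.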