Towards realizing gesture-to-speech conversion with a HMM-based bilingual speech synthesis system

Hongwu Yang, Xiaochun An, Dong Pei, Yitong Liu
DOI: 10.1109/ICOT.2014.6956608 (https://doi.org/10.1109/ICOT.2014.6956608)
Published in: 2014 International Conference on Orange Technologies, 2014-11-20
Citations: 9

Abstract

This paper presents a gesture-to-speech conversion system to address the communication problem between healthy people and people with speech disorders. An improved speeded-up robust features (SURF) algorithm, combined with a Kinect sensor, is adopted for static gesture recognition. Meanwhile, a hidden Markov model (HMM)-based Mandarin-Tibetan bilingual speech synthesis system is developed using speaker adaptive training. A set of semantic rules is designed for the static gestures, and Chinese or Tibetan context-dependent labels for the recognized gestures are generated according to these rules. The recognized gestures are finally converted to Mandarin or Tibetan speech by the bilingual speech synthesis system using the context-dependent labels. Tests show that the system achieves a static gesture recognition rate of 97.1%. Subjective evaluation shows that the synthesized speech achieves a mean opinion score (MOS) of 4.0.
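The rule-based step described in the abstract (recognized static gesture, then semantic rule, then text in the chosen language for label generation) can be pictured with a minimal sketch. Everything named here is hypothetical and for illustration only: the gesture identifiers, the placeholder sentences, the `SEMANTIC_RULES` table, and the `gesture_to_request` function are assumptions, not the authors' rule set or implementation. Gesture recognition, context-dependent label generation, and HMM synthesis are outside the sketch.

```python
from dataclasses import dataclass

# Languages the bilingual synthesizer in the abstract supports.
SUPPORTED_LANGUAGES = ("mandarin", "tibetan")

# Hypothetical semantic rules: each recognized static gesture maps to
# one sentence per language. The IDs and sentences are placeholders,
# not the rules designed in the paper.
SEMANTIC_RULES = {
    "gesture_A": {
        "mandarin": "<Mandarin sentence A>",
        "tibetan": "<Tibetan sentence A>",
    },
    "gesture_B": {
        "mandarin": "<Mandarin sentence B>",
        "tibetan": "<Tibetan sentence B>",
    },
}


@dataclass(frozen=True)
class SynthesisRequest:
    """Text and language that a label generator would consume next."""
    language: str
    text: str


def gesture_to_request(gesture_id: str, language: str) -> SynthesisRequest:
    """Look up the sentence for a recognized gesture in the chosen language."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    if gesture_id not in SEMANTIC_RULES:
        raise KeyError(f"no semantic rule for gesture {gesture_id!r}")
    return SynthesisRequest(language, SEMANTIC_RULES[gesture_id][language])


if __name__ == "__main__":
    request = gesture_to_request("gesture_A", "tibetan")
    print(request.language, request.text)
```

A lookup table keeps the mapping explicit and easy to audit, which suits a small closed set of static gestures; the output of this step would then be passed to whatever front end produces the context-dependent labels.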