Speaker dependent visual word recognition by using sequential mouth shape codes

Takuro Tasaka, N. Hamada
Published in: 2012 International Symposium on Intelligent Signal Processing and Communications Systems, 2012-11-01
DOI: 10.1109/ISPACS.2012.6473460 (https://doi.org/10.1109/ISPACS.2012.6473460)
Citations: 2

Abstract

Visual speech recognition, or lip reading, makes speech recognition robust to noise by adding the speaker's visual cues to the audio information. Visual-only speech recognition is also applicable to speaker verification and to multimedia interfaces that support people with speech impairments. The sequential mouth-shape code method is an effective approach to lip reading, particularly for uttered Japanese words; it uses two kinds of distinctive mouth shapes, known as the first and last mouth shapes, which appear intermittently. One advantage of this method is the low computational burden of its learning and word registration processes. This paper proposes a novel word lip-reading system that detects and determines initial mouth-shape codes in order to recognize the uttered consonants. The proposed method can thereby discriminate between different words that share the same sequence of vowel codes but contain different consonant codes. Experiments demonstrate that the proposed system achieves a higher recognition rate than conventional ones.
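
The discrimination claim in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the code symbols ("A", "I", "E" for vowel shapes, "X" for closed lips, "-" for no distinctive initial shape), the three registered words, and the exact-match lookup are all assumptions made for illustration. It only shows why adding an initial mouth-shape code can separate words whose vowel code sequences are identical.

```python
from typing import Dict, List, Tuple

# One syllable is coded as (initial_code, final_code).
# All symbols below are hypothetical, not the paper's actual code set.
Code = Tuple[str, str]

# Toy word registry (assumed example words and codes).
REGISTRY: Dict[str, List[Code]] = {
    "kaki": [("-", "A"), ("-", "I")],  # no distinctive initial shape
    "maki": [("X", "A"), ("-", "I")],  # lips closed before the first vowel
    "sake": [("-", "A"), ("-", "E")],
}


def vowel_only(seq: List[Code]) -> List[str]:
    """Drop the initial codes and keep only the final (vowel) codes."""
    return [final for _, final in seq]


def candidates(observed: List[Code], use_initial: bool) -> List[str]:
    """Return registered words whose code sequence matches the observation.

    With use_initial=False only vowel codes are compared; with
    use_initial=True the (initial, final) pairs must match exactly.
    """
    matches = []
    for word, seq in REGISTRY.items():
        if use_initial:
            same = seq == observed
        else:
            same = vowel_only(seq) == vowel_only(observed)
        if same:
            matches.append(word)
    return sorted(matches)


observed = [("X", "A"), ("-", "I")]  # assumed observation of "maki"
print(candidates(observed, use_initial=False))
print(candidates(observed, use_initial=True))
```

In this toy registry, vowel-only matching cannot tell "kaki" from "maki", while matching on the paired codes leaves a single candidate. A real system would need tolerant matching rather than exact equality, since detected codes are noisy.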