Prosody based co-analysis for continuous recognition of coverbal gestures

Proceedings. Fourth IEEE International Conference on Multimodal Interfaces. Pub Date: 2002-10-14. DOI: 10.1109/ICMI.2002.1166986

S. Kettebekov, M. Yeasin, Rajeev Sharma


Citations: 34

Abstract

Although recognition of natural speech and gestures has been studied extensively, previous attempts at combining them in a unified framework to boost classification were mostly semantically motivated, e.g., keyword-gesture co-occurrence. Such formulations inherit the complexity of natural language processing. This paper presents a Bayesian formulation that uses a phenomenon of gesture and speech articulation to improve the accuracy of automatic recognition of continuous coverbal gestures. The prosodic features from the speech signal were co-analyzed with the visual signal to learn the prior probability of co-occurrence of the prominent spoken segments with the particular kinematical phases of gestures. It was found that this co-analysis helps in detecting and disambiguating small hand movements, which subsequently improves the rate of continuous gesture recognition. The efficacy of the proposed approach was demonstrated on a large database collected from the Weather Channel broadcast. This formulation opens new avenues for bottom-up frameworks of multimodal integration.
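The abstract does not give the formulation itself, so the following is only a minimal sketch of how a learned prosody/gesture-phase co-occurrence probability could re-weight an ambiguous visual score through Bayes' rule. Everything in it is an assumption for illustration: the phase labels in `PHASES`, the function names `estimate_cooccurrence` and `posterior`, the binary "prominent" prosody feature, the conditional-independence assumption between visual and prosodic evidence, and the toy numbers. It is not the authors' model or data.

```python
from collections import Counter

# Hypothetical gesture-phase inventory; the abstract does not list the phases used.
PHASES = ("preparation", "stroke", "retraction")


def estimate_cooccurrence(pairs, smoothing=1.0):
    """Estimate P(prominent speech | phase) from (phase, is_prominent) pairs.

    Uses add-one (Laplace) smoothing over the two outcomes
    prominent / not prominent. This is an assumed estimator.
    """
    totals = Counter()
    hits = Counter()
    for phase, prominent in pairs:
        totals[phase] += 1
        if prominent:
            hits[phase] += 1
    return {
        p: (hits[p] + smoothing) / (totals[p] + 2.0 * smoothing)
        for p in PHASES
    }


def posterior(visual_likelihood, phase_prior, p_prominent, prominent):
    """Return P(phase | visual, prosody) for each phase.

    Computed as P(visual | phase) * P(prosody | phase) * P(phase),
    normalised over phases, assuming the visual and prosodic
    observations are conditionally independent given the phase.
    """
    scores = {}
    for p in PHASES:
        prosody = p_prominent[p] if prominent else 1.0 - p_prominent[p]
        scores[p] = visual_likelihood[p] * prosody * phase_prior[p]
    z = sum(scores.values())
    if z == 0.0:
        raise ValueError("all phases have zero score")
    return {p: s / z for p, s in scores.items()}


# Toy training pairs (invented): prominent speech mostly coincides with strokes.
training = (
    [("stroke", True)] * 8 + [("stroke", False)] * 2
    + [("preparation", True)] * 2 + [("preparation", False)] * 8
    + [("retraction", True)] * 1 + [("retraction", False)] * 9
)
cooccurrence = estimate_cooccurrence(training)

# A small hand movement that the visual model cannot separate
# into preparation versus stroke (invented likelihoods).
visual = {"preparation": 0.4, "stroke": 0.4, "retraction": 0.2}
uniform = {p: 1.0 / len(PHASES) for p in PHASES}

with_prominence = posterior(visual, uniform, cooccurrence, prominent=True)
without_prominence = posterior(visual, uniform, cooccurrence, prominent=False)
print("prominent segment:", with_prominence)
print("no prominent segment:", without_prominence)
```

In this toy setting the visual evidence alone ties preparation and stroke; adding the prosodic observation breaks the tie in one direction or the other, which is the kind of disambiguation of small hand movements the abstract describes.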

