{"title":"The Relation of Eye Gaze and Face Pose: Potential Impact on Speech Recognition","authors":"M. Slaney, A. Stolcke, Dilek Z. Hakkani-Tür","doi":"10.1145/2663204.2663251","DOIUrl":null,"url":null,"abstract":"We are interested in using context to improve speech recognition and speech understanding. Knowing what the user is attending to visually helps us predict their utterances and thus makes speech recognition easier. Eye gaze is one way to access this signal, but is often unavailable (or expensive to gather) at longer distances. In this paper we look at joint eye-gaze and facial-pose information while users perform a speech reading task. We hypothesize, and verify experimentally, that the eyes lead, and then the face follows. Face pose might not be as fast, or as accurate a signal of visual attention as eye gaze, but based on experiments correlating eye gaze with speech recognition, we conclude that face pose provides useful information to bias a recognizer toward higher accuracy.","PeriodicalId":389037,"journal":{"name":"Proceedings of the 16th International Conference on Multimodal Interaction","volume":"58 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 16th International Conference on Multimodal Interaction","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2663204.2663251","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 15
Abstract
We are interested in using context to improve speech recognition and speech understanding. Knowing what the user is attending to visually helps us predict their utterances and thus makes speech recognition easier. Eye gaze is one way to access this signal, but it is often unavailable (or expensive to gather) at longer distances. In this paper we look at joint eye-gaze and face-pose information while users perform a speech reading task. We hypothesize, and verify experimentally, that the eyes lead and the face follows. Face pose may not be as fast or as accurate a signal of visual attention as eye gaze, but based on experiments correlating eye gaze with speech recognition, we conclude that face pose provides useful information for biasing a recognizer toward higher accuracy.
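The paper does not publish its analysis code, but the "eyes lead, face follows" claim can be pictured as a lead/lag relationship between two synchronized time series. Below is a minimal, hypothetical sketch of how such a lag could be estimated by cross-correlating gaze and face-pose yaw traces. The function and variable names (estimate_lag, eye_yaw, face_yaw), the 30 fps sampling rate, and the synthetic 200 ms delay are illustrative assumptions, not values or methods taken from the paper.

```python
import numpy as np

def estimate_lag(eye_yaw, face_yaw, fps=30.0, max_lag_s=1.0):
    """Estimate how far face pose lags eye gaze (in seconds) by finding the
    frame shift that maximizes the cross-correlation of the two yaw signals.
    A positive result means the eyes lead and the face follows.
    (Hypothetical sketch; not the paper's actual analysis.)"""
    # Normalize both signals so the correlation is scale-invariant.
    eye = (eye_yaw - eye_yaw.mean()) / (eye_yaw.std() + 1e-9)
    face = (face_yaw - face_yaw.mean()) / (face_yaw.std() + 1e-9)

    max_lag = int(max_lag_s * fps)
    lags = np.arange(-max_lag, max_lag + 1)
    corrs = []
    for lag in lags:
        if lag >= 0:
            # Compare eye[t] with face[t + lag]: positive lag = face trails the eyes.
            a, b = eye[:len(eye) - lag], face[lag:]
        else:
            a, b = eye[-lag:], face[:len(face) + lag]
        corrs.append(np.dot(a, b) / len(a))

    best_lag = lags[int(np.argmax(corrs))]
    return best_lag / fps

# Synthetic demo: the face trace is the eye trace delayed by ~200 ms plus noise.
rng = np.random.default_rng(0)
t = np.arange(0, 30, 1 / 30.0)             # 30 s of 30 fps data
eye_yaw = np.sin(2 * np.pi * 0.2 * t)      # slow horizontal scanning
delay_frames = 6                           # 6 frames = 200 ms at 30 fps
face_yaw = np.roll(eye_yaw, delay_frames) + 0.05 * rng.standard_normal(len(t))

print(f"estimated lag: {estimate_lag(eye_yaw, face_yaw):.3f} s")  # ~0.200 s
```

On real recordings the same procedure would be run per subject and per fixation episode; a consistently positive lag would support the hypothesis that gaze shifts precede head turns, which in turn motivates using the coarser but more readily available face-pose signal as a proxy for visual attention when biasing the recognizer.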