Proceedings. Fourth IEEE International Conference on Multimodal Interfaces: Latest Publications

Gesture patterns during speech repairs
Proceedings. Fourth IEEE International Conference on Multimodal Interfaces · Pub Date: 2002-10-14 · DOI: 10.1109/ICMI.2002.1166985
L. Chen, M. Harper, Francis K. H. Quek
Abstract: Speech and gesture are two primary modes used in natural human communication; hence, they are important inputs for a multimodal interface to process. One of the challenges for multimodal interfaces is to accurately recognize the words in spontaneous speech. This is partly due to the presence of speech repairs, which seriously degrade the accuracy of current speech recognition systems. Based on the assumption that speech and gesture arise from the same thought process, we would expect to find patterns of gesture that co-occur with speech repairs and that can be exploited by a multimodal processing system to process spontaneous speech more effectively. To evaluate this hypothesis, we conducted a measurement study of gesture and speech repair data extracted from videotapes of natural dialogs. Although gestures do not always co-occur with speech repairs, we observed that modification gesture patterns correlate highly with content-replacement speech repairs but rarely occur with content repetitions. These results suggest that gesture patterns can help classify different types of speech repairs in order to correct them more accurately.
Citations: 25
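The reported correlation can be illustrated with a simple co-occurrence tally between gesture-pattern labels and repair-type labels. The sketch below is hypothetical: the labels and the event list are invented for illustration and are not the paper's data.

```python
from collections import Counter

# Hypothetical annotations: each dialog event pairs a gesture pattern label
# with a speech-repair label (or None when no repair co-occurs).
events = [
    ("modification", "content_replacement"),
    ("modification", "content_replacement"),
    ("hold", "content_repetition"),
    ("modification", None),
    ("beat", "content_repetition"),
]

# Count how often each gesture pattern co-occurs with each repair type.
pairs = Counter((g, r) for g, r in events if r is not None)
gesture_totals = Counter(g for g, _ in events)

for (gesture, repair), n in sorted(pairs.items()):
    rate = n / gesture_totals[gesture]
    print(f"{gesture} -> {repair}: {n} co-occurrences ({rate:.0%} of {gesture} gestures)")
```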
The role of gesture in multimodal referring actions
Proceedings. Fourth IEEE International Conference on Multimodal Interfaces · Pub Date: 2002-10-14 · DOI: 10.1109/ICMI.2002.1166988
Frédéric Landragin
Abstract: When deictic gestures are produced on a touch screen, they can take forms that lead to several sorts of ambiguity. Considering that the resolution of a multimodal reference requires identifying the referents and the context ("reference domain") from which these referents are extracted, we focus on the linguistic, gestural, and visual clues that a dialogue system may exploit to comprehend the referring intention. We explore the links between words, gestures, and perceptual groups in terms of the clues that delimit the reference domain. We also show the importance of taking the domain into account for dialogue management, particularly for the comprehension of further utterances when they seem to implicitly reuse a pre-existing restriction to a subset of objects. We propose a strategy of multimodal reference resolution based on this notion of reference domain, and we illustrate its efficiency with prototypical examples built from a study of significant referring situations extracted from a corpus. We also present future directions of our work concerning some linguistic and task aspects that are not integrated here.
Citations: 6
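A minimal sketch of the resolution strategy as the abstract describes it: intersect the area demonstrated by the gesture with the category constraint from the words, and keep the resulting domain so later utterances can be interpreted against it. All data structures and names here are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Obj:
    ident: str
    category: str
    x: float
    y: float

def resolve(utterance_category, gesture_center, radius, context):
    """Hypothetical reference-domain resolution: intersect the gestural
    demonstration area with the linguistic category constraint."""
    domain = [o for o in context
              if (o.x - gesture_center[0]) ** 2 + (o.y - gesture_center[1]) ** 2 <= radius ** 2]
    referents = [o for o in domain if o.category == utterance_category]
    # The delimited domain is returned so later utterances ("the other one")
    # can be interpreted against it rather than against the whole scene.
    return referents, domain

scene = [Obj("t1", "triangle", 1, 1), Obj("t2", "triangle", 2, 1), Obj("c1", "circle", 8, 8)]
referents, domain = resolve("triangle", (1.5, 1.0), 2.0, scene)
print([o.ident for o in referents])   # ['t1', 't2']
```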
Active gaze tracking for human-robot interaction
Proceedings. Fourth IEEE International Conference on Multimodal Interfaces · Pub Date: 2002-10-14 · DOI: 10.1109/ICMI.2002.1167004
Rowel Atienza, A. Zelinsky
Abstract: In our effort to make human-robot interfaces more user-friendly, we built an active gaze tracking system that can measure a person's gaze direction in real time. Gaze normally indicates which object in the surroundings a person is interested in; it can therefore be used as a medium for human-robot interaction, for example instructing a robot arm to pick up the object a user is looking at. We discuss how we developed and integrated algorithms for zoom camera calibration, low-level control of an active head, and face and gaze tracking to create an active gaze tracking system.
Citations: 39
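The abstract's use case, instructing a robot to act on the object a user is looking at, reduces at the final step to selecting the object nearest in angle to the measured gaze ray. Below is a minimal sketch of that selection step, assuming the tracker already supplies a gaze origin and direction; the geometry and object names are invented.

```python
import numpy as np

def select_object(gaze_origin, gaze_dir, objects):
    """Pick the object whose direction from the eye deviates least
    (in angle) from the measured gaze ray."""
    gaze_dir = gaze_dir / np.linalg.norm(gaze_dir)
    best, best_angle = None, np.inf
    for name, pos in objects.items():
        to_obj = np.asarray(pos, dtype=float) - gaze_origin
        to_obj /= np.linalg.norm(to_obj)
        angle = np.arccos(np.clip(np.dot(gaze_dir, to_obj), -1.0, 1.0))
        if angle < best_angle:
            best, best_angle = name, angle
    return best, np.degrees(best_angle)

# Hypothetical scene: positions in meters relative to the user's eye.
objects = {"cup": (0.4, 0.1, 1.0), "book": (-0.3, 0.0, 0.8)}
print(select_object(np.zeros(3), np.array([0.35, 0.1, 1.0]), objects))
```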
Designing transition networks for multimodal VR-interactions using a markup language
Proceedings. Fourth IEEE International Conference on Multimodal Interfaces · Pub Date: 2002-10-14 · DOI: 10.1109/ICMI.2002.1167030
Marc Erich Latoschik
Abstract: This article presents one core component for enabling multimodal speech- and gesture-driven interaction in and for virtual environments. A so-called temporal Augmented Transition Network (tATN) is introduced. It integrates and evaluates information from speech, gesture, and a given application context using a combined syntactic/semantic parsing approach. This tATN represents the target structure for a multimodal integration markup language (MIML). MIML centers on the specification of multimodal interactions by letting an application designer declare temporal and semantic relations between given input utterance percepts and certain application states in a declarative and portable manner. A subsequent parse pass translates MIML into corresponding tATNs, which are directly loaded and executed by a simulation engine's scripting facility.
Citations: 47
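As a rough illustration of what a temporal transition network for multimodal integration might look like, the sketch below guards each transition with a predicate over input percepts and a maximum delay since the last state change. The structure and names are hypothetical; the paper's actual tATN and MIML formats are not reproduced here.

```python
import time

class TATN:
    """Minimal temporal ATN sketch: a transition fires only if its guard
    accepts the percept AND the percept arrives within the allowed time
    window after the previous state change."""
    def __init__(self, start, transitions):
        self.state = start
        self.entered = time.time()
        self.transitions = transitions  # {state: [(guard, max_delay_s, next_state)]}

    def feed(self, percept):
        for guard, max_delay, nxt in self.transitions.get(self.state, []):
            if guard(percept) and time.time() - self.entered <= max_delay:
                self.state, self.entered = nxt, time.time()
                return True
        return False

# Hypothetical "put <pointing>" interaction: speech must be followed by a
# pointing gesture within 1.5 seconds.
net = TATN("idle", {
    "idle":            [(lambda p: p == ("speech", "put"),  2.0, "awaiting_deixis")],
    "awaiting_deixis": [(lambda p: p[0] == "pointing",      1.5, "object_selected")],
})
net.feed(("speech", "put"))
net.feed(("pointing", "obj42"))
print(net.state)  # object_selected
```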
State sharing in a hybrid neuro-Markovian on-line handwriting recognition system through a simple hierarchical clustering algorithm
Proceedings. Fourth IEEE International Conference on Multimodal Interfaces · Pub Date: 2002-10-14 · DOI: 10.1109/ICMI.2002.1166993
Haifeng Li, T. Artières, P. Gallinari
Abstract: HMMs have been applied in many fields with great success. To achieve better performance, an easy approach is to use more states or more free parameters for better signal modelling. State sharing and state clipping methods have therefore been proposed to reduce parameter redundancy and to limit the explosive consumption of system resources. We focus on a simple state sharing method for a hybrid neuro-Markovian on-line handwriting recognition system. First, a likelihood-based distance is proposed for measuring the similarity between two HMM state models. Then, a hierarchical clustering algorithm aimed at minimum quantization error is proposed to select the most representative models. Models are shared as much as possible under the constraint of minimal loss in system performance. As a result, we maintain about 98% of the system performance while reducing about 60% of the parameters.
Citations: 1
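The clustering step can be sketched as greedy agglomerative merging of Gaussian state models. The paper's likelihood-based distance is not reproduced here; the sketch substitutes a symmetric KL divergence between diagonal Gaussians, which plays a similar role, and moment-matches merged clusters.

```python
import numpy as np

def sym_kl(m1, v1, m2, v2):
    """Symmetric KL divergence between two diagonal Gaussians; a stand-in
    for the paper's likelihood-based inter-state distance."""
    kl = lambda ma, va, mb, vb: 0.5 * np.sum(va / vb + (mb - ma) ** 2 / vb - 1 + np.log(vb / va))
    return kl(m1, v1, m2, v2) + kl(m2, v2, m1, v1)

def cluster_states(states, n_shared):
    """Greedily merge the closest pair of state models until only n_shared
    remain; each merge is replaced by the moment-matched mean/variance."""
    clusters = [(np.array(m, float), np.array(v, float), 1) for m, v in states]
    while len(clusters) > n_shared:
        pairs = [(sym_kl(a[0], a[1], b[0], b[1]), i, j)
                 for i, a in enumerate(clusters) for j, b in enumerate(clusters) if i < j]
        _, i, j = min(pairs)
        (m1, v1, n1), (m2, v2, n2) = clusters[i], clusters[j]
        n = n1 + n2
        m = (n1 * m1 + n2 * m2) / n
        v = (n1 * (v1 + m1 ** 2) + n2 * (v2 + m2 ** 2)) / n - m ** 2
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [(m, v, n)]
    return clusters

states = [([0.0], [1.0]), ([0.1], [1.1]), ([5.0], [0.5]), ([5.2], [0.6])]
print(len(cluster_states(states, 2)))  # 2
```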
Covariance-tied clustering method in speaker identification
Proceedings. Fourth IEEE International Conference on Multimodal Interfaces · Pub Date: 2002-10-14 · DOI: 10.1109/ICMI.2002.1166973
Ziqiang Wang, Yang Liu, Peng Ding, Bo Xu
Abstract: Gaussian mixture models (GMMs) have been successfully applied to classifiers for speaker modeling in speaker identification. However, problems remain, such as the choice of clustering method. The conventional k-means algorithm uses Euclidean distance, which treats the data distribution as spherical, an assumption actual data rarely satisfy. In this paper we present a new method that uses covariance information to direct the clustering of GMMs, namely covariance-tied clustering. The method consists of two parts: obtaining covariance matrices using a data sharing technique based on a binary tree, and using those covariance matrices to direct clustering. Experimental results show that this method leads to worthwhile reductions in error rates for speaker identification. Much remains to be done to exploit the covariance information fully.
Citations: 2
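The core idea, replacing the spherical Euclidean metric with a Mahalanobis metric under a tied covariance, can be sketched as below. The binary-tree data-sharing step for estimating the covariance is omitted; the tied covariance is simply given, and all names are illustrative.

```python
import numpy as np

def mahalanobis_kmeans(X, k, cov, n_iter=20, seed=0):
    """k-means-style clustering where assignments use squared Mahalanobis
    distance under one tied covariance, instead of Euclidean distance
    (which implicitly assumes an identity, i.e. spherical, covariance)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)].copy()
    P = np.linalg.inv(cov)  # shared precision matrix
    for _ in range(n_iter):
        diff = X[:, None, :] - centers[None, :, :]          # (n, k, d)
        d = np.einsum('nkd,de,nke->nk', diff, P, diff)      # squared Mahalanobis
        labels = d.argmin(axis=1)
        centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
    return labels, centers

# Elongated synthetic data: a Euclidean metric tends to split along the long
# axis, while the tied-covariance metric respects the actual spread.
rng = np.random.default_rng(1)
cov = np.array([[4.0, 0.0], [0.0, 0.2]])
X = np.vstack([rng.multivariate_normal(m, cov, 100) for m in ([0, 0], [0, 3])])
labels, centers = mahalanobis_kmeans(X, 2, cov)
print(centers.round(1))
```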
Towards visually-grounded spoken language acquisition
Proceedings. Fourth IEEE International Conference on Multimodal Interfaces · Pub Date: 2002-10-14 · DOI: 10.1109/ICMI.2002.1166977
D. Roy
Abstract: A characteristic shared by most approaches to natural language understanding and generation is the use of symbolic representations of word and sentence meanings. Frames and semantic nets are examples of symbolic representations. Symbolic methods are inappropriate for applications which require natural language semantics to be linked to perception, as is the case in tasks such as scene description or human-robot interaction. This paper presents two implemented systems, one that learns to generate, and one that learns to understand, visually-grounded spoken language. These implementations are part of our on-going effort to develop a comprehensive model of perceptually-grounded semantics.
Citations: 0
A probabilistic dynamic contour model for accurate and robust lip tracking
Proceedings. Fourth IEEE International Conference on Multimodal Interfaces · Pub Date: 2002-10-14 · DOI: 10.1109/ICMI.2002.1167007
Qiang Wang, H. Ai, Guangyou Xu
Abstract: In this paper a new condensation-style contour tracking method, called probabilistic dynamic contour (PDC), is proposed for lip tracking. A novel mixture dynamic model is designed to represent shape more compactly and to tolerate larger motions between frames, and a measurement model is designed to incorporate multiple visual cues. The proposed PDC tracker is conceptually general yet, with the designed dynamic and measurement models, well suited to lip tracking. The new tracker improves on traditional condensation-style trackers in three respects. First, the dynamic model is partially derived from the image sequence, so the tracker does not need to learn the dynamics in advance. Second, the measurement model is easily updated during tracking, which avoids modeling the foreground object a priori. Third, to improve the tracker's speed, a compact representation of shape and a noise model are proposed to reduce the number of samples required to represent the posterior distribution. An experiment on lip contour tracking shows that the proposed method tracks contours more robustly and accurately than existing tracking methods.
Citations: 6
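The condensation (particle filter) loop that the tracker builds on can be sketched generically: resample particles by weight, propagate them through the dynamic model, and reweight them with the measurement model. The dynamics and likelihood below are one-dimensional placeholders, not the paper's mixture dynamic model or multi-cue measurement.

```python
import numpy as np

def condensation_step(particles, weights, dynamics, likelihood, rng):
    """One generic condensation iteration: resample by weight, predict with
    the (stochastic) dynamic model, reweight with the measurement model."""
    n = len(particles)
    idx = rng.choice(n, size=n, p=weights)      # resample proportionally to weight
    particles = dynamics(particles[idx], rng)   # stochastic prediction
    weights = likelihood(particles)             # measure against the observation
    return particles, weights / weights.sum()

rng = np.random.default_rng(0)
true_pos = 2.0                                  # hypothetical 1-D "contour" state
particles = rng.normal(0.0, 1.0, size=500)
weights = np.full(500, 1 / 500)
dynamics = lambda p, r: p + r.normal(0.0, 0.1, size=p.shape)
likelihood = lambda p: np.exp(-0.5 * (p - true_pos) ** 2 / 0.2)

for _ in range(30):
    particles, weights = condensation_step(particles, weights, dynamics, likelihood, rng)
print(round(float(np.average(particles, weights=weights)), 2))  # converges near 2.0
```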
Techniques for interactive audience participation
Proceedings. Fourth IEEE International Conference on Multimodal Interfaces · Pub Date: 2002-10-14 · DOI: 10.1109/ICMI.2002.1166962
Dan Maynes-Aminzade, R. Pausch, S. Seitz
Abstract: At SIGGRAPH in 1991, Loren and Rachel Carpenter unveiled an interactive entertainment system that allowed members of a large audience to control an onscreen game using red and green reflective paddles. In the spirit of this approach, we present a new set of techniques that enable members of an audience to participate, either cooperatively or competitively, in shared entertainment experiences. Our techniques allow audiences of hundreds of people to control onscreen activity by (1) leaning left and right in their seats, (2) batting a beach ball while its shadow is used as a pointing device, and (3) pointing laser pointers at the screen. All of these techniques can be implemented with inexpensive, off-the-shelf hardware. We have tested these techniques with a variety of audiences; in this paper we describe both the computer-vision-based implementation and the lessons we learned about designing effective content for interactive audience participation.
Citations: 66
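Of the three techniques, the laser-pointer input is the most direct to sketch: find bright spots in a camera view of the screen. The sketch below uses plain intensity thresholding with OpenCV; it is an assumption about one way such detection could work, not the authors' implementation.

```python
import cv2
import numpy as np

def find_laser_dots(frame_bgr, min_brightness=240):
    """Locate bright laser-pointer dots in a camera view of the screen by
    intensity thresholding, returning dot centroids in pixel coordinates."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, min_brightness, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    dots = []
    for c in contours:
        m = cv2.moments(c)
        if m["m00"] > 0:
            dots.append((m["m10"] / m["m00"], m["m01"] / m["m00"]))  # centroid
    return dots

# Synthetic test frame: black image with one bright dot at (320, 240).
frame = np.zeros((480, 640, 3), dtype=np.uint8)
cv2.circle(frame, (320, 240), 3, (255, 255, 255), -1)
print(find_laser_dots(frame))  # approximately [(320.0, 240.0)]
```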
An improved active shape model for face alignment
Proceedings. Fourth IEEE International Conference on Multimodal Interfaces · Pub Date: 2002-10-14 · DOI: 10.1109/ICMI.2002.1167050
Wei Wang, S. Shan, Wen Gao, B. Cao, Baocai Yin
Abstract: In this paper, we present several improvements on conventional active shape models (ASMs) for face alignment. Despite the accuracy and robustness of ASMs in image alignment, their performance depends heavily on the initial parameters of the shape model, as well as on the local texture model for each landmark and the corresponding local matching strategy. In this work, several measures are taken to improve ASMs for face alignment. First, salient facial features, such as the eyes and the mouth, are localized using a face detector. These salient features are then used to initialize the shape model and to provide region constraints on the subsequent iterative shape search. Second, we exploit edge information to construct better local texture models for landmarks on the face contour: the edge intensity at a contour landmark is used as a self-adaptive weight when calculating the Mahalanobis distance between the candidate and reference profiles. Third, to avoid unreasonable shifts from pre-localized salient features, landmarks around the salient features are adjusted before applying global subspace constraints. Experiments on a database containing 300 labeled face images show that the proposed method performs significantly better than traditional ASMs.
Citations: 58
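The edge-weighted profile matching in the second improvement can be sketched as follows. The abstract does not specify exactly how the edge intensity enters the distance; the sketch assumes it scales the Mahalanobis cost so that candidates lying on strong edges are preferred. All shapes and values are illustrative.

```python
import numpy as np

def weighted_profile_distance(candidate, mean_profile, cov, edge_intensity):
    """Mahalanobis distance between a sampled gray-level profile and the
    landmark's learned reference profile, scaled by edge intensity as a
    self-adaptive weight (one possible reading of the abstract)."""
    diff = candidate - mean_profile
    d = diff @ np.linalg.inv(cov) @ diff
    return d / max(edge_intensity, 1e-6)  # strong edges lower the effective cost

def best_candidate(candidates, mean_profile, cov, edge_intensities):
    """Pick the candidate position along the search profile with lowest cost."""
    costs = [weighted_profile_distance(c, mean_profile, cov, e)
             for c, e in zip(candidates, edge_intensities)]
    return int(np.argmin(costs))

# Illustrative 5-sample profiles: candidate 0 matches the reference closely
# and sits on a strong edge, so it wins.
mean_profile = np.zeros(5)
cov = np.eye(5)
candidates = [np.full(5, 0.1), np.full(5, 0.8)]
print(best_candidate(candidates, mean_profile, cov, [0.9, 0.2]))  # 0
```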