Proceedings. Fourth IEEE International Conference on Multimodal Interfaces: Latest Publications

Towards monitoring human activities using an omnidirectional camera
Xilin Chen, Jie Yang
Pub Date: 2002-10-14 · DOI: 10.1109/ICMI.2002.1167032
Abstract: We propose an approach for monitoring human activities in an indoor environment using an omnidirectional camera. Robustly tracking people is a prerequisite for modeling and recognizing human activities, and an omnidirectional camera mounted on the ceiling is less prone to problems of occlusion. We use a Markov Random Field (MRF) to represent both background and foreground, and adapt the models effectively to environmental changes. We employ a deformable model to adapt the foreground models so that they optimally match objects at different positions within the field of view of the omnidirectional camera. To monitor human activity, we represent people's positions as spatial points and analyze their moving trajectories within a time-spatial window. The method provides an efficient way to monitor high-level human activities without identifying individuals.
Citations: 21
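The trajectory analysis the abstract describes, with people reduced to spatial points examined over a time-spatial window, can be illustrated with a minimal sketch. The activity labels, thresholds, and function name below are invented for illustration and are not from the paper:

```python
# Hypothetical sketch of trajectory analysis within a time-spatial window.
# A person is a sequence of (t, x, y) points; the window is classified by
# comparing net displacement against total path length.

def classify_window(points, min_path=1.0, ratio_thresh=0.5):
    """Classify activity in one time window from a list of (t, x, y) points."""
    if len(points) < 2:
        return "idle"
    # Total path length along the trajectory.
    path = 0.0
    for (_, x0, y0), (_, x1, y1) in zip(points, points[1:]):
        path += ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5
    if path < min_path:
        return "idle"
    # Net displacement between the window's endpoints.
    (_, xs, ys), (_, xe, ye) = points[0], points[-1]
    net = ((xe - xs) ** 2 + (ye - ys) ** 2) ** 0.5
    # A high net/path ratio suggests purposeful walking; a low one, loitering.
    return "walking" if net / path >= ratio_thresh else "loitering"
```

A straight track such as [(0, 0, 0), (1, 1, 0), (2, 2, 0)] classifies as "walking", while a back-and-forth track of the same path length classifies as "loitering".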
Towards universal speech recognition
Zhirong Wang, Umut Topkara, Tanja Schultz, A. Waibel
Pub Date: 2002-10-14 · DOI: 10.1109/ICMI.2002.1167001
Abstract: The increasing interest in multilingual applications like speech-to-speech translation systems is accompanied by the need for speech recognition front-ends in many languages that can also handle multiple input languages at the same time. We describe a universal speech recognition system that fulfills such needs. It is trained by sharing speech and text data across languages and thus reduces the number of parameters and overhead significantly at the cost of only slight accuracy loss. The final recognizer eases the burden of maintaining several monolingual engines, makes dedicated language identification obsolete and allows for code-switching within an utterance. To achieve these goals we developed new methods for constructing multilingual acoustic models and multilingual n-gram language models.
Citations: 34
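The parameter sharing described above rests on mapping language-specific phones onto a shared global phone inventory, so that one acoustic model serves several languages. The toy mapping below only illustrates the counting argument; the phone labels are invented and are not the authors' actual inventory:

```python
# Toy illustration of acoustic model sharing via a global phone set.
# The mapping is invented; real systems derive it from IPA-style tables.
GLOBAL_PHONE = {("en", "ae"): "a", ("de", "a"): "a",
                ("en", "sh"): "S", ("de", "sch"): "S"}

def shared_models(lang_phones):
    """Collapse per-language phone models into the shared global set."""
    return {GLOBAL_PHONE[(lang, ph)] for lang, ph in lang_phones}

inventory = [("en", "ae"), ("de", "a"), ("en", "sh"), ("de", "sch")]
# Four language-specific phone models collapse to two shared ones,
# which is where the parameter reduction comes from.
print(len(inventory), "->", len(shared_models(inventory)))
```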
Viewing and analyzing multimodal human-computer tutorial dialogue: a database approach
Jack Mostow, J. Beck, Raghuvee Chalasani, Andrew Cuneo, Peng Jia
Pub Date: 2002-10-14 · DOI: 10.1109/ICMI.2002.1166981
Abstract: It is easier to record logs of multimodal human-computer tutorial dialogue than to make sense of them. In the 2000-2001 school year, we logged the interactions of approximately 400 students who used Project LISTEN's Reading Tutor and who read aloud over 2.4 million words. We discuss some difficulties we encountered converting the logs into a more easily understandable database. It is faster to write SQL queries to answer research questions than to analyze complex log files each time. The database also permits us to construct a viewer to examine individual Reading Tutor-student interactions. This combination of queries and viewable data has turned out to be very powerful, and we discuss how we have combined them to answer research questions.
Citations: 27
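The shift the abstract describes, from re-parsing complex log files to querying a database, can be sketched with a toy schema. The table and column names below are invented for illustration and are not Project LISTEN's actual database:

```python
# Illustrative only: a toy schema and query in the spirit of the
# database approach above (schema and data are invented).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE utterance ("
             "student_id TEXT, session_id TEXT, words_read INTEGER)")
conn.executemany("INSERT INTO utterance VALUES (?, ?, ?)",
                 [("s1", "a", 12), ("s1", "b", 30), ("s2", "a", 7)])

# The research question "how many words did each student read aloud?"
# becomes one aggregate query instead of a log-parsing script.
rows = conn.execute("SELECT student_id, SUM(words_read) FROM utterance "
                    "GROUP BY student_id ORDER BY student_id").fetchall()
print(rows)  # [('s1', 42), ('s2', 7)]
```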
Prosody based co-analysis for continuous recognition of coverbal gestures
S. Kettebekov, M. Yeasin, Rajeev Sharma
Pub Date: 2002-10-14 · DOI: 10.1109/ICMI.2002.1166986
Abstract: Although the recognition of natural speech and gestures has been studied extensively, previous attempts at combining them in a unified framework to boost classification were mostly semantically motivated, e.g., keyword-gesture co-occurrence. Such formulations inherit the complexity of natural language processing. This paper presents a Bayesian formulation that uses a phenomenon of gesture and speech articulation to improve the accuracy of automatic recognition of continuous coverbal gestures. Prosodic features from the speech signal were co-analyzed with the visual signal to learn the prior probability of co-occurrence of prominent spoken segments with particular kinematical phases of gestures. This co-analysis was found to help in detecting and disambiguating small hand movements, which subsequently improves the rate of continuous gesture recognition. The efficacy of the proposed approach was demonstrated on a large database collected from the Weather Channel broadcast. This formulation opens new avenues for bottom-up frameworks of multimodal integration.
Citations: 34
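Learning the co-occurrence prior the abstract mentions amounts to counting how often prosodic prominence falls within each kinematic gesture phase. A minimal sketch, with phase labels and observations invented for illustration:

```python
# Hypothetical sketch of estimating the co-occurrence prior between
# prosodic prominence and gesture phases by counting paired observations.
from collections import Counter

def cooccurrence_prior(pairs):
    """pairs: (gesture_phase, is_prominent) observations.
    Returns P(prominent | phase) estimated by relative frequency."""
    phase_counts = Counter(phase for phase, _ in pairs)
    prominent = Counter(phase for phase, prom in pairs if prom)
    return {phase: prominent[phase] / phase_counts[phase]
            for phase in phase_counts}

obs = [("stroke", True), ("stroke", True), ("stroke", False),
       ("hold", False), ("hold", False), ("retract", False)]
prior = cooccurrence_prior(obs)
# Prominence co-occurs with the stroke phase in 2 of 3 observations here,
# so a small movement that coincides with prominence is more plausibly a
# stroke than a hold, which is the disambiguation the paper exploits.
```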
Adaptive dialog based upon multimodal language acquisition
Sorin Dusan, J. Flanagan
Pub Date: 2002-10-14 · DOI: 10.1109/ICMI.2002.1166982
Abstract: Communicating by voice with speech-enabled computer applications based on preprogrammed rule grammars suffers from constrained vocabulary and sentence structures. Deviations from the allowed language result in an unrecognized utterance that will not be understood and processed by the system. One way to alleviate this restriction consists in allowing the user to expand the computer's recognized and understood language by teaching the computer system new language knowledge. We present an adaptive dialog system capable of learning from users new words, phrases and sentences, and their corresponding meanings. User input incorporates multiple modalities, including speaking, typing, pointing, drawing and image capturing. The allowed language can thus be expanded in real time by users according to their preferences. By acquiring new language knowledge the system becomes more capable in specific tasks, although its language is still constrained.
Citations: 27
Lecture and presentation tracking in an intelligent meeting room
I. Rogina, Thomas Schaaf
Pub Date: 2002-10-14 · DOI: 10.1109/ICMI.2002.1166967
Abstract: Archiving, indexing, and later browsing through stored presentations and lectures is increasingly being used. We have investigated the special problems and advantages of lectures and propose the design and adaptation of a speech recognizer to a lecture such that the recognition accuracy can be significantly improved by prior analysis of the presented documents using a special class-based language model. We define a tracking accuracy measure which measures how well a system can automatically align recognized words with parts of a presentation, and show that by prior exploitation of the presented documents, the tracking accuracy can be improved. The system described in this paper is part of an intelligent meeting room developed in the European Union-sponsored project FAME (Facilitating Agent for Multicultural Exchange).
Citations: 36
Improved information maximization based face and facial feature detection from real-time video and application in a multi-modal person identification system
Ziyou Xiong, Yunqiang Chen, Roy Wang, Thomas S. Huang
Pub Date: 2002-10-14 · DOI: 10.1109/ICMI.2002.1167048
Abstract: In this paper, an improved face detection method based on our previous information-based maximum discrimination approach is presented that maximizes the discrimination between face and non-face examples in a training set without using color or motion information. A short review of our previous method is given, together with a description of a recent improvement to its detection speed. A person identification system has been developed that performs multi-modal person identification in real-time video, based on this newly improved face detection method together with speaker identification.
Citations: 11
Multi modal user interaction in an automatic pool trainer
L. B. Larsen, Morten Damm Jensen, Wisdom Kobby Vodzi
Pub Date: 2002-10-14 · DOI: 10.1109/ICMI.2002.1167022
Abstract: This paper presents the human-computer interaction in an automatic pool trainer currently being developed at the Center for PersonKommunikation, Aalborg University. The aim of the system is to automate (parts of) the learning process, in this case for the game of pool. The automated pool trainer (APT) utilises multi-modal, agent-driven user-system communication to facilitate the user interaction. To allow the user the necessary freedom of movement when addressing the task, system output is presented on a wall-mounted screen and is augmented by a laser drawing lines and points directly on the pool table surface. User interaction is carried out either via a spoken dialogue with an animated interface agent or by using a touch-screen panel. The paper describes the philosophy on which the system is designed, as well as the system architecture and individual modules. The user interaction is described, and the paper concludes with a presentation of some test results and a discussion of the suitability of the presented and similar systems.
Citations: 18
Flexi-modal and multi-machine user interfaces
B. Myers, Robert G. Malkin, M. Bett, A. Waibel, Benjamin Bostwick, Robert C. Miller, Jie Yang, Matthias Denecke, Edgar Seemann, Jie Zhu, Choon Hong Peck, Dave Kong, Jeffrey Nichols, W. Scherlis
Pub Date: 2002-10-14 · DOI: 10.1109/ICMI.2002.1167019
Abstract: We describe our system which facilitates collaboration using multiple modalities, including speech, handwriting, gestures, gaze tracking, direct manipulation, large projected touch-sensitive displays, laser pointer tracking, regular monitors with a mouse and keyboard, and wireless networked handhelds. Our system allows multiple, geographically dispersed participants to simultaneously and flexibly mix different modalities using the right interface at the right time on one or more machines. We discuss each of the modalities provided, how they were integrated in the system architecture, and how the user interface enabled one or more people to flexibly use one or more devices.
Citations: 29
Animating arbitrary topology 3D facial model using the MPEG-4 FaceDefTables
D. Jiang, Wen Gao, Zhiguo Li, Zhaoqi Wang
Pub Date: 2002-10-14 · DOI: 10.1109/ICMI.2002.1167049
Abstract: In this paper we put forward a method to animate an arbitrary topology facial model (ATFM) based on the MPEG-4 standard. This paper deals mainly with the problem of building the FaceDefTables, which play a very important role in the MPEG-4 based facial animation system. The FaceDefTables for our predefined standard facial model (SFM) are built using the interpolation method. Since the FaceDefTables depend on facial models, the FaceDefTables for the SFM can be applied only to those facial models having the same topology as the SFM. For those facial models that have different topology, we have to build the FaceDefTables accordingly. To acquire the FaceDefTables for an ATFM, we first select feature points on the ATFM, then transform the SFM according to those feature points. Finally, we project each vertex on the ATFM to the transformed SFM and build the FaceDefTables for the ATFM according to the projection position. With the FaceDefTables we built, realistic animation results have been acquired.
Citations: 4
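At animation time, a FaceDefTable entry maps a facial animation parameter (FAP) value to a vertex displacement via piecewise-linear interpolation over value intervals. A much-simplified sketch of that lookup; the field layout is invented here, and the displacement is assumed to be zero at each interval start:

```python
# Simplified, hypothetical FaceDefTable lookup for one vertex: each entry
# covers a FAP-value interval and stores the displacement at its upper end;
# the displacement inside the interval is interpolated linearly.

def displace(table, fap_value):
    """table: list of (lo, hi, displacement_at_hi) intervals for one vertex."""
    for lo, hi, disp in table:
        if lo <= fap_value <= hi:
            t = (fap_value - lo) / (hi - lo)  # position within the interval
            return t * disp
    return 0.0  # FAP value outside all tabulated intervals: no displacement
```

With a single interval (0, 100, 2.0), a FAP value of 50 yields a displacement of 1.0, halfway between the interval endpoints.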