{"title":"Modular approach of multimodal integration in a virtual environment","authors":"Rajarathinam Arangarasan, George N. Phillips","doi":"10.1109/ICMI.2002.1167017","DOIUrl":null,"url":null,"abstract":"We present a novel modular approach to integrating multiple input/output (I/O) modes in a virtual environment that imitate natural, intuitive and effective human interaction behavior. The I/O modes used in this research are spatial tracking of both hands, finger gesture recognition, head/body spatial tracking, voice recognition (discrete recognition for simple commands, and continuous recognition for natural language input), immersive stereo display and synthesized speech output. Intuitive natural interaction is achieved through several stages: identifying all the tasks that need to be performed, grouping similar tasks and assigning them to a particular mode such that it imitates the physical world. This modular approach allows inclusion and removal of additional input and output modes as well as additional users. We described this multimodal interaction paradigm by applying it to a real world application: visualizing, modeling and fitting protein molecular structures in an immersive virtual environment.","PeriodicalId":208377,"journal":{"name":"Proceedings. Fourth IEEE International Conference on Multimodal Interfaces","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2002-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings. Fourth IEEE International Conference on Multimodal Interfaces","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMI.2002.1167017","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 10
Abstract
We present a novel modular approach to integrating multiple input/output (I/O) modes in a virtual environment so that the interaction imitates natural, intuitive and effective human behavior. The I/O modes used in this research are spatial tracking of both hands, finger gesture recognition, head/body spatial tracking, voice recognition (discrete recognition for simple commands and continuous recognition for natural language input), immersive stereo display and synthesized speech output. Intuitive, natural interaction is achieved in several stages: identifying all the tasks that need to be performed, grouping similar tasks, and assigning each group to the mode that best imitates the corresponding physical-world interaction. This modular approach allows input and output modes, as well as additional users, to be added or removed. We describe this multimodal interaction paradigm by applying it to a real-world application: visualizing, modeling and fitting protein molecular structures in an immersive virtual environment.
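The abstract describes an architecture in which each I/O modality is a self-contained module, task groups are bound to whichever modality best mirrors the physical world, and modules (or users) can be added or removed without disturbing the rest of the system. The sketch below is not the authors' implementation; it is a minimal illustration of that modular idea, and all class, method and task names (IOMode, MultimodalManager, "rotate_molecule", etc.) are hypothetical.

```python
# Minimal sketch of a modular multimodal I/O manager, assuming a simple
# "register a mode, route tasks to it" design. Hypothetical names throughout.
from abc import ABC, abstractmethod


class IOMode(ABC):
    """One input or output modality, e.g. hand tracking or speech synthesis."""

    @abstractmethod
    def handles(self, task: str) -> bool:
        """Return True if this mode is responsible for the given task."""

    @abstractmethod
    def process(self, task: str, data) -> None:
        """Carry out the task using this modality."""


class HandTrackingMode(IOMode):
    def __init__(self):
        # Tasks assigned to this mode because they mirror physical manipulation.
        self.tasks = {"grab_molecule", "rotate_molecule", "fit_fragment"}

    def handles(self, task: str) -> bool:
        return task in self.tasks

    def process(self, task: str, data) -> None:
        print(f"[hand tracking] {task}: {data}")


class VoiceCommandMode(IOMode):
    def __init__(self):
        # Discrete voice commands suit simple, named operations.
        self.tasks = {"load_structure", "save_session", "toggle_stereo"}

    def handles(self, task: str) -> bool:
        return task in self.tasks

    def process(self, task: str, data) -> None:
        print(f"[voice command] {task}: {data}")


class MultimodalManager:
    """Routes each task to the first registered mode that claims it.

    Modes (and, by extension, per-user mode sets) can be added or removed
    at run time without changing the rest of the system.
    """

    def __init__(self):
        self.modes: list[IOMode] = []

    def add_mode(self, mode: IOMode) -> None:
        self.modes.append(mode)

    def remove_mode(self, mode: IOMode) -> None:
        self.modes.remove(mode)

    def dispatch(self, task: str, data=None) -> None:
        for mode in self.modes:
            if mode.handles(task):
                mode.process(task, data)
                return
        print(f"no mode registered for task: {task}")


if __name__ == "__main__":
    manager = MultimodalManager()
    manager.add_mode(HandTrackingMode())
    manager.add_mode(VoiceCommandMode())
    manager.dispatch("rotate_molecule", {"dx": 5, "dy": -2})
    manager.dispatch("load_structure", "1abc.pdb")
```

Under this assumed design, adding a new modality (say, synthesized speech output) only means registering another IOMode subclass, which is one way to read the paper's claim that modes and users can be included or removed independently.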