{"title":"Visual panel: virtual mouse, keyboard and 3D controller with an ordinary piece of paper","authors":"Zhengyou Zhang, Ying Wu, Ying Shan, S. Shafer","doi":"10.1145/971478.971522","DOIUrl":"https://doi.org/10.1145/971478.971522","url":null,"abstract":"This paper presents a vision-based interface system, VISUAL PANEL, which employs an arbitrary quadrangle-shaped panel (e.g., an ordinary piece of paper) and a tip pointer (e.g., fingertip) as an intuitive, wireless and mobile input device. The system can accurately and reliably track the panel and the tip pointer. The panel tracking continuously determines the projective mapping between the panel at the current position and the display, which in turn maps the tip position to the corresponding position on the display. By detecting the clicking and dragging actions, the system can fulfill many tasks such as controlling a remote large display, and simulating a physical keyboard. Users can naturally use their fingers or other tip pointers to issue commands and type texts. Furthermore, by tracking the 3D position and orientation of the visual panel, the system can also provide 3D information, serving as a virtual joystick, to control 3D virtual objects.","PeriodicalId":416822,"journal":{"name":"Workshop on Perceptive User Interfaces","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129136045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Experimental evaluation of vision and speech based multimodal interfaces","authors":"Emilio Schapira, Rajeev Sharma","doi":"10.1145/971478.971481","DOIUrl":"https://doi.org/10.1145/971478.971481","url":null,"abstract":"Progress in computer vision and speech recognition technologies has recently enabled multimodal interfaces that use speech and gestures. These technologies o er promising alternatives to existing interfaces because they emulate the natural way in which humans communicate. However, no systematic work has been reported that formally evaluates the new speech/gesture interfaces. This paper is concerned with formal experimental evaluation of new human-computer interactions enabled by speech and hand gestures.The paper describes an experiment conducted with 23 subjects that evaluates selection strategies for interaction with large screen displays. The multimodal interface designed for this experiment does not require the user to be in physical contact with any device. Video cameras and long range microphones are used as input for the system. Three selection strategies are evaluated and results for Different target sizes and positions are reported in terms of accuracy, selection times and user preference. Design implications for vision/speech based interfaces are inferred from these results. This study also raises new question and topics for future research.","PeriodicalId":416822,"journal":{"name":"Workshop on Perceptive User Interfaces","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131274052","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bare-hand human-computer interaction","authors":"C. Hardenberg, F. Bérard","doi":"10.1145/971478.971513","DOIUrl":"https://doi.org/10.1145/971478.971513","url":null,"abstract":"In this paper, we describe techniques for barehanded interaction between human and computer. Barehanded means that no device and no wires are attached to the user, who controls the computer directly with the movements of his/her hand.Our approach is centered on the needs of the user. We therefore define requirements for real-time barehanded interaction, derived from application scenarios and usability considerations. Based on those requirements a finger-finding and hand-posture recognition algorithm is developed and evaluated.To demonstrate the strength of the algorithm, we build three sample applications. Finger tracking and hand posture recognition are used to paint virtually onto the wall, to control a presentation with hand postures, and to move virtual items on the wall during a brainstorming session. We conclude the paper with user tests, which were conducted to prove the usability of bare-hand human computer interaction.","PeriodicalId":416822,"journal":{"name":"Workshop on Perceptive User Interfaces","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133608715","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A robust algorithm for reading detection","authors":"Christopher S. Campbell, P. Maglio","doi":"10.1145/971478.971503","DOIUrl":"https://doi.org/10.1145/971478.971503","url":null,"abstract":"As video cameras become cheaper and more pervasive, there is now increased opportunity for user interfaces to take advantage of user gaze data. Eye movements provide a powerful source of information that can be used to determine user intentions and interests. In this paper, we develop and test a method for recognizing when users are reading text based solely on eye-movement data. The experimental results show that our reading detection method is robust to noise, individual differences, and variations in text difficulty. Compared to a simple detection algorithm, our algorithm reliably, quickly, and accurately recognizes and tracks reading. Thus, we provide a means to capture normal user activity, enabling interfaces that incorporate more natural interactions of human and computer.","PeriodicalId":416822,"journal":{"name":"Workshop on Perceptive User Interfaces","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123882444","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A visual modality for the augmentation of paper","authors":"David R. McGee, M. Pavel, A. Adami, Guoping Wang, Philip R. Cohen","doi":"10.1145/971478.971480","DOIUrl":"https://doi.org/10.1145/971478.971480","url":null,"abstract":"In this paper we describe how we have enhanced our multimodal paper-based system, Rasa, with visual perceptual input. We briefly explain how Rasa improves upon current decision-support tools by augmenting, rather than replacing, the paper-based tools that people in command and control centers have come to rely upon. We note shortcomings in our initial approach, discuss how we have added computer-vision as another input modality in our multimodal fusion system, and characterize the advantages that it has to offer. We conclude by discussing our current limitations and the work we intend to pursue to overcome them in the future.","PeriodicalId":416822,"journal":{"name":"Workshop on Perceptive User Interfaces","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122797210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An automatic sign recognition and translation system","authors":"Jie Yang, Jiang Gao, Ying Zhang, Xilin Chen, A. Waibel","doi":"10.1145/971478.971490","DOIUrl":"https://doi.org/10.1145/971478.971490","url":null,"abstract":"A sign is something that suggests the presence of a fact, condition, or quality. Signs are everywhere in our lives. They make our lives easier when we are familiar with them. But sometimes they pose problems. For example, a tourist might not be able to understand signs in a foreign country. This paper discusses problems of automatic sign recognition and translation. We present a system capable of capturing images, detecting and recognizing signs, and translating them into a target language. We describe methods for automatic sign extraction and translation. We use a user-centered approach in system development. The approach takes advantage of human intelligence if needed and leverage human capabilities. We are currently working on Chinese sign translation. We have developed a prototype system that can recognize Chinese sign input from a video camera that is a common gadget for a tourist, and translate the signs into English or voice stream. The sign translation, in conjunction with spoken language translation, can help international tourists to overcome language barriers. The technology can also help a visually handicapped person to increase environmental awareness.","PeriodicalId":416822,"journal":{"name":"Workshop on Perceptive User Interfaces","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127733240","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using multimodal interaction to navigate in arbitrary virtual VRML worlds","authors":"F. Althoff, G. McGlaun, Björn Schuller, Peter Morguet, M. Lang","doi":"10.1145/971478.971494","DOIUrl":"https://doi.org/10.1145/971478.971494","url":null,"abstract":"In this paper we present a multimodal interface for navigating in arbitrary virtual VRML worlds. Conventional haptic devices like keyboard, mouse, joystick and touchscreen can freely be combined with special Virtual-Reality hardware like spacemouse, data glove and position tracker. As a key feature, the system additionally provides intuitive input by command and natural speech utterances as well as dynamic head and hand gestures. The commuication of the interface components is based on the abstract formalism of a context-free grammar, allowing the representation of device-independent information. Taking into account the current system context, user interactions are combined in a semantic unification process and mapped on a model of the viewer's functionality vocabulary. To integrate the continuous multimodal information stream we use a straight-forward rule-based approach and a new technique based on evolutionary algorithms. Our navigation interface has extensively been evaluated in usability studies, obtaining excellent results.","PeriodicalId":416822,"journal":{"name":"Workshop on Perceptive User Interfaces","volume":"92 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2001-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115082428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}