Talking to machines (abstract)
C. Cowley, Dylan M. Jones
Proceedings of the INTERACT '93 and CHI '93 Conference on Human Factors in Computing Systems, 1993-05-01
DOI: 10.1145/169059.169512 (https://doi.org/10.1145/169059.169512)
Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
Citations: 2
Abstract
Despite some extravagant claims made regarding the potential of machines which recognize speech input, it is unlikely that they will ever match the speech processing capabilities of humans. The truly conversational computer is still a long way from being realized, and the performance of many contemporary recognizers leaves much to be desired. A machine that may perform efficiently for the experienced user in a controlled laboratory setting can often present substantial problems for unskilled users when it is finally installed in the workplace. Building reliable and acceptable speech interfaces is a subtle and sophisticated process. There is a common assumption that speech interfaces automatically improve system performance when, in fact, the opposite is often the case, particularly if speech is simply added to an existing system rather than included from the outset of development. However, if human factors issues are addressed at the inception of projects rather than as an afterthought and the machine's capabilities are not overloaded by overambitious design, a great deal can be achieved with devices that can reliably recognize a few selected utterances and take full advantage of the unique properties of spoken dialogue. The user is afforded greater freedom of movement and is thus released from the constraints imposed by conventional keyboard/screen interaction. Furthermore, there is the option of multi-tasking using speech and manual interfaces concurrently. The film shows how dialogue design and error correction strategies, informed by human factors research, can lead to the development of usable and profitable systems.
It starts with a simulation of a truly conversational machine to show the level of performance necessary to compete with human recognition. Template matching recognition is clearly explained so that viewers can see how most devices actually work. The film then shows the Digital Equipment Corporation’s DECvoice in a number of voice input and output scenarios which highlight typical design problems and solutions. It concludes with a set of guidelines which will help designers make reasoned decisions about when and how to use speech recognition and avoid the typical problems experienced by users. The film ends with an example of a system which, having been designed with the guidelines in mind, is usable, efficient, and practical within the constraints of contemporary technology.

GUIDELINES FOR SYSTEM DESIGNERS
1. Train the machine in the place it will be used.
2. Use speech consistently for one part of a task.
3. Do not use speech too often.
4. Do not use voice input for spatial information.
5. Develop a special command vocabulary.
6. Incorporate clear error-correction strategies.
7. Provide feedback about the recognizer’s activities.
8. Use multiple criteria to evaluate the system.
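
As an illustration only, guidelines 5 to 7 might be sketched in code along the following lines. This is not taken from the film or from DECvoice: the vocabulary, the two confidence thresholds, and the names `handle_utterance` and `Outcome` are all hypothetical, and the recognizer is assumed to return a word together with a confidence score between 0 and 1.

```python
from dataclasses import dataclass

# Hypothetical command vocabulary (guideline 5): a few short, distinct
# words rather than free-form speech.
VOCABULARY = {"start", "stop", "next", "back", "cancel"}

# Assumed thresholds for illustration; real values would have to be tuned
# in the place where the system is used (guideline 1).
ACCEPT_AT = 0.85
CONFIRM_AT = 0.50


@dataclass
class Outcome:
    action: str    # "execute", "confirm", or "reject"
    feedback: str  # message shown or spoken to the user (guideline 7)


def handle_utterance(word: str, confidence: float) -> Outcome:
    """Decide what to do with one recognizer hypothesis."""
    if word not in VOCABULARY:
        options = ", ".join(sorted(VOCABULARY))
        return Outcome("reject", f"'{word}' is not a command. Say one of: {options}.")
    if confidence >= ACCEPT_AT:
        return Outcome("execute", f"Heard '{word}'.")
    if confidence >= CONFIRM_AT:
        # Error correction (guideline 6): ask before acting on a weak match.
        return Outcome("confirm", f"Did you say '{word}'? Press the confirm key or repeat.")
    return Outcome("reject", "Not recognized. Please repeat the command.")
```

Every branch returns a feedback message, so the user is always told what the recognizer did with the utterance; a weak match asks for confirmation through the manual interface rather than triggering an action that would later need to be undone.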