You have interrupted me again!: making voice assistants more dementia-friendly with incremental clarification

Angus Addlesee, Arash Eshghi
{"title":"你又打断我了!:通过逐步澄清让语音助手对老年痴呆症更友好","authors":"Angus Addlesee, Arash Eshghi","doi":"10.3389/frdem.2024.1343052","DOIUrl":null,"url":null,"abstract":"In spontaneous conversation, speakers seldom have a full plan of what they are going to say in advance: they need to conceptualise and plan incrementally as they articulate each word in turn. This often leads to long pauses mid-utterance. Listeners either wait out the pause, offer a possible completion, or respond with an incremental clarification request (iCR), intended to recover the rest of the truncated turn. The ability to generate iCRs in response to pauses is therefore important in building natural and robust everyday voice assistants (EVA) such as Amazon Alexa. This becomes crucial with people with dementia (PwDs) as a target user group since they are known to pause longer and more frequently, with current state-of-the-art EVAs interrupting them prematurely, leading to frustration and breakdown of the interaction. In this article, we first use two existing corpora of truncated utterances to establish the generation of clarification requests as an effective strategy for recovering from interruptions. We then proceed to report on, analyse, and release SLUICE-CR: a new corpus of 3,000 crowdsourced, human-produced iCRs, the first of its kind. We use this corpus to probe the incremental processing capability of a number of state-of-the-art large language models (LLMs) by evaluating (1) the quality of the model's generated iCRs in response to incomplete questions and (2) the ability of the said LLMs to respond correctly after the users response to the generated iCR. For (1), our experiments show that the ability to generate contextually appropriate iCRs only emerges at larger LLM sizes and only when prompted with example iCRs from our corpus. For (2), our results are in line with (1), that is, that larger LLMs interpret incremental clarificational exchanges more effectively. Overall, our results indicate that autoregressive language models (LMs) are, in principle, able to both understand and generate language incrementally and that LLMs can be configured to handle speech phenomena more commonly produced by PwDs, mitigating frustration with today's EVAs by improving their accessibility.","PeriodicalId":408305,"journal":{"name":"Frontiers in Dementia","volume":"293 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"You have interrupted me again!: making voice assistants more dementia-friendly with incremental clarification\",\"authors\":\"Angus Addlesee, Arash Eshghi\",\"doi\":\"10.3389/frdem.2024.1343052\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In spontaneous conversation, speakers seldom have a full plan of what they are going to say in advance: they need to conceptualise and plan incrementally as they articulate each word in turn. This often leads to long pauses mid-utterance. Listeners either wait out the pause, offer a possible completion, or respond with an incremental clarification request (iCR), intended to recover the rest of the truncated turn. The ability to generate iCRs in response to pauses is therefore important in building natural and robust everyday voice assistants (EVA) such as Amazon Alexa. 
This becomes crucial with people with dementia (PwDs) as a target user group since they are known to pause longer and more frequently, with current state-of-the-art EVAs interrupting them prematurely, leading to frustration and breakdown of the interaction. In this article, we first use two existing corpora of truncated utterances to establish the generation of clarification requests as an effective strategy for recovering from interruptions. We then proceed to report on, analyse, and release SLUICE-CR: a new corpus of 3,000 crowdsourced, human-produced iCRs, the first of its kind. We use this corpus to probe the incremental processing capability of a number of state-of-the-art large language models (LLMs) by evaluating (1) the quality of the model's generated iCRs in response to incomplete questions and (2) the ability of the said LLMs to respond correctly after the users response to the generated iCR. For (1), our experiments show that the ability to generate contextually appropriate iCRs only emerges at larger LLM sizes and only when prompted with example iCRs from our corpus. For (2), our results are in line with (1), that is, that larger LLMs interpret incremental clarificational exchanges more effectively. Overall, our results indicate that autoregressive language models (LMs) are, in principle, able to both understand and generate language incrementally and that LLMs can be configured to handle speech phenomena more commonly produced by PwDs, mitigating frustration with today's EVAs by improving their accessibility.\",\"PeriodicalId\":408305,\"journal\":{\"name\":\"Frontiers in Dementia\",\"volume\":\"293 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-03-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Frontiers in Dementia\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3389/frdem.2024.1343052\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Frontiers in Dementia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3389/frdem.2024.1343052","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

In spontaneous conversation, speakers seldom have a full plan of what they are going to say in advance: they need to conceptualise and plan incrementally as they articulate each word in turn. This often leads to long pauses mid-utterance. Listeners either wait out the pause, offer a possible completion, or respond with an incremental clarification request (iCR), intended to recover the rest of the truncated turn. The ability to generate iCRs in response to pauses is therefore important in building natural and robust everyday voice assistants (EVA) such as Amazon Alexa. This becomes crucial with people with dementia (PwDs) as a target user group since they are known to pause longer and more frequently, with current state-of-the-art EVAs interrupting them prematurely, leading to frustration and breakdown of the interaction. In this article, we first use two existing corpora of truncated utterances to establish the generation of clarification requests as an effective strategy for recovering from interruptions. We then proceed to report on, analyse, and release SLUICE-CR: a new corpus of 3,000 crowdsourced, human-produced iCRs, the first of its kind. We use this corpus to probe the incremental processing capability of a number of state-of-the-art large language models (LLMs) by evaluating (1) the quality of the model's generated iCRs in response to incomplete questions and (2) the ability of the said LLMs to respond correctly after the user's response to the generated iCR. For (1), our experiments show that the ability to generate contextually appropriate iCRs only emerges at larger LLM sizes and only when prompted with example iCRs from our corpus. For (2), our results are in line with (1), that is, that larger LLMs interpret incremental clarificational exchanges more effectively. Overall, our results indicate that autoregressive language models (LMs) are, in principle, able to both understand and generate language incrementally and that LLMs can be configured to handle speech phenomena more commonly produced by PwDs, mitigating frustration with today's EVAs by improving their accessibility.
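To make the two-step evaluation described in the abstract more concrete, the sketch below shows one plausible way to few-shot prompt an LLM into producing an iCR for a truncated question. It is a minimal illustration, not the authors' released code: the `build_icr_prompt` and `call_llm` helpers and the example iCRs are hypothetical placeholders, not items from SLUICE-CR.

```python
# Minimal sketch (not the authors' code) of few-shot prompting an LLM to
# produce an incremental clarification request (iCR) for a truncated
# user question. The example iCRs below are invented placeholders, not
# SLUICE-CR items, and call_llm stands in for whichever model is probed.

FEW_SHOT_ICRS = [
    ("What time does the pharmacy on ...", "Sorry, the pharmacy on which street?"),
    ("Remind me to take my ...", "Take your what, sorry?"),
]


def build_icr_prompt(truncated_utterance: str) -> str:
    """Assemble a few-shot prompt asking the model to clarify, not answer."""
    lines = [
        "The user paused mid-sentence. Reply with a short clarification "
        "question that asks only for the missing part of their utterance.",
        "",
    ]
    for partial, icr in FEW_SHOT_ICRS:
        lines.append(f"User (incomplete): {partial}")
        lines.append(f"Assistant (iCR): {icr}")
        lines.append("")
    lines.append(f"User (incomplete): {truncated_utterance}")
    lines.append("Assistant (iCR):")
    return "\n".join(lines)


def call_llm(prompt: str) -> str:
    """Placeholder for the LLM under evaluation; plug in a real API call here."""
    raise NotImplementedError


if __name__ == "__main__":
    prompt = build_icr_prompt("Could you call my ...")
    print(prompt)
    # Step (1): icr = call_llm(prompt) would yield the clarification request.
    # Step (2) would then feed the incomplete question, the generated iCR, and
    # the user's answer back to the model and check whether it responds correctly.
```

Under this framing, the abstract's finding for (1) corresponds to varying the model size and toggling the in-context example iCRs, while (2) corresponds to scoring the model's final answer after the clarificational exchange is appended to the prompt.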