在基于语音的 "人在回路 "系统中用大型语言模型取代人类

Proceedings of the AAAI Symposium Series Pub Date : 2024-05-20 DOI:10.1609/aaaiss.v3i1.31178

Shih-Hong Huang, Ting-Hao 'Kenneth' Huang

{"title":"在基于语音的 \"人在回路 \"系统中用大型语言模型取代人类","authors":"Shih-Hong Huang, Ting-Hao 'Kenneth' Huang","doi":"10.1609/aaaiss.v3i1.31178","DOIUrl":null,"url":null,"abstract":"It is easy to assume that Large Language Models (LLMs) will seamlessly take over applications, especially those that are largely automated. In the case of conversational voice assistants, commercial systems have been widely deployed and used over the past decade. However, are we indeed on the cusp of the future we envisioned? There exists a social-technical gap between what people want to accomplish and the actual capability of technology. In this paper, we present a case study comparing two voice assistants built on Amazon Alexa: one employing a human-in-the-loop workflow, the other utilizes LLM to engage in conversations with users. In our comparison, we discovered that the issues arising in current human-in-the-loop and LLM systems are not identical. However, the presence of a set of similar issues in both systems leads us to believe that focusing on the interaction between users and systems is crucial, perhaps even more so than focusing solely on the underlying technology itself. Merely enhancing the performance of the workers or the models may not adequately address these issues. This observation prompts our research question: What are the overlooked contributing factors in the effort to improve the capabilities of voice assistants, which might not have been emphasized in prior research?","PeriodicalId":516827,"journal":{"name":"Proceedings of the AAAI Symposium Series","volume":"45 10","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"On Replacing Humans with Large Language Models in Voice-Based Human-in-the-Loop Systems\",\"authors\":\"Shih-Hong Huang, Ting-Hao 'Kenneth' Huang\",\"doi\":\"10.1609/aaaiss.v3i1.31178\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"It is easy to assume that Large Language Models (LLMs) will seamlessly take over applications, especially those that are largely automated. In the case of conversational voice assistants, commercial systems have been widely deployed and used over the past decade. However, are we indeed on the cusp of the future we envisioned? There exists a social-technical gap between what people want to accomplish and the actual capability of technology. In this paper, we present a case study comparing two voice assistants built on Amazon Alexa: one employing a human-in-the-loop workflow, the other utilizes LLM to engage in conversations with users. In our comparison, we discovered that the issues arising in current human-in-the-loop and LLM systems are not identical. However, the presence of a set of similar issues in both systems leads us to believe that focusing on the interaction between users and systems is crucial, perhaps even more so than focusing solely on the underlying technology itself. Merely enhancing the performance of the workers or the models may not adequately address these issues. This observation prompts our research question: What are the overlooked contributing factors in the effort to improve the capabilities of voice assistants, which might not have been emphasized in prior research?\",\"PeriodicalId\":516827,\"journal\":{\"name\":\"Proceedings of the AAAI Symposium Series\",\"volume\":\"45 10\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-05-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the AAAI Symposium Series\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1609/aaaiss.v3i1.31178\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the AAAI Symposium Series","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1609/aaaiss.v3i1.31178","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

人们很容易假定，大型语言模型（LLM）将无缝接管各种应用，尤其是那些基本自动化的应用。就会话语音助手而言，商业系统在过去十年中得到了广泛部署和使用。然而，我们是否真的站在了我们所设想的未来的风口浪尖上？人们想要实现的目标与技术的实际能力之间存在着社会技术差距。在本文中，我们介绍了一项案例研究，比较了基于亚马逊 Alexa 的两个语音助手：一个采用了人在回路中的工作流程，另一个则利用 LLM 与用户进行对话。在比较过程中，我们发现当前人工智能系统和 LLM 系统中出现的问题并不相同。不过，两个系统中存在的一系列类似问题让我们相信，关注用户与系统之间的互动至关重要，这或许比只关注底层技术本身更为重要。仅仅提高工作人员或模型的性能可能无法充分解决这些问题。这一观察结果引发了我们的研究问题：在努力提高语音助手能力的过程中，有哪些因素被忽视了，而这些因素在以前的研究中可能并没有得到重视？

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

On Replacing Humans with Large Language Models in Voice-Based Human-in-the-Loop Systems

It is easy to assume that Large Language Models (LLMs) will seamlessly take over applications, especially those that are largely automated. In the case of conversational voice assistants, commercial systems have been widely deployed and used over the past decade. However, are we indeed on the cusp of the future we envisioned? There exists a social-technical gap between what people want to accomplish and the actual capability of technology. In this paper, we present a case study comparing two voice assistants built on Amazon Alexa: one employing a human-in-the-loop workflow, the other utilizes LLM to engage in conversations with users. In our comparison, we discovered that the issues arising in current human-in-the-loop and LLM systems are not identical. However, the presence of a set of similar issues in both systems leads us to believe that focusing on the interaction between users and systems is crucial, perhaps even more so than focusing solely on the underlying technology itself. Merely enhancing the performance of the workers or the models may not adequately address these issues. This observation prompts our research question: What are the overlooked contributing factors in the effort to improve the capabilities of voice assistants, which might not have been emphasized in prior research?

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the AAAI Symposium Series

自引率

0.00%

发文量