A Vision and Speech Enabled, Customizable, Virtual Assistant for Smart Environments

2018 11th International Conference on Human System Interaction (HSI). Pub Date: 2018-07-01. DOI: 10.1109/HSI.2018.8431232

G. Iannizzotto, L. L. Bello, Andrea Nucita, G. Grasso


Citations: 31


A Vision and Speech Enabled, Customizable, Virtual Assistant for Smart Environments

Recent developments in smart assistants and smart home automation are attracting the interest and curiosity of consumers and researchers. Speech-enabled virtual assistants (often called smart speakers) offer a wide variety of network-oriented services and, in some cases, can connect to smart environments, enhancing them with new and effective user interfaces. However, such devices also reveal new needs and some weaknesses. In particular, they are faceless and blind assistants: they cannot show a face, and therefore an emotion, and they cannot ‘see’ the user. As a consequence, the interaction is impaired and, in some cases, ineffective. Moreover, most of these devices rely heavily on cloud-based services and thus transmit potentially sensitive data to remote servers. To overcome these issues, in this paper we combine some of the most advanced techniques in computer vision, deep learning, speech generation and recognition, and artificial intelligence into a virtual assistant architecture for smart home automation systems. The proposed assistant is effective, resource-efficient, interactive, and customizable, and the realized prototype runs on a low-cost, small-sized Raspberry Pi 3 device. For testing purposes, the system was integrated with an open source home automation environment and ran for several days while people were encouraged to interact with it; it proved to be accurate, reliable, and appealing.
