Making humanoid robots teaching assistants by using natural language processing (NLP) cloud-based services

Journal of Mechatronics and Artificial Intelligence in Engineering. Pub Date: 2022-06-24. DOI: 10.21595/jmai.2022.22720

A. Lekova, Paulina Tsvetkova, T. Tanev, P. Mitrouchev, S. Kostova


Citations: 1

Abstract

Humanoid robots have substantial potential to serve as teaching and social assistants. However, children have very high expectations that robots will interact like humans. This study presents a general model for natural language understanding in human-robot interaction that applies Generative Pre-trained Transformer (GPT) language models as a service in the Internet of Things. The physical presence of the robot can help in fine-tuning the GPT model through prompts derived from the environmental context, and subsequent robot actions can support an embodied understanding of the GPT outputs. The model uses web or cloud services for Natural Language Processing (NLP) to produce and play human-like text for question answering or text generation. Verbal questions are processed either by local speech recognition software or by a Speech-to-Text (STT) cloud service. The question, converted into machine-readable code, is sent to one of the GPT language models with zero- or few-shot learning prompts. The GPT-J model has been tested and deployed either on the web or in the cloud, with options for varying the parameters that control the randomness of the generated text. The robot speaks the generated human-like text by using Text-to-Speech (TTS) cloud services that convert the response into an audio format played on the robot. Requirements for using the model feasibly were determined by evaluating the outputs of the different NLP and GPT-J web or cloud services. We designed and implemented the model for use by a humanoid NAO-type robot in speech and language therapy practice; however, it can be used with other open-source, programmable robots and in other contexts.
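
The pipeline described in the abstract (speech recognition, a zero- or few-shot prompt sent to a GPT model, then speech synthesis) might be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `SamplingParams`, `build_prompt`, and `answer_spoken_question` are hypothetical, the `stt`, `generate`, and `tts` callables stand in for whichever local or cloud services are used, and `temperature` and `top_p` are common sampling parameters assumed here as examples of the randomness controls the abstract mentions.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass
class SamplingParams:
    """Hypothetical knobs for the randomness of generated text (assumed names)."""
    temperature: float = 0.7
    top_p: float = 0.9
    max_new_tokens: int = 40


def build_prompt(
    question: str,
    context: str = "",
    examples: Sequence[Tuple[str, str]] = (),
) -> str:
    """Build a zero-shot (no examples) or few-shot (with examples) prompt."""
    lines = []
    if context:
        # Environmental context observed by the robot, if any.
        lines.append(f"Context: {context}")
    for example_question, example_answer in examples:
        lines.append(f"Q: {example_question}")
        lines.append(f"A: {example_answer}")
    lines.append(f"Q: {question}")
    lines.append("A:")
    return "\n".join(lines)


def answer_spoken_question(
    audio: bytes,
    stt: Callable[[bytes], str],
    generate: Callable[[str, SamplingParams], str],
    tts: Callable[[str], bytes],
    context: str = "",
    examples: Sequence[Tuple[str, str]] = (),
    params: Optional[SamplingParams] = None,
) -> bytes:
    """Run one question through STT, the language model, and TTS.

    The three callables are placeholders for real services; their
    signatures are assumptions made for this sketch.
    """
    question = stt(audio).strip()
    prompt = build_prompt(question, context, examples)
    completion = generate(prompt, params or SamplingParams())
    # Keep only the first line: with Q/A prompts a model often keeps
    # going and invents further question-answer pairs.
    answer = completion.strip().split("\n")[0].strip()
    return tts(answer)
```

Passing the services in as callables keeps the sketch independent of any particular provider, so a local speech recognizer and a cloud STT service can be swapped without touching the prompt logic. Lowering `temperature` would typically make answers more repeatable, which may matter when the robot is used with children.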
