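Motion-generation system for violin-playing robot using reinforcement learning: differences in bowing parameters due to changes in learning conditions and sound pressure values.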

Impact Factor: 2.9; JCR: Q2 (Robotics)

Frontiers in Robotics and AI. Pub Date: 2024-11-28. eCollection Date: 2024-01-01. DOI: 10.3389/frobt.2024.1439629

Kenzo Horigome, Koji Shibuya


Citations: 0

Abstract


Motion-generation system for violin-playing robot using reinforcement learning differences in bowing parameters due to changes in learning conditions and sound pressure values.

Recently, research on human-robot communication has attracted many researchers. We believe that music is an important channel between humans and robots because it can convey emotional information. In this research, we focus on violin performance by a robot. Building a system capable of determining a performance from a musical score will lead to a better understanding of communication through music. In this study, we aim to develop a system that uses reinforcement learning to automatically determine bowing parameters, such as bow speed and bowing direction, from musical scores so that a violin-playing robot can produce expressive sounds. We adopted Q-learning with an ε-greedy policy and used a neural network to approximate the value function. Our system uses a musical score annotated with the sound pressure value of each note to determine the bowing speed and direction. This study introduces the design of this system and presents simulation results on the differences in bowing parameters caused by changes in learning conditions and sound pressure values. For the learning conditions, we varied the learning rate, discount rate, search rate, and number of units in the hidden layer of the neural network. We used the last two bars and the entire four bars of the first phrase of "Go Tell Aunt Rhody." We determined the number of units in each layer and conducted simulations. Additionally, we analyzed the effect of adjusting the target sound pressure for each note in the score. As a result, negative rewards decreased and positive rewards increased. Consequently, even when the target sound pressure was changed, for both the last two bars and the entire four bars, the violin-playing robot could automatically play from the score through improved reinforcement learning. These results indicate that an expressive performance is achievable with this method.
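The combination described in the abstract (Q-learning, ε-greedy action selection, and a neural network approximating the value function) can be illustrated with a minimal sketch. The abstract does not give the state encoding, action set, reward function, or network details, so everything concrete below is an assumption made for illustration: the `SPEEDS` and `DIRECTIONS` values, the two-feature state (note position and target sound pressure), the `toy_reward` rule that treats sound pressure as equal to normalized bow speed, and the reading of the "search rate" as the ε of the ε-greedy policy. It is not the authors' implementation.

```python
import math
import random

# Hypothetical discrete action set: (bow speed level, bowing direction).
SPEEDS = [0.2, 0.5, 0.8]      # normalized bow speeds (assumed values)
DIRECTIONS = [+1, -1]         # +1 = down-bow, -1 = up-bow (assumed encoding)
ACTIONS = [(s, d) for s in SPEEDS for d in DIRECTIONS]


class QNetwork:
    """One-hidden-layer network mapping a state vector to one Q-value per action."""

    def __init__(self, n_in, n_hidden, n_out, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        # Hidden layer uses tanh; output layer is linear.
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        q = [sum(w * hi for w, hi in zip(row, h)) + b
             for row, b in zip(self.w2, self.b2)]
        return h, q

    def update(self, x, action, target, lr):
        """One gradient step on 0.5 * (target - Q(x, action))**2; returns the TD error."""
        h, q = self.forward(x)
        err = target - q[action]
        # Backpropagate through the chosen output unit only (computed before w2 changes).
        grad_h = [err * self.w2[action][j] * (1.0 - h[j] ** 2) for j in range(len(h))]
        for j in range(len(h)):
            self.w2[action][j] += lr * err * h[j]
        self.b2[action] += lr * err
        for j in range(len(h)):
            for i in range(len(x)):
                self.w1[j][i] += lr * grad_h[j] * x[i]
            self.b1[j] += lr * grad_h[j]
        return err


def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


def toy_reward(action, target_pressure, tolerance=0.2):
    """Assumed reward: +1 if the bow speed is close to the target pressure, else -1."""
    speed, _direction = ACTIONS[action]
    return 1.0 if abs(speed - target_pressure) <= tolerance else -1.0


def train(targets, episodes=500, lr=0.05, gamma=0.9, epsilon=0.1, n_hidden=8, seed=0):
    """Run Q-learning over a list of per-note target sound pressures (normalized)."""
    rng = random.Random(seed)
    net = QNetwork(n_in=2, n_hidden=n_hidden, n_out=len(ACTIONS), rng=rng)
    n = len(targets)
    for _ in range(episodes):
        for t in range(n):
            x = [t / n, targets[t]]
            _, q = net.forward(x)
            a = epsilon_greedy(q, epsilon, rng)
            r = toy_reward(a, targets[t])
            if t + 1 < n:
                # Bootstrapped Q-learning target: r + gamma * max_a' Q(s', a').
                _, q_next = net.forward([(t + 1) / n, targets[t + 1]])
                td_target = r + gamma * max(q_next)
            else:
                td_target = r  # last note of the phrase: no successor state
            net.update(x, a, td_target, lr)
    return net
```

In this sketch, `lr`, `gamma`, `epsilon`, and `n_hidden` stand in for the four learning conditions the abstract says were varied (learning rate, discount rate, search rate, and number of hidden units), and changing the entries of `targets` corresponds loosely to adjusting the target sound pressure of each note. A call such as `train([0.2, 0.5, 0.8, 0.5])` returns the trained network; whether it converges to sensible bowing choices depends on the assumed reward and hyperparameters.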


Source journal

Frontiers in Robotics and AI (Robotics)

CiteScore: 6.50

Self-citation rate: 5.90%

Articles published: 355

Review time: 14 weeks

About the journal: Frontiers in Robotics and AI publishes rigorously peer-reviewed research covering all theory and applications of robotics, technology, and artificial intelligence, from biomedical to space robotics.