Muhammad Moiz, Hazique Malik, Muhammad Bilal, Noman Naseer
{"title":"$Q_{biased}$ Softmax回归算法的多偏置技术比较分析","authors":"Muhammad Moiz, Hazique Malik, Muhammad Bilal, Noman Naseer","doi":"10.1109/AIMS52415.2021.9466049","DOIUrl":null,"url":null,"abstract":"Over the past many years the popularity of robotic workers has seen a tremendous surge. Several tasks which were previously considered insurmountable are able to be performed by robots efficiently, with much ease. This is mainly due to the advances made in the field of control systems and artificial intelligence in recent years. Lately, we have seen Reinforcement Learning (RL) capture the spotlight, in the field of robotics. Instead of explicitly specifying the solution of a particular task, RL enables the robot (agent) to explore its environment and through trial and error choose the appropriate response. In this paper, a comparative analysis of biasing techniques for the Q-biased softmax regression (QBIASSR) algorithm has been presented. In QBIASSR, decision-making for un-explored states depends upon the set of previously explored states. This algorithm improves the learning process when the robot reaches unexplored states. A vector bias(s) is calculated on the basis of variable values of experienced states and added to the Q-value function for action selection. To obtain the optimized reward, different techniques to calculate bias(s) are adopted. The performance of all the techniques has been evaluated and compared for obstacle avoidance in the case of a mobile robot. In the end, we have demonstrated that the cumulative reward generated by the technique proposed in our paper is at least 2 times greater than the baseline.","PeriodicalId":299121,"journal":{"name":"2021 International Conference on Artificial Intelligence and Mechatronics Systems (AIMS)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-04-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A Comparative Analysis of Multiple Biasing Techniques for $Q_{biased}$ Softmax Regression Algorithm\",\"authors\":\"Muhammad Moiz, Hazique Malik, Muhammad Bilal, Noman Naseer\",\"doi\":\"10.1109/AIMS52415.2021.9466049\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Over the past many years the popularity of robotic workers has seen a tremendous surge. Several tasks which were previously considered insurmountable are able to be performed by robots efficiently, with much ease. This is mainly due to the advances made in the field of control systems and artificial intelligence in recent years. Lately, we have seen Reinforcement Learning (RL) capture the spotlight, in the field of robotics. Instead of explicitly specifying the solution of a particular task, RL enables the robot (agent) to explore its environment and through trial and error choose the appropriate response. In this paper, a comparative analysis of biasing techniques for the Q-biased softmax regression (QBIASSR) algorithm has been presented. In QBIASSR, decision-making for un-explored states depends upon the set of previously explored states. This algorithm improves the learning process when the robot reaches unexplored states. A vector bias(s) is calculated on the basis of variable values of experienced states and added to the Q-value function for action selection. To obtain the optimized reward, different techniques to calculate bias(s) are adopted. The performance of all the techniques has been evaluated and compared for obstacle avoidance in the case of a mobile robot. 
In the end, we have demonstrated that the cumulative reward generated by the technique proposed in our paper is at least 2 times greater than the baseline.\",\"PeriodicalId\":299121,\"journal\":{\"name\":\"2021 International Conference on Artificial Intelligence and Mechatronics Systems (AIMS)\",\"volume\":\"22 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-04-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 International Conference on Artificial Intelligence and Mechatronics Systems (AIMS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/AIMS52415.2021.9466049\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 International Conference on Artificial Intelligence and Mechatronics Systems (AIMS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/AIMS52415.2021.9466049","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A Comparative Analysis of Multiple Biasing Techniques for $Q_{biased}$ Softmax Regression Algorithm
Over the past several years, the popularity of robotic workers has surged tremendously. Tasks that were previously considered insurmountable can now be performed by robots efficiently and with ease, mainly due to recent advances in control systems and artificial intelligence. Lately, Reinforcement Learning (RL) has captured the spotlight in the field of robotics. Instead of explicitly specifying the solution to a particular task, RL enables the robot (agent) to explore its environment and, through trial and error, choose the appropriate response. In this paper, a comparative analysis of biasing techniques for the Q-biased softmax regression (QBIASSR) algorithm is presented. In QBIASSR, decision-making for unexplored states depends on the set of previously explored states, which improves the learning process when the robot reaches unexplored states. A bias vector, bias(s), is calculated from the variable values of experienced states and added to the Q-value function for action selection. To maximize the obtained reward, different techniques for calculating bias(s) are adopted. The performance of all the techniques is evaluated and compared for obstacle avoidance with a mobile robot. In the end, we demonstrate that the cumulative reward generated by the technique proposed in this paper is at least twice that of the baseline.
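The mechanism described in the abstract, adding a state-dependent bias vector to the Q-values and then selecting actions with a softmax, can be illustrated with a short sketch. This is a minimal, assumed tabular implementation, not the authors' code: the function names (bias_from_experience, select_action), the averaging rule used to form bias(s), and the temperature parameter tau are illustrative assumptions.

```python
import numpy as np

def bias_from_experience(q_table, experienced_states):
    """One assumed way to form bias(s): average the Q-rows of previously
    explored states, so unexplored states inherit their action preferences."""
    if not experienced_states:
        return np.zeros(q_table.shape[1])
    return q_table[list(experienced_states)].mean(axis=0)

def select_action(state, q_table, experienced_states, tau=1.0, rng=None):
    """Softmax (Boltzmann) action selection over Q(s, a) + bias(s)."""
    rng = rng if rng is not None else np.random.default_rng()
    biased_q = q_table[state] + bias_from_experience(q_table, experienced_states)
    prefs = (biased_q - biased_q.max()) / tau    # subtract max for numerical stability
    probs = np.exp(prefs) / np.exp(prefs).sum()  # softmax probabilities over actions
    return rng.choice(len(probs), p=probs)

# Toy usage: 5 states, 3 actions, one previously experienced state (index 0)
q_table = np.zeros((5, 3))
q_table[0] = [0.2, 0.8, 0.1]
action = select_action(state=3, q_table=q_table, experienced_states=[0])
```

In this sketch, lowering tau makes selection greedier with respect to the biased Q-values, while raising it encourages exploration; the different biasing techniques compared in the paper would correspond to different choices of bias_from_experience.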