Reinforcement learning methods for network-based transfer parameter selection

Intelligence & Robotics. Pub Date: 2023-08-31. DOI: 10.20517/ir.2023.23

Yue (Sophie) Guo, Yu Wang, I-Hsuan Yang, K. Sycara


Citations: 1

Abstract

A significant challenge in self-driving technology is the domain-specific training of models that predict the intentions of surrounding vehicles. Training a separate model for each domain requires substantial human resources, time, and equipment for data collection and training. For instance, a prediction model developed with data from China is difficult to apply directly to the United States market because of complex factors such as differing driving behaviors and traffic rules. Transfer learning offers a possible solution by enabling the reuse of models and data to improve prediction efficiency across international markets. However, many transfer learning methods require a comparison between the source and target data domains to determine what can be transferred, a process that is often legally restricted. A specialized area of transfer learning, known as network-based transfer, could avoid this problem. This approach involves pre-training and fine-tuning "student" models using parameters selected from a "teacher" model. However, because networks typically have a large number of parameters, the question arises of how to select parameters efficiently to optimize transfer learning. This paper develops an automatic parameter selector based on reinforcement learning, named "Automatic Transfer Selector via Reinforcement Learning". Compared with manual methods, this technique makes parameter selection more efficient for transfer prediction between international self-driving markets. Technicians are relieved of the labor-intensive task of testing each parameter combination and of enduring lengthy training periods to evaluate the impact of prediction transfer. Experiments were conducted using a temporal convolutional neural network fully trained with data from the Chinese market and one month of US data, focusing on improving the training efficiency for specific driving scenarios in the US. The results show that the proposed approach significantly improves the prediction transfer process.
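The abstract does not give the selector's design, so the following is only a minimal sketch of the general idea of choosing which teacher parameters to transfer with reinforcement learning, not the authors' method. It assumes a hypothetical four-layer network, a policy that keeps one independent transfer probability per layer, and a made-up `toy_reward` function standing in for "fine-tune the student briefly and measure its validation score". The update rule is a plain REINFORCE step with a running-average baseline; all names and numbers are illustrative.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def toy_reward(mask):
    # Hypothetical stand-in for the student's validation score after a short
    # fine-tuning run. Assumption: layers 0 and 1 transfer well, layer 2 is
    # neutral, and layer 3 hurts the target domain.
    gains = [0.3, 0.2, 0.0, -0.25]
    return 0.5 + sum(g for g, m in zip(gains, mask) if m)


def train_selector(reward_fn, n_layers, episodes=2000, lr=0.1, seed=0):
    """Learn per-layer transfer probabilities with REINFORCE."""
    rng = random.Random(seed)
    logits = [0.0] * n_layers  # start at probability 0.5 for every layer
    baseline = 0.0             # running average of rewards, reduces variance
    for _ in range(episodes):
        probs = [sigmoid(z) for z in logits]
        # Sample a binary mask: 1 = copy this layer from the teacher.
        mask = [1 if rng.random() < p else 0 for p in probs]
        reward = reward_fn(mask)
        advantage = reward - baseline
        # For a Bernoulli policy, d log pi / d logit = mask - prob.
        for i in range(n_layers):
            logits[i] += lr * advantage * (mask[i] - probs[i])
        baseline += 0.05 * (reward - baseline)
    return [sigmoid(z) for z in logits]


probs = train_selector(toy_reward, n_layers=4)
selected = [i for i, p in enumerate(probs) if p > 0.5]
print("transfer probabilities:", [round(p, 2) for p in probs])
print("layers selected for transfer:", selected)
```

In a real setting the reward would come from an actual fine-tuning run on target-domain data, which is far more expensive than this toy function, so the number of episodes and the reward design would need to be chosen with that cost in mind.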

