Sequential multi-agent exploration for a common goal

Web Intell. Agent Syst. | Pub Date: 2013-07-01 | DOI: 10.3233/WIA-130272

Igor Rochlin, David Sarne, G. Zussman


Citations: 4

Abstract

Motivated by applications in Dynamic Spectrum Access Networks, we focus on a system in which a few agents are engaged in a costly individual exploration process where each agent's benefit is determined according to the minimum obtained value. Such an exploration pattern is applicable to many systems, including shipment and travel planning. This paper formally introduces and analyzes a sequential variant of the general model. According to that variant, only a single agent engages in exploration at any given time, and when an agent initiates its exploration, it has complete information about the minimum value obtained by the other agents so far. We prove that the exploration strategy of each agent, according to the equilibrium of the resulting Stackelberg game, is reservation-value based, and show how the reservation values can be calculated. We also analyze the agents' expected-benefit-maximizing exploration strategies when they are fully cooperative, i.e., when they aim to maximize the expected joint benefit. The equilibrium strategies and the expected benefit of each agent are illustrated using a synthetic homogeneous environment, thereby demonstrating the properties of this new exploration scheme and the benefits of cooperation.
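To make the phrase "reservation-value based" concrete, the following is a minimal Python sketch of a generic reservation-value stopping rule from classical search theory. It is not the equilibrium computation derived in the paper. Everything in it is an illustrative assumption: values are drawn from Uniform(0, 1), each draw has a fixed per-agent cost, the reservation value r solves cost = E[max(X - r, 0)], and an agent stops early once it matches the minimum reported so far (since the shared benefit is the minimum). The function names are hypothetical.

```python
import math
import random


def reservation_value_uniform(cost, tol=1e-10):
    """Solve cost = E[max(X - r, 0)] for X ~ Uniform(0, 1) by bisection.

    For the uniform case the expectation equals (1 - r)**2 / 2.
    This is the textbook single-searcher rule, assumed here for illustration.
    """
    if not 0 < cost < 0.5:
        raise ValueError("cost must lie in (0, 0.5) for a root in (0, 1)")
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        expected_gain = (1 - mid) ** 2 / 2
        if expected_gain > cost:
            lo = mid  # one more draw still pays off at mid, so r is higher
        else:
            hi = mid
    return (lo + hi) / 2


def explore(threshold, rng, max_draws=10_000):
    """Draw Uniform(0, 1) values until the best one reaches the threshold."""
    best, draws = 0.0, 0
    while draws < max_draws:
        best = max(best, rng.random())
        draws += 1
        if best >= threshold:
            break
    return best, draws


def sequential_run(costs, seed=0):
    """Agents explore one at a time, each seeing the minimum obtained so far.

    Assumption: an agent never aims above the current minimum, because the
    common benefit is capped by it.
    """
    rng = random.Random(seed)
    running_min = math.inf
    total_draws = 0
    for cost in costs:
        threshold = min(reservation_value_uniform(cost), running_min)
        best, draws = explore(threshold, rng)
        running_min = min(running_min, best)
        total_draws += draws
    return running_min, total_draws


if __name__ == "__main__":
    final_min, n_draws = sequential_run([0.08, 0.02, 0.005])
    print(f"minimum obtained value: {final_min:.3f}, total draws: {n_draws}")
```

With heterogeneous costs, a later agent with a low cost (and hence a high reservation value) stops as soon as it matches the minimum left by an earlier agent, which is the kind of information use the sequential variant permits. The reservation values in the paper's Stackelberg equilibrium are computed differently and would replace `reservation_value_uniform` here.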

