ACM Transactions on Autonomous and Adaptive Systems (TAAS): Latest Articles, Page 3

Prosocial Norm Emergence in Multi-agent Systems

ACM Transactions on Autonomous and Adaptive Systems (TAAS) Pub Date : 2020-12-29 DOI: 10.1145/3540202

M. Mashayekhi, Nirav Ajmeri, G. List, Munindar P. Singh

Abstract: Multi-agent systems provide a basis for developing systems of autonomous entities and thus find application in a variety of domains. We consider a setting where not only the member agents are adaptive but also the multi-agent system viewed as an entity in its own right is adaptive. Specifically, the social structure of a multi-agent system can be reflected in the social norms among its members. It is well recognized that the norms that arise in society are not always beneficial to its members. We focus on prosocial norms, which help achieve positive outcomes for society and often provide guidance to agents to act in a manner that takes into account the welfare of others. Specifically, we propose Cha, a framework for the emergence of prosocial norms. Unlike previous norm emergence approaches, Cha supports continual change to a system (agents may enter and leave) and dynamism (norms may change when the environment changes). Importantly, Cha agents incorporate prosocial decision-making based on inequity aversion theory, reflecting an intuition of guilt arising from being antisocial. In this manner, Cha brings together two important themes in prosociality: decision-making by individuals and fairness of system-level outcomes. We demonstrate via simulation that Cha can improve aggregate societal gains and fairness of outcomes.
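
The abstract above says that Cha agents base prosocial decisions on inequity aversion theory. As a rough illustration only, and not Cha's actual decision model, the sketch below uses the widely cited Fehr-Schmidt form of inequity aversion: an agent's utility is its own payoff minus a penalty for being worse off than others (weight `alpha`) and a penalty for being better off than others (weight `beta`, which loosely corresponds to the "guilt" intuition). The function name and the default weights are assumptions chosen for the example.

```python
from typing import Sequence


def inequity_averse_utility(
    payoffs: Sequence[float], i: int, alpha: float = 0.5, beta: float = 0.25
) -> float:
    """Fehr-Schmidt style utility of agent i, given every agent's payoff.

    alpha weights disadvantageous inequity (others earn more than i);
    beta weights advantageous inequity (i earns more than others).
    Both defaults are illustrative assumptions, not values from the paper.
    """
    n = len(payoffs)
    own = float(payoffs[i])
    if n < 2:
        return own  # no one to compare against
    others = [p for j, p in enumerate(payoffs) if j != i]
    envy = sum(max(p - own, 0.0) for p in others) / (n - 1)
    guilt = sum(max(own - p, 0.0) for p in others) / (n - 1)
    return own - alpha * envy - beta * guilt
```

With payoffs `[10, 2]`, the better-off agent's utility drops below its raw payoff because of the guilt term, which is the kind of pressure toward fairer outcomes the abstract alludes to.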

Citations: 1

A Q-values Sharing Framework for Multi-agent Reinforcement Learning under Budget Constraint

ACM Transactions on Autonomous and Adaptive Systems (TAAS) Pub Date : 2020-11-29 DOI: 10.1145/3447268

Changxi Zhu, Ho-fung Leung, Shuyue Hu, Yi Cai

Abstract: In a teacher-student framework, a more experienced agent (teacher) helps accelerate the learning of another agent (student) by suggesting actions to take in certain states. In cooperative multi-agent reinforcement learning (MARL), where agents must cooperate with one another, a student could fail to cooperate effectively with others even by following a teacher’s suggested actions, as the policies of all agents can change before convergence. When the number of times that agents communicate with one another is limited (i.e., there are budget constraints), an advising strategy that uses actions as advice could be less effective. We propose a partaker-sharer advising framework (PSAF) for cooperative MARL agents learning with budget constraints. In PSAF, each Q-learner can decide when to ask for and share its Q-values. We perform experiments in three typical multi-agent learning problems. The evaluation results indicate that the proposed PSAF approach outperforms existing advising methods under both constrained and unconstrained budgets. Moreover, we analyse the influence of advising actions and sharing Q-values on agent learning.

Citations: 5

HAMLET: A Hierarchical Agent-based Machine Learning Platform

ACM Transactions on Autonomous and Adaptive Systems (TAAS) Pub Date : 2020-10-10 DOI: 10.1145/3530191

Ahmad Esmaeili, J. Gallagher, John A. Springer, E. Matson

Abstract: Hierarchical Multi-agent Systems provide convenient and relevant ways to analyze, model, and simulate complex systems composed of a large number of entities that interact at different levels of abstraction. In this article, we introduce HAMLET (Hierarchical Agent-based Machine LEarning plaTform), a hybrid machine learning platform based on hierarchical multi-agent systems, to facilitate the research and democratization of geographically and/or locally distributed machine learning entities. The proposed system models machine learning solutions as a hypergraph and autonomously sets up a multi-level structure of heterogeneous agents based on their innate capabilities and learned skills. HAMLET aids the design and management of machine learning systems and provides analytical capabilities for research communities to assess the existing and/or new algorithms/datasets through flexible and customizable queries. The proposed hybrid machine learning platform does not assume restrictions on the type of learning algorithms/datasets and is theoretically proven to be sound and complete with polynomial computational requirements. Additionally, it is examined empirically on 120 training and 4 generalized batch testing tasks performed on 24 machine learning algorithms and 9 standard datasets. The provided experimental results not only establish confidence in the platform’s consistency and correctness but also demonstrate its testing and analytical capacity.

Citations: 3

Applying Machine Learning in Self-adaptive Systems

ACM Transactions on Autonomous and Adaptive Systems (TAAS) Pub Date : 2020-09-30 DOI: 10.1145/3469440

Omid Gheibi, Danny Weyns, Federico Quin

Abstract: Recently, we have been witnessing a rapid increase in the use of machine learning techniques in self-adaptive systems. Machine learning has been used for a variety of reasons, ranging from learning a model of the environment of a system during operation to filtering large sets of possible configurations before analyzing them. While a body of work on the use of machine learning in self-adaptive systems exists, there is currently no systematic overview of this area. Such an overview is important for researchers to understand the state of the art and direct future research efforts. This article reports the results of a systematic literature review that aims at providing such an overview. We focus on self-adaptive systems that are based on a traditional Monitor-Analyze-Plan-Execute (MAPE)-based feedback loop. The research questions are centered on the problems that motivate the use of machine learning in self-adaptive systems, the key engineering aspects of learning in self-adaptation, and open challenges in this area. The search resulted in 6,709 papers, of which 109 were retained for data collection. Analysis of the collected data shows that machine learning is mostly used for updating adaptation rules and policies to improve system qualities, and managing resources to better balance qualities and resources. These problems are primarily solved using supervised and interactive learning with classification, regression, and reinforcement learning as the dominant methods. Surprisingly, unsupervised learning that naturally fits automation is only applied in a small number of studies. Key open challenges in this area include the performance of learning, managing the effects of learning, and dealing with more complex types of goals. From the insights derived from this systematic literature review, we outline an initial design process for applying machine learning in self-adaptive systems that are based on MAPE feedback loops.
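
For readers unfamiliar with the Monitor-Analyze-Plan-Execute (MAPE) feedback loop named in the abstract above, the following is a minimal sketch of one loop iteration. Everything concrete in it is an assumption made for illustration: the `ManagedSystem` server pool, the load thresholds, and the one-server-at-a-time plan are hypothetical and are not taken from the surveyed papers.

```python
from dataclasses import dataclass


@dataclass
class ManagedSystem:
    """Hypothetical managed system: a server pool handling some load."""
    servers: int
    load: float  # total requests per second


def monitor(system: ManagedSystem) -> float:
    """Monitor: observe the load carried by each server."""
    return system.load / system.servers


def analyze(load_per_server: float, low: float = 20.0, high: float = 80.0) -> str:
    """Analyze: classify the observation against assumed fixed thresholds."""
    if load_per_server > high:
        return "overloaded"
    if load_per_server < low:
        return "underloaded"
    return "ok"


def plan(symptom: str, system: ManagedSystem) -> int:
    """Plan: pick a change in server count (0 means no adaptation)."""
    if symptom == "overloaded":
        return 1
    if symptom == "underloaded" and system.servers > 1:
        return -1
    return 0


def execute(system: ManagedSystem, delta: int) -> None:
    """Execute: apply the planned change to the managed system."""
    system.servers += delta


def mape_step(system: ManagedSystem) -> int:
    """Run one Monitor-Analyze-Plan-Execute iteration; return the change made."""
    delta = plan(analyze(monitor(system)), system)
    execute(system, delta)
    return delta
```

In a learning-enabled variant, a trained model could stand in for the hand-written thresholds in `analyze` or the rule in `plan`, which is one reading of the abstract's "updating adaptation rules and policies".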

Citations: 41

Finding the Largest Successful Coalition under the Strict Goal Preferences of Agents

ACM Transactions on Autonomous and Adaptive Systems (TAAS) Pub Date : 2020-09-13 DOI: 10.1145/3412370

Zhaopin Su, Guofu Zhang, Feng Yue, Jindong He, M. Li, Bin Li, X. Yao

Abstract: Coalition formation has been a fundamental form of resource cooperation for achieving joint goals in multiagent systems. Most existing studies still focus on the traditional assumption that an agent has to contribute its resources to all the goals, even if the agent is not interested in the goal at all. In this article, a natural extension of the traditional coalitional resource games (CRGs) is studied from both theoretical and empirical perspectives, in which each agent has uncompromising, personalized preferences over goals. Specifically, a new CRGs model with agents’ strict preferences for goals is presented, in which an agent is willing to contribute its resources only to the goals that are in its own interest set. The computational complexity of the basic decision problems surrounding the successful coalition is reinvestigated. The results suggest that these problems in such a strict preference way are complex and intractable. To find the largest successful coalition for possible computation reduction or potential parallel processing, a flow-network–based exhaust algorithm, called FNetEA, is proposed to achieve the optimal solution. Then, to solve the problem more efficiently, a hybrid algorithm, named 2D-HA, is developed to find the approximately optimal solution on the basis of genetic algorithm, two-dimensional (2D) solution representation, and a heuristic for solution repairs. Through extensive experiments, the 2D-HA algorithm exhibits the prominent ability to provide reassurances that the optimal solution could be found within a reasonable period of time, even in a super-large-scale space.

Citations: 1

Human Feedback as Action Assignment in Interactive Reinforcement Learning

ACM Transactions on Autonomous and Adaptive Systems (TAAS) Pub Date : 2020-08-04 DOI: 10.1145/3404197

S. Raza, Mary-Anne Williams

Abstract: Teaching by demonstrations and teaching by assigning rewards are two popular methods of knowledge transfer in humans. However, showing the right behaviour (by demonstration) may appear more natural to a human teacher than assessing the learner’s performance and assigning a reward or punishment to it. In the context of robot learning, the preference between these two approaches has not been studied extensively. In this article, we propose a method that replaces the traditional method of reward assignment with action assignment (which is similar to providing a demonstration) in interactive reinforcement learning. The main purpose of the suggested action is to compute a reward by seeing if the suggested action was followed by the self-acting agent or not. We compared action assignment with reward assignment via a user study conducted over the web using a two-dimensional maze game. The logs of interactions showed that action assignment significantly improved users’ ability to teach the right behaviour. The survey results showed that both action and reward assignment seemed highly natural and usable, reward assignment required more mental effort, repeatedly assigning rewards and seeing the agent disobey commands caused frustration in users, and many users desired to control the agent’s behaviour directly.
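
The abstract above describes computing a reward by checking whether the self-acting agent followed the human's suggested action. A minimal sketch of that idea, paired with a standard tabular Q-learning update, might look like the following. The reward magnitudes (+1 and -1), the learning rate, the discount factor, and both function names are assumptions for illustration; the abstract does not specify them, and this is not presented as the authors' implementation.

```python
from collections import defaultdict
from typing import Hashable, Sequence


def action_assignment_reward(
    agent_action: Hashable,
    suggested_action: Hashable,
    match: float = 1.0,
    mismatch: float = -1.0,
) -> float:
    """Turn a human's suggested action into a scalar reward.

    The agent is rewarded if it did what the human suggested and
    penalized otherwise (the magnitudes are assumed values).
    """
    return match if agent_action == suggested_action else mismatch


def q_update(q, state, action, reward, next_state,
             actions: Sequence[Hashable], lr: float = 0.5, gamma: float = 0.9) -> None:
    """Standard tabular Q-learning update, applied in place to q."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += lr * (reward + gamma * best_next - q[(state, action)])


# Usage: the agent moved "up" while the human had suggested "up".
q = defaultdict(float)
actions = ["up", "down"]
r = action_assignment_reward("up", "up")
q_update(q, "s0", "up", r, "s1", actions)
```

The human never has to judge how good a move was; they only say which move they would have made, and the comparison produces the learning signal.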

Citations: 7

UAVs vs. Pirates

ACM Transactions on Autonomous and Adaptive Systems (TAAS) Pub Date : 2020-08-04 DOI: 10.1145/3380782

Ruiwen Zhang, T. Holvoet, Bifeng Song, Y. Pei

Abstract: For the rising hazard of pirate attacks, unmanned aerial vehicle (UAV) swarm monitoring is a promising countermeasure. Previous monitoring methods have deficiencies in either adaptivity to dynamic events or simple but effective path coordination mechanisms, and they are inapplicable to the large-area, low-target-density, and long-duration persistent counter-piracy monitoring. This article proposes a self-organized UAV swarm counter-piracy monitoring method. Based on the pheromone map, this method is characterized by (1) a reservation mechanism for anticipatory path coordination and (2) a ship-adaptive mechanism for adapting to merchant ship distributions. A heuristic depth-first branch and bound search algorithm is designed for solving individual path planning. Simulation experiments are conducted to study the optimal number of plan steps and adaptivity scaling factor for different numbers of UAVs. Results show that merely decreasing revisit intervals cannot effectively reduce pirate attacks. Without the ship-adaptive mechanism, the proposed method reduces up to 87.2%, 43.2%, and 5.5% of revisit intervals compared to the Lévy Walk method, the sweep method, and the baseline self-organized method, respectively, but cannot reduce pirate attacks; while with the ship-adaptive mechanism, the proposed method can reduce pirate attacks by up to 6.7% compared to the best of the baseline methods.

Citations: 1

A Bike-sharing Optimization Framework Combining Dynamic Rebalancing and User Incentives

ACM Transactions on Autonomous and Adaptive Systems (TAAS) Pub Date : 2020-02-25 DOI: 10.1145/3376923

Federico Chiariotti, Chiara Pielli, A. Zanella, M. Zorzi

Citations: 28

Improving Scalability and Reward of Utility-Driven Self-Healing for Large Dynamic Architectures

ACM Transactions on Autonomous and Adaptive Systems (TAAS) Pub Date : 2020-02-25 DOI: 10.1145/3380965

Sona Ghahremani, H. Giese, T. Vogel

Abstract: Self-adaptation can be realized in various ways. Rule-based approaches prescribe the adaptation to be executed if the system or environment satisfies certain conditions. They result in scalable solutions but often with merely satisfying adaptation decisions. In contrast, utility-driven approaches determine optimal decisions by using an often costly optimization, which typically does not scale for large problems. We propose a rule-based and utility-driven adaptation scheme that achieves the benefits of both directions such that the adaptation decisions are optimal, whereas the computation scales by avoiding an expensive optimization. We use this adaptation scheme for architecture-based self-healing of large software systems. For this purpose, we define the utility for large dynamic architectures of such systems based on patterns that define issues the self-healing must address. Moreover, we use pattern-based adaptation rules to resolve these issues. Using a pattern-based scheme to define the utility and adaptation rules allows us to compute the impact of each rule application on the overall utility and to realize an incremental and efficient utility-driven self-healing. In addition to formally analyzing the computational effort and optimality of the proposed scheme, we thoroughly demonstrate its scalability and optimality in terms of reward in comparative experiments with a static rule-based approach as a baseline and a utility-driven approach using a constraint solver. These experiments are based on different failure profiles derived from real-world failure logs. We also investigate the impact of different failure profile characteristics on the scalability and reward to evaluate the robustness of the different approaches.

Citations: 8

Argumentation-Based Reasoning about Plans, Maintenance Goals, and Norms

ACM Transactions on Autonomous and Adaptive Systems (TAAS) Pub Date : 2020-02-10 DOI: 10.1145/3364220

Z. Shams, M. B. Vos, Nir Oren, J. Padget

Citations: 11