arXiv - CS - Multiagent Systems: Latest Papers (Page 4)

Managing multiple agents by automatically adjusting incentives

arXiv - CS - Multiagent Systems Pub Date : 2024-09-03 DOI: arxiv-2409.02960

Shunichi Akatsuka, Yaemi Teramoto, Aaron Courville

Citations: 0

AIvril: AI-Driven RTL Generation With Verification In-The-Loop

arXiv - CS - Multiagent Systems Pub Date : 2024-09-03 DOI: arxiv-2409.11411

Mubashir ul Islam, Humza Sami, Pierre-Emmanuel Gaillardon, Valerio Tenace

Abstract: Large Language Models (LLMs) are computational models capable of performing complex natural language processing tasks. Leveraging these capabilities, LLMs hold the potential to transform the entire hardware design stack, with predictions suggesting that front-end and back-end tasks could be fully automated in the near future. Currently, LLMs show great promise in streamlining Register Transfer Level (RTL) generation, enhancing efficiency, and accelerating innovation. However, their probabilistic nature makes them prone to inaccuracies - a significant drawback in RTL design, where reliability and precision are essential. To address these challenges, this paper introduces AIvril, an advanced framework designed to enhance the accuracy and reliability of RTL-aware LLMs. AIvril employs a multi-agent, LLM-agnostic system for automatic syntax correction and functional verification, significantly reducing - and in many cases, completely eliminating - instances of erroneous code generation. Experimental results conducted on the VerilogEval-Human dataset show that our framework improves code quality by nearly 2x when compared to previous works, while achieving an 88.46% success rate in meeting verification objectives. This represents a critical step toward automating and optimizing hardware design workflows, offering a more dependable methodology for AI-driven RTL design.

Citations: 0

Evolution of Social Norms in LLM Agents using Natural Language

arXiv - CS - Multiagent Systems Pub Date : 2024-09-02 DOI: arxiv-2409.00993

Ilya Horiguchi, Takahide Yoshida, Takashi Ikegami

Citations: 0

Multi-Agent Reinforcement Learning from Human Feedback: Data Coverage and Algorithmic Techniques

arXiv - CS - Multiagent Systems Pub Date : 2024-09-01 DOI: arxiv-2409.00717

Natalia Zhang, Xinqi Wang, Qiwen Cui, Runlong Zhou, Sham M. Kakade, Simon S. Du

Abstract: We initiate the study of Multi-Agent Reinforcement Learning from Human Feedback (MARLHF), exploring both theoretical foundations and empirical validations. We define the task as identifying Nash equilibrium from a preference-only offline dataset in general-sum games, a problem marked by the challenge of sparse feedback signals. Our theory establishes the upper complexity bounds for Nash Equilibrium in effective MARLHF, demonstrating that single-policy coverage is inadequate and highlighting the importance of unilateral dataset coverage. These theoretical insights are verified through comprehensive experiments. To enhance the practical performance, we further introduce two algorithmic techniques. (1) We propose a Mean Squared Error (MSE) regularization along the time axis to achieve a more uniform reward distribution and improve reward learning outcomes. (2) We utilize imitation learning to approximate the reference policy, ensuring stability and effectiveness in training. Our findings underscore the multifaceted approach required for MARLHF, paving the way for effective preference-based multi-agent systems.

Citations: 0

A Learnable Agent Collaboration Network Framework for Personalized Multimodal AI Search Engine

arXiv - CS - Multiagent Systems Pub Date : 2024-09-01 DOI: arxiv-2409.00636

Yunxiao Shi, Min Xu, Haimin Zhang, Xing Zi, Qiang Wu

Abstract: Large language models (LLMs) and retrieval-augmented generation (RAG) techniques have revolutionized traditional information access, enabling AI agents to search and summarize information on behalf of users during dynamic dialogues. Despite their potential, current AI search engines exhibit considerable room for improvement in several critical areas. These areas include the support for multimodal information, the delivery of personalized responses, the capability to logically answer complex questions, and the facilitation of more flexible interactions. This paper proposes a novel AI Search Engine framework called the Agent Collaboration Network (ACN). The ACN framework consists of multiple specialized agents working collaboratively, each with distinct roles such as Account Manager, Solution Strategist, Information Manager, and Content Creator. This framework integrates mechanisms for picture content understanding, user profile tracking, and online evolution, enhancing the AI search engine's response quality, personalization, and interactivity. A highlight of the ACN is the introduction of a Reflective Forward Optimization method (RFO), which supports the online synergistic adjustment among agents. This feature endows the ACN with online learning capabilities, ensuring that the system has strong interactive flexibility and can promptly adapt to user feedback. This learning method may also serve as an optimization approach for agent-based systems, potentially influencing other domains of agent applications.

Citations: 0

Accelerating Hybrid Agent-Based Models and Fuzzy Cognitive Maps: How to Combine Agents who Think Alike?

arXiv - CS - Multiagent Systems Pub Date : 2024-09-01 DOI: arxiv-2409.00824

Philippe J. Giabbanelli, Jack T. Beerman

Abstract: While Agent-Based Models can create detailed artificial societies based on individual differences and local context, they can be computationally intensive. Modelers may offset these costs through a parsimonious use of the model, for example by using smaller population sizes (which limits analyses in sub-populations), running fewer what-if scenarios, or accepting more uncertainty by performing fewer simulations. Alternatively, researchers may accelerate simulations via hardware solutions (e.g., GPU parallelism) or approximation approaches that operate a tradeoff between accuracy and compute time. In this paper, we present an approximation that combines agents who 'think alike', thus reducing the population size and the compute time. Our innovation relies on representing agent behaviors as networks of rules (Fuzzy Cognitive Maps) and empirically evaluating different measures of distance between these networks. Then, we form groups of think-alike agents via community detection and simplify them to a representative agent. Case studies show that our simplifications remain accurate.

Citations: 0
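
For the Fuzzy Cognitive Map entry above, the idea of merging agents whose rule networks are close can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' method: the distance (mean absolute difference of weight matrices), the greedy threshold grouping (swapped in here for the community detection the abstract mentions), and every function name are hypothetical choices made for this sketch.

```python
import math

def fcm_step(state, weights):
    """One synchronous Fuzzy Cognitive Map update with a sigmoid squashing function.

    weights[j][i] is the causal influence of concept j on concept i.
    """
    n = len(state)
    return [
        1.0 / (1.0 + math.exp(-(state[i] + sum(weights[j][i] * state[j] for j in range(n)))))
        for i in range(n)
    ]

def fcm_distance(w_a, w_b):
    """Assumed distance: mean absolute difference between two weight matrices."""
    n = len(w_a)
    return sum(abs(w_a[i][j] - w_b[i][j]) for i in range(n) for j in range(n)) / (n * n)

def group_agents(maps, threshold):
    """Greedy grouping: an agent joins the first group whose founder is within threshold.

    A simple stand-in for community detection, used only to keep the sketch short.
    """
    groups = []
    for idx, w in enumerate(maps):
        for group in groups:
            if fcm_distance(maps[group[0]], w) <= threshold:
                group.append(idx)
                break
        else:
            groups.append([idx])
    return groups

def representative(maps, group):
    """Representative agent: element-wise mean of the group's weight matrices."""
    n = len(maps[0])
    return [
        [sum(maps[k][i][j] for k in group) / len(group) for j in range(n)]
        for i in range(n)
    ]

# Three hypothetical two-concept agents; the first two think alike.
agent_maps = [
    [[0.0, 0.5], [-0.3, 0.0]],
    [[0.0, 0.52], [-0.28, 0.0]],
    [[0.0, -0.9], [0.8, 0.0]],
]
groups = group_agents(agent_maps, threshold=0.05)
merged = representative(agent_maps, groups[0])
```

Simulating one representative per group instead of every agent is what reduces the population size; how much accuracy is retained depends on the distance measure and the grouping threshold.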

Identifying and Clustering Counter Relationships of Team Compositions in PvP Games for Efficient Balance Analysis

arXiv - CS - Multiagent Systems Pub Date : 2024-08-30 DOI: arxiv-2408.17180

Chiu-Chou Lin, Yu-Wei Shih, Kuei-Ting Kuo, Yu-Cheng Chen, Chien-Hua Chen, Wei-Chen Chiu, I-Chen Wu

Abstract: How can balance be quantified in game settings? This question is crucial for game designers, especially in player-versus-player (PvP) games, where analyzing the strength relations among predefined team compositions, such as hero combinations in multiplayer online battle arena (MOBA) games or decks in card games, is essential for enhancing gameplay and achieving balance. We have developed two advanced measures that extend beyond the simplistic win rate to quantify balance in zero-sum competitive scenarios. These measures are derived from win value estimations, which employ strength rating approximations via the Bradley-Terry model and counter relationship approximations via vector quantization, significantly reducing the computational complexity associated with traditional win value estimations. Throughout the learning process of these models, we identify useful categories of compositions and pinpoint their counter relationships, aligning with the experiences of human players without requiring specific game knowledge. Our methodology hinges on a simple technique to enhance codebook utilization in discrete representation with a deterministic vector quantization process for an extremely small state space. Our framework has been validated in popular online games, including Age of Empires II, Hearthstone, Brawl Stars, and League of Legends. The accuracy of the observed strength relations in these games is comparable to traditional pairwise win value predictions, while also offering a more manageable complexity for analysis. Ultimately, our findings contribute to a deeper understanding of PvP game dynamics and present a methodology that significantly improves game balance evaluation and design.

Citations: 0
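
The Bradley-Terry model named in the entry above assigns each composition a positive strength and predicts a win with probability strength_i / (strength_i + strength_j). A minimal sketch of fitting such strengths from a win-count matrix is given below, using the classical minorization-maximization update. The win counts, function names, and iteration count are made up for illustration; the paper's neural rating and vector-quantization components are not reproduced here.

```python
def fit_bradley_terry(wins, n_iter=200):
    """Estimate Bradley-Terry strengths from wins[i][j] = times i beat j.

    Uses the standard MM update: p_i = W_i / sum_j (n_ij / (p_i + p_j)),
    followed by rescaling so the strengths average to 1.
    """
    n = len(wins)
    p = [1.0] * n
    for _ in range(n_iter):
        updated = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            denom = sum(
                (wins[i][j] + wins[j][i]) / (p[i] + p[j])
                for j in range(n)
                if j != i
            )
            updated.append(total_wins / denom if denom > 0 else p[i])
        scale = n / sum(updated)
        p = [x * scale for x in updated]
    return p

def win_probability(p, i, j):
    """Predicted probability that composition i beats composition j."""
    return p[i] / (p[i] + p[j])

# Hypothetical match record for three team compositions (10 games per pair).
match_wins = [
    [0, 7, 9],
    [3, 0, 6],
    [1, 4, 0],
]
strengths = fit_bradley_terry(match_wins)
```

A single scalar strength per composition cannot express cyclic "A counters B counters C counters A" relations, which is presumably why the abstract pairs the rating with a separate counter-relationship term.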

MAPF-GPT: Imitation Learning for Multi-Agent Pathfinding at Scale

arXiv - CS - Multiagent Systems Pub Date : 2024-08-29 DOI: arxiv-2409.00134

Anton Andreychuk, Konstantin Yakovlev, Aleksandr Panov, Alexey Skrynnik

Abstract: Multi-agent pathfinding (MAPF) is a challenging computational problem that typically requires finding collision-free paths for multiple agents in a shared environment. Solving MAPF optimally is NP-hard, yet efficient solutions are critical for numerous applications, including automated warehouses and transportation systems. Recently, learning-based approaches to MAPF have gained attention, particularly those leveraging deep reinforcement learning. Following current trends in machine learning, we have created a foundation model for the MAPF problems called MAPF-GPT. Using imitation learning, we have trained a policy on a set of pre-collected sub-optimal expert trajectories that can generate actions in conditions of partial observability without additional heuristics, reward functions, or communication with other agents. The resulting MAPF-GPT model demonstrates zero-shot learning abilities when solving the MAPF problem instances that were not present in the training dataset. We show that MAPF-GPT notably outperforms the current best-performing learnable-MAPF solvers on a diverse range of problem instances and is efficient in terms of computation (in the inference mode).

Citations: 0
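
To make "collision-free paths" in the MAPF entry above concrete, the sketch below checks a set of timed grid paths for the two conflict types commonly used in the MAPF literature: vertex conflicts (two agents in the same cell at the same time step) and edge conflicts (two agents swapping cells). The function name, the path encoding as lists of (row, column) cells, and the convention that an agent waits at its final cell are assumptions of this sketch and are not taken from MAPF-GPT.

```python
def find_conflicts(paths):
    """Return a list of (kind, agent_a, agent_b, time_step) conflicts.

    paths[k][t] is the cell occupied by agent k at time step t; an agent
    that has finished its path is assumed to wait at its last cell.
    """
    def position(path, t):
        return path[min(t, len(path) - 1)]

    conflicts = []
    horizon = max(len(path) for path in paths)
    for t in range(horizon):
        for a in range(len(paths)):
            for b in range(a + 1, len(paths)):
                if position(paths[a], t) == position(paths[b], t):
                    # Both agents occupy the same cell at time t.
                    conflicts.append(("vertex", a, b, t))
                elif (
                    t + 1 < horizon
                    and position(paths[a], t) == position(paths[b], t + 1)
                    and position(paths[a], t + 1) == position(paths[b], t)
                ):
                    # The agents trade places between t and t + 1.
                    conflicts.append(("edge", a, b, t))
    return conflicts

# Two agents walking toward each other along one corridor meet in the middle.
head_on = find_conflicts([
    [(0, 0), (0, 1), (0, 2)],
    [(0, 2), (0, 1), (0, 0)],
])
# Two adjacent agents that try to swap cells.
swap = find_conflicts([
    [(0, 0), (0, 1)],
    [(0, 1), (0, 0)],
])
# Agents in separate rows never interact.
clear = find_conflicts([
    [(0, 0), (0, 1)],
    [(1, 0), (1, 1)],
])
```

A solution to a MAPF instance is valid only when a check of this kind returns an empty list for all agents' paths.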

Iterative Graph Alignment

arXiv - CS - Multiagent Systems Pub Date : 2024-08-29 DOI: arxiv-2408.16667

Fangyuan Yu, Hardeep Singh Arora, Matt Johnson

Abstract: By compressing diverse narratives, LLMs go beyond memorization, achieving intelligence by capturing generalizable causal relationships. However, they suffer from local 'representation gaps' due to insufficient training data diversity, limiting their real-world utility, especially in tasks requiring strict alignment to rules. Traditional alignment methods relying on heavy human annotations are inefficient and unscalable. Recent self-alignment techniques also fall short, as they often depend on self-selection based prompting and memorization-based learning. To address these issues, we introduce Iterative Graph Alignment (IGA), an annotation-free rule-based alignment algorithm. A teacher model (VLM) employs Iterative Graph Prompting (IGP) to create logical graphs and reference answers. The student model (LLM) identifies local knowledge gaps by attempting to align its responses with these references, collaborating with helper models to generate diverse answers. These aligned responses are then used for iterative supervised fine-tuning (SFT). Our evaluations across five rule-based scenarios demonstrate IGP's effectiveness, with a 73.12% alignment improvement in Claude Sonnet 3.5, and Llama3-8B-Instruct achieving an 86.20% improvement, outperforming Claude Sonnet 3.5 in rule-based alignment.

Citations: 0

Different Facets for Different Experts: A Framework for Streamlining The Integration of Qualitative Insights into ABM Development

arXiv - CS - Multiagent Systems Pub Date : 2024-08-28 DOI: arxiv-2408.15725

Vivek Nallur, Pedram Aghaei, Graham Finlay

Citations: 0