Title: NAVINACT: Combining Navigation and Imitation Learning for Bootstrapping Reinforcement Learning
Authors: Amisha Bhaskar, Zahiruddin Mahammad, Sachin R Jadhav, Pratap Tokekar
DOI: arxiv-2408.04054 (https://doi.org/arxiv-2408.04054)
Published: 2024-08-07, arXiv - CS - Artificial Intelligence
Abstract: Reinforcement Learning (RL) has shown remarkable progress in simulation environments, yet its application to real-world robotic tasks remains limited due to challenges in exploration and generalisation. To address these issues, we introduce NAVINACT, a framework that chooses when the robot should use classical motion planning-based navigation and when it should learn a policy. To further improve the efficiency in exploration, we use imitation data to bootstrap the exploration. NAVINACT dynamically switches between two modes of operation: navigating to a waypoint using classical techniques when away from the objects, and reinforcement learning for fine-grained manipulation control when about to interact with objects. NAVINACT consists of a multi-head architecture composed of ModeNet for mode classification, NavNet for waypoint prediction, and InteractNet for precise manipulation. By combining the strengths of RL and Imitation Learning (IL), NAVINACT improves sample efficiency and mitigates distribution shift, ensuring robust task execution. We evaluate our approach across multiple challenging simulation environments and real-world tasks, demonstrating superior performance in terms of adaptability, efficiency, and generalization compared to existing methods. In both simulated and real-world settings, NAVINACT demonstrates robust performance. In simulations, NAVINACT surpasses baseline methods by 10-15% in training success rates at 30k samples and by 30-40% during evaluation phases. In real-world scenarios, it demonstrates a 30-40% higher success rate on simpler tasks compared to baselines and uniquely succeeds in complex, two-stage manipulation tasks. Datasets and supplementary materials can be found on our website: https://raaslab.org/projects/NAVINACT/.

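The mode-switching idea in the abstract above can be pictured as a small dispatch step: classical navigation while far from the object, a learned policy once close. This is only an illustrative sketch, not the paper's implementation; the function and parameter names (`navinact_step`, `interact_radius`, the callables standing in for ModeNet/NavNet/InteractNet) are all hypothetical, and here the mode decision is approximated by a simple distance threshold rather than a learned classifier.

```python
import numpy as np

def navinact_step(ee_pos, target_pos, predict_waypoint, plan_to, rl_act,
                  interact_radius=0.1):
    """One hypothetical NAVINACT-style control step.

    ee_pos, target_pos: 1-D numpy arrays (end-effector and object positions).
    predict_waypoint:   stand-in for a NavNet-style waypoint predictor.
    plan_to:            stand-in for a classical motion planner.
    rl_act:             stand-in for an InteractNet-style RL policy.
    """
    # Mode selection (stand-in for ModeNet): switch on distance to the object.
    if np.linalg.norm(ee_pos - target_pos) > interact_radius:
        waypoint = predict_waypoint(ee_pos)          # coarse navigation target
        return "navigate", plan_to(ee_pos, waypoint)  # classical control
    return "interact", rl_act(ee_pos)                 # fine-grained RL control
```

Under this sketch, the expensive learned policy is only queried in the small region where contact-rich control matters, which is the sample-efficiency argument the abstract makes.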
Title: Anytime Multi-Agent Path Finding with an Adaptive Delay-Based Heuristic
Authors: Thomy Phan, Benran Zhang, Shao-Hung Chan, Sven Koenig
DOI: arxiv-2408.02960 (https://doi.org/arxiv-2408.02960)
Published: 2024-08-06, arXiv - CS - Artificial Intelligence
Abstract: Anytime multi-agent path finding (MAPF) is a promising approach to scalable path optimization in multi-agent systems. MAPF-LNS, based on Large Neighborhood Search (LNS), is the current state-of-the-art approach, where a fast initial solution is iteratively optimized by destroying and repairing selected paths of the solution. Current MAPF-LNS variants commonly use an adaptive selection mechanism to choose among multiple destroy heuristics. However, to determine promising destroy heuristics, MAPF-LNS requires a considerable amount of exploration time. As common destroy heuristics are non-adaptive, any performance bottleneck caused by these heuristics cannot be overcome via adaptive heuristic selection alone, thus limiting the overall effectiveness of MAPF-LNS in terms of solution cost. In this paper, we propose Adaptive Delay-based Destroy-and-Repair Enhanced with Success-based Self-Learning (ADDRESS) as a single-destroy-heuristic variant of MAPF-LNS. ADDRESS applies restricted Thompson Sampling to the top-K set of the most delayed agents to select a seed agent for adaptive LNS neighborhood generation. We evaluate ADDRESS in multiple maps from the MAPF benchmark set and demonstrate cost improvements of at least 50% in large-scale scenarios with up to a thousand agents, compared with the original MAPF-LNS and other state-of-the-art methods.

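The seed-agent selection described above can be sketched as a Beta-Bernoulli bandit restricted to the K most delayed agents. This is a minimal illustration under assumed details: the class name `TopKThompsonSelector` and the exact prior/update are hypothetical, and the paper's restricted Thompson Sampling formulation may differ.

```python
import random

class TopKThompsonSelector:
    """Sketch of ADDRESS-style seed-agent selection: Thompson Sampling
    (Beta-Bernoulli arms) restricted to the top-K most delayed agents."""

    def __init__(self, num_agents, k=8):
        self.k = k
        self.alpha = [1.0] * num_agents  # Beta prior: 1 + observed successes
        self.beta = [1.0] * num_agents   # Beta prior: 1 + observed failures

    def select(self, delays):
        # Restrict the arm set to the K agents with the largest delay,
        # then draw one posterior sample per arm and pick the best.
        top_k = sorted(range(len(delays)), key=lambda a: -delays[a])[:self.k]
        samples = {a: random.betavariate(self.alpha[a], self.beta[a])
                   for a in top_k}
        return max(samples, key=samples.get)

    def update(self, agent, improved):
        # "Success" = the destroy-and-repair round seeded by this agent
        # reduced the overall solution cost.
        if improved:
            self.alpha[agent] += 1.0
        else:
            self.beta[agent] += 1.0
```

Restricting sampling to the top-K delayed agents keeps the bandit focused on agents whose paths plausibly bottleneck the solution, which is the stated motivation for the delay-based heuristic.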
Title: Developing PUGG for Polish: A Modern Approach to KBQA, MRC, and IR Dataset Construction
Authors: Albert Sawczyn, Katsiaryna Viarenich, Konrad Wojtasik, Aleksandra Domogała, Marcin Oleksy, Maciej Piasecki, Tomasz Kajdanowicz
DOI: arxiv-2408.02337 (https://doi.org/arxiv-2408.02337)
Published: 2024-08-05, arXiv - CS - Artificial Intelligence
Abstract: Advancements in AI and natural language processing have revolutionized machine-human language interactions, with question answering (QA) systems playing a pivotal role. The knowledge base question answering (KBQA) task, utilizing structured knowledge graphs (KG), allows for handling extensive knowledge-intensive questions. However, a significant gap exists in KBQA datasets, especially for low-resource languages. Many existing construction pipelines for these datasets are outdated and inefficient in human labor, and modern assisting tools like Large Language Models (LLM) are not utilized to reduce the workload. To address this, we have designed and implemented a modern, semi-automated approach for creating datasets, encompassing tasks such as KBQA, Machine Reading Comprehension (MRC), and Information Retrieval (IR), tailored explicitly for low-resource environments. We executed this pipeline and introduced the PUGG dataset, the first Polish KBQA dataset, and novel datasets for MRC and IR. Additionally, we provide a comprehensive implementation, insightful findings, detailed statistics, and evaluation of baseline models.

Title: Counterfactual Shapley Values for Explaining Reinforcement Learning
Authors: Yiwei Shi, Qi Zhang, Kevin McAreavey, Weiru Liu
DOI: arxiv-2408.02529 (https://doi.org/arxiv-2408.02529)
Published: 2024-08-05, arXiv - CS - Artificial Intelligence
Abstract: This paper introduces a novel approach, Counterfactual Shapley Values (CSV), which enhances explainability in reinforcement learning (RL) by integrating counterfactual analysis with Shapley Values. The approach aims to quantify and compare the contributions of different state dimensions to various action choices. To more accurately analyze these impacts, we introduce new characteristic value functions, the "Counterfactual Difference Characteristic Value" and the "Average Counterfactual Difference Characteristic Value". These functions help calculate the Shapley values to evaluate the differences in contributions between optimal and non-optimal actions. Experiments across several RL domains, such as GridWorld, FrozenLake, and Taxi, demonstrate the effectiveness of the CSV method. The results show that this method not only improves transparency in complex RL systems but also quantifies the differences across various decisions.

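The Shapley-value machinery underlying the approach above can be shown with the standard exact formula over feature coalitions. This is a generic sketch, not the paper's CSV method: the characteristic function passed in as `value_fn` is a placeholder, whereas CSV defines specific counterfactual-difference characteristic functions over Q-values.

```python
from itertools import combinations
from math import factorial

def shapley_values(num_features, value_fn):
    """Exact Shapley values phi_i for a characteristic function over
    coalitions of features. value_fn takes a frozenset of feature indices;
    in a CSV-style analysis it could be a counterfactual difference in
    action values when only those features take their actual values."""
    n = num_features
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(n):  # coalition sizes 0 .. n-1
            for s in combinations(others, r):
                coalition = frozenset(s)
                # Standard Shapley weight |S|! (n - |S| - 1)! / n!
                w = factorial(r) * factorial(n - r - 1) / factorial(n)
                phi[i] += w * (value_fn(coalition | {i}) - value_fn(coalition))
    return phi
```

For an additive game the Shapley value of each feature equals its individual contribution, which is a useful sanity check; the exact computation is exponential in the number of features, so it is only practical for low-dimensional states like those in GridWorld or Taxi.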
Title: Perfect Information Monte Carlo with Postponing Reasoning
Authors: Jérôme Arjonilla, Abdallah Saffidine, Tristan Cazenave
DOI: arxiv-2408.02380 (https://doi.org/arxiv-2408.02380)
Published: 2024-08-05, arXiv - CS - Artificial Intelligence
Abstract: Imperfect information games, such as Bridge and Skat, present challenges due to state-space explosion and hidden information, posing formidable obstacles for search algorithms. Determinization-based algorithms offer a resolution by sampling hidden information and solving the game in a perfect information setting, facilitating rapid and effective action estimation. However, transitioning to perfect information introduces challenges, notably one called strategy fusion. This research introduces Extended Perfect Information Monte Carlo (EPIMC), an online algorithm inspired by the state-of-the-art determinization-based approach Perfect Information Monte Carlo (PIMC). EPIMC enhances the capabilities of PIMC by postponing the perfect information resolution, alleviating issues related to strategy fusion. However, the decision to postpone the leaf evaluator introduces novel considerations, such as the interplay between prior levels of reasoning and the newly deferred resolution. In our empirical analysis, we investigate the performance of EPIMC across a range of games, with a particular focus on those characterized by varying degrees of strategy fusion. Our results demonstrate notable performance enhancements, particularly in games where strategy fusion significantly impacts gameplay. Furthermore, our research contributes to the theoretical foundation of determinization-based algorithms addressing challenges associated with strategy fusion.

Title: Operationalizing Contextual Integrity in Privacy-Conscious Assistants
Authors: Sahra Ghalebikesabi, Eugene Bagdasaryan, Ren Yi, Itay Yona, Ilia Shumailov, Aneesh Pappu, Chongyang Shi, Laura Weidinger, Robert Stanforth, Leonard Berrada, Pushmeet Kohli, Po-Sen Huang, Borja Balle
DOI: arxiv-2408.02373 (https://doi.org/arxiv-2408.02373)
Published: 2024-08-05, arXiv - CS - Artificial Intelligence
Abstract: Advanced AI assistants combine frontier LLMs and tool access to autonomously perform complex tasks on behalf of users. While the helpfulness of such assistants can increase dramatically with access to user information including emails and documents, this raises privacy concerns about assistants sharing inappropriate information with third parties without user supervision. To steer information-sharing assistants to behave in accordance with privacy expectations, we propose to operationalize contextual integrity (CI), a framework that equates privacy with the appropriate flow of information in a given context. In particular, we design and evaluate a number of strategies to steer assistants' information-sharing actions to be CI-compliant. Our evaluation is based on a novel form-filling benchmark composed of synthetic data and human annotations, and it reveals that prompting frontier LLMs to perform CI-based reasoning yields strong results.

Title: Development of REGAI: Rubric Enabled Generative Artificial Intelligence
Authors: Zach Johnson, Jeremy Straub
DOI: arxiv-2408.02811 (https://doi.org/arxiv-2408.02811)
Published: 2024-08-05, arXiv - CS - Artificial Intelligence
Abstract: This paper presents and evaluates a new retrieval augmented generation (RAG) and large language model (LLM)-based artificial intelligence (AI) technique: rubric enabled generative artificial intelligence (REGAI). REGAI uses rubrics, which can be created manually or automatically by the system, to enhance the performance of LLMs for evaluation purposes. REGAI improves on the performance of both classical LLMs and RAG-based LLM techniques. This paper describes REGAI, presents data regarding its performance, and discusses several possible application areas for the technology.

Title: SR-CIS: Self-Reflective Incremental System with Decoupled Memory and Reasoning
Authors: Biqing Qi, Junqi Gao, Xinquan Chen, Dong Li, Weinan Zhang, Bowen Zhou
DOI: arxiv-2408.01970 (https://doi.org/arxiv-2408.01970)
Published: 2024-08-04, arXiv - CS - Artificial Intelligence
Abstract: The ability of humans to rapidly learn new knowledge while retaining old memories poses a significant challenge for current deep learning models. To handle this challenge, we draw inspiration from human memory and learning mechanisms and propose the Self-Reflective Complementary Incremental System (SR-CIS). Comprising the deconstructed Complementary Inference Module (CIM) and Complementary Memory Module (CMM), SR-CIS features a small model for fast inference and a large model for slow deliberation in CIM, enabled by the Confidence-Aware Online Anomaly Detection (CA-OAD) mechanism for efficient collaboration. CMM consists of a task-specific Short-Term Memory (STM) region and a universal Long-Term Memory (LTM) region. By setting task-specific Low-Rank Adaptation (LoRA) and corresponding prototype weights and biases, it instantiates external storage for parameter and representation memory, thus deconstructing the memory module from the inference module. By storing textual descriptions of images during training and combining them with the Scenario Replay Module (SRM) post-training for memory combination, along with periodic short-to-long-term memory restructuring, SR-CIS achieves stable incremental memory with limited storage requirements. Balancing model plasticity and memory stability under constraints of limited storage and low data resources, SR-CIS surpasses existing competitive baselines on multiple standard and few-shot incremental learning benchmarks.

Title: Visual Grounding for Object-Level Generalization in Reinforcement Learning
Authors: Haobin Jiang, Zongqing Lu
DOI: arxiv-2408.01942 (https://doi.org/arxiv-2408.01942)
Published: 2024-08-04, arXiv - CS - Artificial Intelligence
Abstract: Generalization is a pivotal challenge for agents following natural language instructions. To approach this goal, we leverage a vision-language model (VLM) for visual grounding and transfer its vision-language knowledge into reinforcement learning (RL) for object-centric tasks, which makes the agent capable of zero-shot generalization to unseen objects and instructions. By visual grounding, we obtain an object-grounded confidence map for the target object indicated in the instruction. Based on this map, we introduce two routes to transfer VLM knowledge into RL. Firstly, we propose an object-grounded intrinsic reward function derived from the confidence map to more effectively guide the agent towards the target object. Secondly, the confidence map offers a more unified, accessible task representation for the agent's policy, compared to language embeddings. This enables the agent to process unseen objects and instructions through comprehensible visual confidence maps, facilitating zero-shot object-level generalization. Single-task experiments prove that our intrinsic reward significantly improves performance on challenging skill learning. In multi-task experiments, through testing on tasks beyond the training set, we show that the agent, when provided with the confidence map as the task representation, possesses better generalization capabilities than language-based conditioning. The code is available at https://github.com/PKU-RL/COPL.

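One way to picture the confidence-map-derived intrinsic reward mentioned above is to pay the agent for improving the target-object confidence at its current position. This is a speculative sketch, not the reward from the paper: the function name `grounded_intrinsic_reward`, the improvement-over-best shaping, and the 2-D grid indexing are all assumptions made for illustration.

```python
import numpy as np

def grounded_intrinsic_reward(confidence_map, agent_xy, prev_best=0.0):
    """Hypothetical intrinsic reward from a VLM confidence map.

    confidence_map: 2-D array of target-object confidences in [0, 1].
    agent_xy:       the agent's (row, col) position on that grid.
    prev_best:      best confidence observed so far in the episode.
    Returns (reward, updated prev_best).
    """
    h, w = confidence_map.shape
    r = int(np.clip(agent_xy[0], 0, h - 1))
    c = int(np.clip(agent_xy[1], 0, w - 1))
    conf_here = confidence_map[r, c]
    # Only pay for improvement over the best confidence seen so far, so the
    # cumulative intrinsic return stays bounded by the peak confidence.
    reward = max(0.0, conf_here - prev_best)
    return reward, max(prev_best, conf_here)
```

Because the reward depends only on the confidence map and not on any language embedding, the same shaping would apply unchanged to unseen objects, which mirrors the zero-shot argument in the abstract.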
Title: Integrating Large Language Models and Knowledge Graphs for Extraction and Validation of Textual Test Data
Authors: Antonio De Santis, Marco Balduini, Federico De Santis, Andrea Proia, Arsenio Leo, Marco Brambilla, Emanuele Della Valle
DOI: arxiv-2408.01700 (https://doi.org/arxiv-2408.01700)
Published: 2024-08-03, arXiv - CS - Artificial Intelligence
Abstract: Aerospace manufacturing companies, such as Thales Alenia Space, design, develop, integrate, verify, and validate products characterized by high complexity and low volume. They carefully document all phases for each product, but analyses across products are challenging due to the heterogeneity and unstructured nature of the data in documents. In this paper, we propose a hybrid methodology that leverages Knowledge Graphs (KGs) in conjunction with Large Language Models (LLMs) to extract and validate data contained in these documents. We consider a case study focused on test data related to electronic boards for satellites. To do so, we extend the Semantic Sensor Network ontology. We store the metadata of the reports in a KG, while the actual test results are stored in Parquet files accessible via a Virtual Knowledge Graph. The validation process is managed using an LLM-based approach. We also conduct a benchmarking study to evaluate the performance of state-of-the-art LLMs in executing this task. Finally, we analyze the costs and benefits of automating the preexisting manual data extraction and validation processes for subsequent cross-report analyses.