Artif. Intell.: Latest Articles

Entropy Estimation via Uniformization
Ziqiao Ao, Jinglai Li
Artif. Intell. Pub Date: 2023-04-19 | DOI: 10.48550/arXiv.2304.09700 | Pages: 103954
Abstract: Entropy estimation is of practical importance in information theory and statistical science. Many existing entropy estimators suffer from estimation bias that grows rapidly with dimensionality, rendering them unsuitable for high-dimensional problems. In this work we propose a transform-based method for high-dimensional entropy estimation, which consists of two main ingredients. First, by modifying a k-NN based entropy estimator, we propose a new estimator that enjoys small estimation bias for samples close to a uniform distribution. Second, we design a normalizing-flow based mapping that pushes samples toward a uniform distribution, and we derive the relation between the entropy of the original samples and that of the transformed ones. The entropy of a given set of samples is thus estimated by first transforming them toward a uniform distribution and then applying the proposed estimator to the transformed samples. The performance of the proposed method is compared against several existing entropy estimators on both mathematical examples and real-world applications.
Citations: 0
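
The first ingredient above modifies a k-NN entropy estimator. For orientation, here is a minimal sketch of the classical Kozachenko-Leonenko baseline that such estimators build on; this is not the paper's modified, uniformization-based version, and the function name and defaults are ours.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma, gammaln

def knn_entropy(samples, k=3):
    """Classical Kozachenko-Leonenko k-NN entropy estimate (in nats).

    samples: (n, d) array of i.i.d. draws from the unknown density.
    Duplicate points would give zero k-NN distances; assume none here.
    """
    n, d = samples.shape
    tree = cKDTree(samples)
    # k + 1 because each point's nearest neighbour is itself (distance 0).
    dist, _ = tree.query(samples, k=k + 1)
    eps = dist[:, -1]                      # distance to the k-th neighbour
    # Log-volume of the unit d-ball under the Euclidean norm.
    log_vd = (d / 2) * np.log(np.pi) - gammaln(d / 2 + 1)
    return float(-digamma(k) + digamma(n) + log_vd + d * np.mean(np.log(eps)))

# Sanity check against the analytic entropy of a standard 2-D Gaussian.
rng = np.random.default_rng(0)
x = rng.standard_normal((5000, 2))
print(knn_entropy(x))            # estimate, roughly 2.84
print(np.log(2 * np.pi * np.e))  # exact: (d/2) * log(2*pi*e) = 2.8379 for d=2
```

The abstract's motivation is visible even in this baseline: its bias grows with dimension d, which is what transforming samples toward uniformity is designed to mitigate.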
Task-Guided IRL in POMDPs that Scales
Franck Djeumou, Christian Ellis, Murat Cubuktepe, Craig T. Lennon, U. Topcu
Artif. Intell. Pub Date: 2022-12-30 | DOI: 10.48550/arXiv.2301.01219 | Pages: 103856
Abstract: In inverse reinforcement learning (IRL), a learning agent infers a reward function encoding the underlying task using demonstrations from experts. However, many existing IRL techniques make the often unrealistic assumption that the agent has access to full information about the environment. We remove this assumption by developing an algorithm for IRL in partially observable Markov decision processes (POMDPs). We address two limitations of existing IRL techniques. First, they require an excessive amount of data due to the information asymmetry between the expert and the learner. Second, most of these techniques require solving the computationally intractable forward problem -- computing an optimal policy given a reward function -- in POMDPs. The developed algorithm reduces the information asymmetry and increases data efficiency by incorporating task specifications expressed in temporal logic into IRL; such specifications may be interpreted as side information available to the learner a priori, in addition to the demonstrations. Further, the algorithm avoids a common source of algorithmic complexity by building on causal entropy, rather than entropy, as the measure of the likelihood of the demonstrations. Nevertheless, the resulting problem is nonconvex due to the forward problem. We solve this intrinsic nonconvexity in a scalable manner through a sequential linear programming scheme that is guaranteed to converge to a locally optimal policy. In a series of examples, including experiments in a high-fidelity Unity simulator, we demonstrate that even with a limited amount of data and POMDPs with tens of thousands of states, our algorithm learns reward functions and policies that satisfy the task while inducing behavior similar to the expert's by leveraging the provided side information.
Citations: 0
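
The key computational device here is a sequential linear programming (SLP) scheme for the nonconvex forward problem. The following is a generic SLP skeleton, assuming only a smooth objective with box constraints, to illustrate the shape of such a scheme; the paper's version operates on POMDP policies and carries convergence guarantees not reproduced here, and all names below are ours.

```python
import numpy as np
from scipy.optimize import linprog

def sequential_lp(f, grad_f, x0, lb, ub, trust=0.5, iters=50, tol=1e-6):
    """Generic SLP loop: linearize a nonconvex objective around the
    current iterate and solve the resulting LP inside a box trust
    region, shrinking the region when a step fails to improve.
    x0 must lie within [lb, ub].
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        g = grad_f(x)
        # Decision variable is the step d: minimize the first-order
        # model g @ d subject to variable bounds and the trust region.
        step_bounds = [(max(lb[i] - x[i], -trust), min(ub[i] - x[i], trust))
                       for i in range(x.size)]
        d = linprog(c=g, bounds=step_bounds, method="highs").x
        if f(x + d) < f(x):
            x = x + d                 # accept improving steps
        else:
            trust *= 0.5              # otherwise shrink the trust region
        if np.linalg.norm(d) < tol or trust < tol:
            break
    return x

# Toy usage: a nonconvex 1-D objective on [-2, 2].
f = lambda x: np.sin(3 * x[0]) + 0.1 * x[0] ** 2
grad_f = lambda x: np.array([3 * np.cos(3 * x[0]) + 0.2 * x[0]])
print(sequential_lp(f, grad_f, x0=[1.0], lb=[-2.0], ub=[2.0]))
```

Each iteration solves only an LP, which is what makes this family of schemes scale to large problems such as the tens-of-thousands-of-states POMDPs the abstract reports.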
Defense coordination in security games: Equilibrium analysis and mechanism design
Jiarui Gan, E. Elkind, Sarit Kraus, M. Wooldridge
Artif. Intell. Pub Date: 2022-09-01 | DOI: 10.1016/j.artint.2022.103791 | Pages: 103791
Citations: 1
Measuring power in coalitional games with friends, enemies and allies
Oskar Skibski, Takamasa Suzuki, Tomasz Grabowski, Y. Sakurai, Tomasz P. Michalak, M. Yokoo
Artif. Intell. Pub Date: 2022-09-01 | DOI: 10.1016/j.artint.2022.103792 | Pages: 103792
Citations: 4
Reasoning about general preference relations
Davide Grossi, W. van der Hoek, Louwe B. Kuijer
Artif. Intell. Pub Date: 2022-09-01 | DOI: 10.1016/j.artint.2022.103793 | Pages: 103793
Citations: 1
Discovering Agents
Z. Kenton, Ramana Kumar, Sebastian Farquhar, Jonathan G. Richens, Matt MacDermott, Tom Everitt
Artif. Intell. Pub Date: 2022-08-17 | DOI: 10.48550/arXiv.2208.08345 | Pages: 103963
Abstract: Causal models of agents have been used to analyse the safety aspects of machine learning systems. But identifying agents is non-trivial -- often the causal model is simply assumed by the modeller without much justification -- and modelling failures can lead to mistakes in the safety analysis. This paper proposes the first formal causal definition of agents: roughly, agents are systems that would adapt their policy if their actions influenced the world in a different way. From this we derive the first causal discovery algorithm for discovering agents from empirical data, and give algorithms for translating between causal models and game-theoretic influence diagrams. We demonstrate our approach by resolving some previous confusions caused by incorrect causal modelling of agents.
Citations: 11
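
The definition quoted in the abstract is behavioural: intervene on how actions influence the world and see whether the policy adapts. The toy below is our own loose operationalization of that informal criterion, not the paper's causal-discovery algorithm over mechanised causal graphs; all names are hypothetical.

```python
def best_response(mechanism, actions):
    """An optimizing system: pick the action whose outcome utility,
    as determined by the current mechanism, is highest."""
    return max(actions, key=mechanism)

def adapts(policy_fn, mech, altered_mech, actions):
    """Does the system's policy change when we intervene on how its
    actions influence the world?"""
    return policy_fn(mech, actions) != policy_fn(altered_mech, actions)

actions = ["heat", "cool"]
mech = {"heat": 1.0, "cool": 0.0}.__getitem__       # heating pays off
altered = {"heat": 0.0, "cool": 1.0}.__getitem__    # intervened mechanism

print(adapts(best_response, mech, altered, actions))  # True: agent-like
fixed_rule = lambda m, acts: "heat"                   # hard-coded, ignores m
print(adapts(fixed_rule, mech, altered, actions))     # False: not agent-like
```

The contrast is the point: a best-responding system changes its policy under the intervention, while a hard-coded rule (a thermostat-like map) does not, and so would not count as an agent under this criterion.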
Simplified Risk-aware Decision Making with Belief-dependent Rewards in Partially Observable Domains
A. Zhitnikov, V. Indelman
Artif. Intell. Pub Date: 2022-08-01 | DOI: 10.1016/j.artint.2022.103775 | Pages: 103775
Citations: 6
Q-Learning-based model predictive variable impedance control for physical human-robot collaboration
L. Roveda, Andrea Testa, Asad Ali Shahid, F. Braghin, D. Piga
Artif. Intell. Pub Date: 2022-08-01 | DOI: 10.1016/j.artint.2022.103771 | Pages: 103771
Citations: 13
Safe, Learning-Based MPC for Highway Driving under Lane-Change Uncertainty: A Distributionally Robust Approach
Mathijs Schuurmans, Alexander Katriniok, Chris Meissen, H. E. Tseng, Panagiotis Patrinos
Artif. Intell. Pub Date: 2022-06-27 | DOI: 10.48550/arXiv.2206.13319 | Pages: 103920
Abstract: We present a case study applying learning-based distributionally robust model predictive control to highway motion planning under stochastic uncertainty in the lane-change behavior of surrounding road users. The dynamics of road users are modelled using Markov jump systems, in which the switching variable describes the desired lane of the vehicle under consideration and the continuous state describes the vehicles' pose and velocity. We assume the switching probabilities of the underlying Markov chain to be unknown. As the vehicle is observed, and thus samples from the Markov chain are drawn, the transition probabilities are estimated along with an ambiguity set that accounts for misestimation of these probabilities. Correspondingly, a distributionally robust optimal control problem is formulated over a scenario tree and solved in receding horizon. The result is a motion planning procedure that, through observation of the target vehicle, gradually becomes less conservative while avoiding overconfidence in estimates obtained from small sample sizes. We present an extensive numerical case study comparing the effects of several design aspects on controller performance and safety.
Citations: 7
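
The estimate-plus-ambiguity-set idea from the abstract can be sketched compactly: count observed mode transitions, form the empirical transition matrix, and attach a radius that shrinks with sample count. The sketch below assumes a standard Weissman-style L1 concentration bound; the paper's exact ambiguity-set construction may differ, and all names are ours.

```python
import numpy as np

def estimate_with_ambiguity(transitions, n_modes, conf=0.95):
    """Estimate the switching probabilities of a Markov chain from
    observed (mode, next_mode) pairs, and attach a per-mode L1
    ambiguity radius that shrinks as more transitions are seen.
    The radius uses a Weissman-style concentration bound:
      ||p - p_hat||_1 <= sqrt(2 * log((2^m - 2) / (1 - conf)) / n)
    with probability at least conf. Unobserved modes get an infinite
    radius, i.e. full ambiguity.
    """
    counts = np.zeros((n_modes, n_modes))
    for mode, next_mode in transitions:
        counts[mode, next_mode] += 1
    n = counts.sum(axis=1, keepdims=True)
    # Empirical transition matrix; uniform rows where nothing was seen.
    p_hat = np.divide(counts, n, out=np.full_like(counts, 1.0 / n_modes),
                      where=n > 0)
    with np.errstate(divide="ignore"):
        radius = np.sqrt(2 * np.log((2 ** n_modes - 2) / (1 - conf))
                         / n.squeeze(axis=1))
    return p_hat, radius

# Toy usage with two lane-intent modes (0 = keep lane, 1 = change lane).
obs = [(0, 0), (0, 1), (1, 1), (1, 1), (1, 0), (0, 0)]
p_hat, radius = estimate_with_ambiguity(obs, n_modes=2)
print(p_hat)   # empirical transition probabilities
print(radius)  # L1 ambiguity radii, one per mode
```

Because the radius decays like 1/sqrt(n), a controller that plans robustly over all distributions within the radius becomes gradually less conservative as more of the target vehicle's behavior is observed, which is exactly the qualitative behavior the abstract describes.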
A polynomial reduction of forks into logic programs
Felicidad Aguado, Pedro Cabalar, Jorge Fandinno, D. Pearce, Gilberto Pérez, Concepción Vidal Martín
Artif. Intell. Pub Date: 2022-03-01 | DOI: 10.1016/j.artint.2022.103712 | Pages: 103712
Citations: 0