Artificial Intelligence: Latest Articles

Adversarial analysis of similarity-based sign prediction
IF 5.1 | Zone 2, Computer Science
Artificial Intelligence, vol. 335, Article 104173. Pub Date: 2024-06-27. DOI: 10.1016/j.artint.2024.104173
Michał T. Godziszewski, Marcin Waniek, Yulin Zhu, Kai Zhou, Talal Rahwan, Tomasz P. Michalak
Abstract: Adversarial social network analysis explores how social links can be altered or otherwise manipulated to hinder unwanted information collection. To date, however, problems of this kind have not been studied in the context of signed networks in which links have positive and negative labels. Such formalism is often used to model social networks with positive links indicating friendship or support and negative links indicating antagonism or opposition.
In this work, we present a computational analysis of the problem of attacking sign prediction in signed networks, whereby the aim of the attacker (a network member) is to hide from the defender (an analyst) the signs of a target set of links by removing the signs of some other, non-target, links. While the problem turns out to be NP-hard if either local or global similarity measures are used for sign prediction, we provide a number of positive computational results, including an FPT-algorithm for eliminating common signed neighborhood and heuristic algorithms for evading local similarity-based link prediction in signed networks.
Citations: 0
Hyper-heuristics for personnel scheduling domains
IF 5.1 | Zone 2, Computer Science
Artificial Intelligence, vol. 334, Article 104172. Pub Date: 2024-06-25. DOI: 10.1016/j.artint.2024.104172
Abstract: In real-life applications problems can frequently change or require small adaptations. Manually creating and tuning algorithms for different problem domains or different versions of a problem can be cumbersome and time-consuming. In this paper we consider several important problems with high practical relevance, which are Rotating Workforce Scheduling, Minimum Shift Design, and Bus Driver Scheduling. Instead of designing very specific solution methods, we propose to use the more general approach based on hyper-heuristics which take a set of simpler low-level heuristics and combine them to automatically create a fitting heuristic for the problem at hand. This paper presents a major study on applying hyper-heuristics to these domains, which contributes in four different ways: First, it defines new low-level heuristics for these scheduling domains, allowing to apply hyper-heuristics to them for the first time. Second, it provides a comparison of several state-of-the-art hyper-heuristics on those domains. Third, new best solutions for several instances of the different problem domains are found. Finally, a detailed investigation of the use of low-level heuristics by the hyper-heuristics gives insights in the way hyper-heuristics apply to different domains and the importance of different low-level heuristics. The results show that hyper-heuristics are able to perform well even on very complex practical problem domains in the area of scheduling and, while being more general and requiring less problem-specific adaptation, can in several cases compete with specialized algorithms for the specific problems. Several hyper-heuristics with very good performance across different real-life domains are identified. They can efficiently select low-level heuristics to apply for each domain, but for repeated application they benefit from evaluating and selecting the most useful subset of these heuristics. These results help to improve industrial systems in use for solving different scheduling scenarios by allowing faster and easier adaptation to new problem variants.
PDF: https://www.sciencedirect.com/science/article/pii/S0004370224001085/pdfft?md5=4b18a79ac0a3f1adc46a5f873b25eac7&pid=1-s2.0-S0004370224001085-main.pdf
Citations: 0
Boosting optimal symbolic planning: Operator-potential heuristics
IF 5.1 | Zone 2, Computer Science
Artificial Intelligence, vol. 334, Article 104174. Pub Date: 2024-06-21. DOI: 10.1016/j.artint.2024.104174
Abstract: Heuristic search guides the exploration of states via heuristic functions h estimating remaining cost. Symbolic search instead replaces the exploration of individual states with that of state sets, compactly represented using binary decision diagrams (BDDs). In cost-optimal planning, heuristic explicit search performs best overall, but symbolic search performs best in many individual domains, so both approaches together constitute the state of the art. Yet combinations of the two have so far not been an unqualified success, because (i) h must be applicable to sets of states rather than individual ones, and (ii) the different state partitioning induced by h may be detrimental for BDD size. Many competitive heuristic functions in planning do not qualify for (i), and it has been shown that even extremely informed heuristics can deteriorate search performance due to (ii).
Here we show how to achieve (i) for a state-of-the-art family of heuristic functions, namely potential heuristics. These assign a fixed potential value to each state-variable/value pair, ensuring by LP constraints that the sum over these values, for any state, yields an admissible and consistent heuristic function. Our key observation is that we can express potential heuristics through fixed potential values for operators instead, capturing the change of heuristic value induced by each operator. These reformulated heuristics satisfy (i) because we can express the heuristic value change as part of the BDD transition relation in symbolic search steps. We run exhaustive experiments on IPC benchmarks, evaluating several different instantiations of potential heuristics in forward, backward, and bi-directional symbolic search. Our operator-potential heuristics turn out to be highly beneficial, in particular they hardly ever suffer from (ii). Our best configurations soundly beat previous optimal symbolic planning algorithms, bringing them on par with the state of the art in optimal heuristic explicit search planning in overall performance.
PDF: https://www.sciencedirect.com/science/article/pii/S0004370224001103/pdfft?md5=e96bd05f57c63e29d7f7ad8ddd65c0e0&pid=1-s2.0-S0004370224001103-main.pdf
Citations: 0
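Illustrative sketch (not from the paper): the abstract's observation that a potential heuristic can be re-expressed through per-operator values, each capturing the change in heuristic value, can be shown on a toy task. The potential values, the task encoding, and the helper names `h`, `operator_potential`, and `apply_op` are all assumptions made here for illustration. The sketch further assumes that every variable an operator changes also appears in its precondition, and it does not involve an LP or BDDs.

```python
# Potentials per (variable, value) pair. A real planner would obtain these
# from an LP; the numbers below are hand-picked for illustration only.
POT = {
    ("loc", "A"): 2.0, ("loc", "B"): 1.0, ("loc", "C"): 0.0,
    ("door", "closed"): 1.0, ("door", "open"): 0.0,
}

def h(state):
    """Potential heuristic: sum of the potentials of the facts true in the state."""
    return sum(POT[(var, val)] for var, val in state.items())

def operator_potential(op):
    """Change in h caused by op (assumes each effect variable has a precondition)."""
    return sum(POT[(var, new)] - POT[(var, op["pre"][var])]
               for var, new in op["eff"].items())

def apply_op(state, op):
    """Apply op to a state that satisfies its precondition."""
    assert all(state[var] == val for var, val in op["pre"].items())
    return {**state, **op["eff"]}

# Two hypothetical operators of a toy task.
open_door = {"pre": {"door": "closed"}, "eff": {"door": "open"}}
move_a_b = {"pre": {"loc": "A", "door": "open"}, "eff": {"loc": "B"}}

s0 = {"loc": "A", "door": "closed"}
s2 = apply_op(apply_op(s0, open_door), move_a_b)

# h of the final state equals h of the start state plus the operator
# potentials accumulated along the path.
total = h(s0) + operator_potential(open_door) + operator_potential(move_a_b)
```

Because the heuristic value of a successor depends only on the predecessor's value and the applied operator, the value can be tracked per transition rather than recomputed per state, which is the property the abstract relies on for sets of states.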
Delegated online search
IF 5.1 | Zone 2, Computer Science
Artificial Intelligence, vol. 334, Article 104171. Pub Date: 2024-06-20. DOI: 10.1016/j.artint.2024.104171
Pirmin Braun, Niklas Hahn, Martin Hoefer, Conrad Schecker
Abstract: In a delegation problem, a principal $P$ with commitment power tries to pick one out of $n$ options. Each option is drawn independently from a known distribution. Instead of inspecting the options herself, $P$ delegates the information acquisition to a rational and self-interested agent $A$. After inspection, $A$ proposes one of the options, and $P$ can accept or reject.
Delegation is a classic setting in economic information design with many prominent applications, but the computational problems are only poorly understood. In this paper, we study a natural online variant of delegation, in which the agent searches through the options in an online fashion. For each option, he has to irrevocably decide if he wants to propose the current option or discard it, before seeing information on the next option(s). How can we design algorithms for $P$ that approximate the utility of her best option in hindsight?
We show that in general $P$ can obtain a $\Theta(1/n)$-approximation and extend this result to ratios of $\Theta(k/n)$ in case (1) $A$ has a lookahead of $k$ rounds, or (2) $A$ can propose up to $k$ different options. We provide fine-grained bounds independent of $n$ based on three parameters. If the ratio of maximum and minimum utility for $A$ is bounded by a factor $\alpha$, we obtain an $\Omega(\log\log\alpha/\log\alpha)$-approximation algorithm, and we show that this is best possible. Additionally, if $P$ cannot distinguish options with the same value for herself, we show that ratios polynomial in $1/\alpha$ cannot be avoided. If there are at most $\beta$ different utility values for $A$, we show a $\Theta(1/\beta)$-approximation. If the utilities of $P$ and $A$ for each option are related by a factor $\gamma$, we obtain an $\Omega(1/\log\gamma)$-approximation, where $O(\log\log\gamma/\log\gamma)$ is best possible.
PDF: https://www.sciencedirect.com/science/article/pii/S0004370224001073/pdfft?md5=2d7a00808c733af9db17db5a21fc73fe&pid=1-s2.0-S0004370224001073-main.pdf
Citations: 0
An extensive study of security games with strategic informants
IF 14.4 | Zone 2, Computer Science
Artificial Intelligence, vol. 334, Article 104162. Pub Date: 2024-06-12. DOI: 10.1016/j.artint.2024.104162
Weiran Shen, Minbiao Han, Weizhe Chen, Taoan Huang, Rohit Singh, Haifeng Xu, Fei Fang
Abstract: Over the past years, game-theoretic modeling for security and public safety issues (also known as security games) have attracted intensive research attention and have been successfully deployed in many real-world applications for fighting, e.g., illegal poaching, fishing and urban crimes. However, few existing works consider how information from local communities would affect the structure of these games. In this paper, we systematically investigate how a new type of players – strategic informants who are from local communities and may observe and report upcoming attacks – affects the classic defender-attacker security interactions. Characterized by a private type, each informant has a utility structure that drives their strategic behaviors.
For situations with a single informant, we capture the problem as a 3-player extensive-form game and develop a novel solution concept, Strong Stackelberg-perfect Bayesian equilibrium, for the game. To find an optimal defender strategy, we establish that though the informant can have infinitely many types in general, there always exists an optimal defense plan using only a linear number of patrol strategies; this succinct characterization then enables us to efficiently solve the game via linear programming. For situations with multiple informants, we show that there is also an optimal defense plan with only a linear number of patrol strategies that admits a simple structure based on plurality voting among multiple informants.
Finally, we conduct extensive experiments to study the effect of the strategic informants and demonstrate the efficiency of our algorithm. Our experiments show that the existence of such informants significantly increases the defender's utility. Even though the informants exhibit strategic behaviors, the information they supply holds great value as defensive resources. Compared to existing works, our study leads to a deeper understanding on the role of informants in such defender-attacker interactions.
Citations: 0
A domain-independent agent architecture for adaptive operation in evolving open worlds
IF 14.4 | Zone 2, Computer Science
Artificial Intelligence, vol. 334, Article 104161. Pub Date: 2024-06-06. DOI: 10.1016/j.artint.2024.104161
Shiwali Mohan, Wiktor Piotrowski, Roni Stern, Sachin Grover, Sookyung Kim, Jacob Le, Yoni Sher, Johan de Kleer
Abstract: Model-based reasoning agents are ill-equipped to act in novel situations in which their model of the environment no longer sufficiently represents the world. We propose HYDRA, a framework for designing model-based agents operating in mixed discrete-continuous worlds that can autonomously detect when the environment has evolved from its canonical setup, understand how it has evolved, and adapt the agents' models to perform effectively. HYDRA is based upon PDDL+, a rich modeling language for planning in mixed, discrete-continuous environments. It augments the planning module with visual reasoning, task selection, and action execution modules for closed-loop interaction with complex environments. HYDRA implements a novel meta-reasoning process that enables the agent to monitor its own behavior from a variety of aspects. The process employs a diverse set of computational methods to maintain expectations about the agent's own behavior in an environment. Divergences from those expectations are useful in detecting when the environment has evolved and identifying opportunities to adapt the underlying models. HYDRA builds upon ideas from diagnosis and repair and uses a heuristics-guided search over model changes such that they become competent in novel conditions. The HYDRA framework has been used to implement novelty-aware agents for three diverse domains - CartPole++ (a higher dimension variant of a classic control problem), Science Birds (an IJCAI competition problem), and PogoStick (a specific problem domain in Minecraft). We report empirical observations from these domains to demonstrate the efficacy of various components in the novelty meta-reasoning process.
Citations: 0
Functional Relation Field: A Model-Agnostic Framework for Multivariate Time Series Forecasting
IF 14.4 | Zone 2, Computer Science
Artificial Intelligence, vol. 334, Article 104158. Pub Date: 2024-06-05. DOI: 10.1016/j.artint.2024.104158
Ting Li, Bing Yu, Jianguo Li, Zhanxing Zhu
Abstract: In multivariate time series forecasting, the most popular strategy for modeling the relationship between multiple time series is the construction of graph, where each time series is represented as a node and related nodes are connected by edges. However, the relationship between multiple time series is typically complicated, e.g. the sum of outflows from upstream nodes may be equal to the inflows of downstream nodes. Such relations widely exist in many real-world scenarios for multivariate time series forecasting, yet are far from well studied. In these cases, graph might be insufficient for modeling the complex dependency between nodes. To this end, we explore a new framework to model the inter-node relationship in a more precise way based our proposed inductive bias, Functional Relation Field, where a group of functions parameterized by neural networks are learned to characterize the dependency between multiple time series. Essentially, these learned functions then form a “field”, i.e. a particular set of constraints, to regularize the training loss of the backbone prediction network and enforce the inference process to satisfy these constraints. Since our framework introduces the relationship bias in a data-driven manner, it is flexible and model-agnostic such that it can be applied to any existing multivariate time series prediction networks for boosting performance. The experiment is conducted on one toy dataset to show our approach can well recover the true constraint relationship between nodes. And various real-world datasets are also considered with different backbone prediction networks. Results show that the prediction error can be reduced remarkably with the aid of the proposed framework.
PDF: https://www.sciencedirect.com/science/article/pii/S0004370224000948/pdfft?md5=1e0e8c2dca5cc80e5c38837feded9d5f&pid=1-s2.0-S0004370224000948-main.pdf
Citations: 0
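Illustrative sketch (not from the paper): the idea of using constraints between series to regularize a forecasting loss can be shown with the abstract's own flow example, where a downstream inflow equals the sum of upstream outflows. In the paper the constraint functions are learned by neural networks. Here a single hand-written constraint stands in for a learned one, and the function names and the `weight` parameter are assumptions made for illustration.

```python
def mse(pred, target):
    """Mean squared error between a forecast and the observed values."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def flow_constraint(pred):
    """Hand-written stand-in for a learned relation between nodes.

    Nodes 0 and 1 are upstream and node 2 is downstream; the residual is
    zero when the downstream inflow equals the sum of upstream outflows.
    """
    return pred[2] - (pred[0] + pred[1])

def regularized_loss(pred, target, weight=1.0):
    """Forecast error plus a penalty on the squared constraint violation."""
    return mse(pred, target) + weight * flow_constraint(pred) ** 2

target = [1.0, 2.0, 3.0]
pred_a = [1.0, 2.0, 3.5]    # lower plain error, but violates the constraint
pred_b = [1.25, 2.25, 3.5]  # higher plain error, satisfies the constraint

loss_a = regularized_loss(pred_a, target)
loss_b = regularized_loss(pred_b, target)
```

With the penalty included, the forecast that respects the flow relation scores better than the one with the smaller plain error, which is the regularizing effect the abstract describes.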
Stability based on single-agent deviations in additively separable hedonic games
IF 14.4 | Zone 2, Computer Science
Artificial Intelligence, vol. 334, Article 104160. Pub Date: 2024-05-31. DOI: 10.1016/j.artint.2024.104160
Felix Brandt, Martin Bullinger, Leo Tappe
Abstract: Coalition formation is a central concern in multiagent systems. A common desideratum for coalition structures is stability, defined by the absence of beneficial deviations of single agents. Such deviations require an agent to improve her utility by joining another coalition. On top of that, the feasibility of deviations may also be restricted by demanding consent of agents in the welcoming and/or the abandoned coalition. While most of the literature focuses on deviations constrained by unanimous consent, we also study consent decided by majority vote and introduce two new stability notions that can be seen as local variants of another solution concept called popularity. We investigate stability in additively separable hedonic games by pinpointing boundaries to computational complexity depending on the type of consent and friend-oriented utility restrictions. The latter restrictions shed new light on well-studied classes of games based on the appreciation of friends or the aversion to enemies. Many of our positive results follow from a new combinatorial observation that we call the Deviation Lemma and that we leverage to prove the convergence of simple and natural single-agent dynamics under fairly general conditions. Our negative results, in particular, resolve the complexity of contractual Nash stability in additively separable hedonic games.
PDF: https://www.sciencedirect.com/science/article/pii/S0004370224000961/pdfft?md5=e987438fd09ba66fd8cb7e8db197482a&pid=1-s2.0-S0004370224000961-main.pdf
Citations: 0
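Illustrative sketch (not from the paper): the basic notion the abstract builds on, a beneficial single-agent deviation in an additively separable hedonic game, can be checked directly. The sketch covers only unrestricted (Nash) deviations. The consent-based variants studied in the paper are not modeled, and the function names and the example valuations are assumptions made for illustration.

```python
def utility(agent, coalition, values):
    """Additively separable utility: sum of the agent's values for its coalition mates."""
    return sum(values[agent].get(other, 0) for other in coalition if other != agent)

def nash_deviation(partition, values):
    """Return (agent, target coalition) for a beneficial deviation, or None if stable."""
    for coalition in partition:
        for agent in coalition:
            current = utility(agent, coalition, values)
            # The agent may join any other coalition or leave to be alone.
            targets = [c for c in partition if c is not coalition] + [frozenset()]
            for target in targets:
                if utility(agent, target | {agent}, values) > current:
                    return agent, target
    return None

# Hypothetical valuations: a and c dislike each other, both like b.
values = {
    "a": {"b": 1, "c": -2},
    "b": {"a": 1, "c": 1},
    "c": {"a": -2, "b": 1},
}

grand = [frozenset({"a", "b", "c"})]
split = [frozenset({"a", "b"}), frozenset({"c"})]

unstable = nash_deviation(grand, values)  # a or c prefers to leave
stable = nash_deviation(split, values)    # no agent can strictly improve
```

In the grand coalition both a and c have utility -1 and would rather be alone, while in the split partition no agent gains strictly by moving, so that partition is Nash stable under these valuations.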
Joint learning of reward machines and policies in environments with partially known semantics
IF 14.4 | Zone 2, Computer Science
Artificial Intelligence, vol. 333, Article 104146. Pub Date: 2024-05-23. DOI: 10.1016/j.artint.2024.104146
Christos K. Verginis, Cevahir Koprulu, Sandeep Chinchali, Ufuk Topcu
Abstract: We study the problem of reinforcement learning for a task encoded by a reward machine. The task is defined over a set of properties in the environment, called atomic propositions, and represented by Boolean variables. One unrealistic assumption commonly used in the literature is that the truth values of these propositions are accurately known. In real situations, however, these truth values are uncertain since they come from sensors that suffer from imperfections. At the same time, reward machines can be difficult to model explicitly, especially when they encode complicated tasks. We develop a reinforcement-learning algorithm that infers a reward machine that encodes the underlying task while learning how to execute it, despite the uncertainties of the propositions' truth values. In order to address such uncertainties, the algorithm maintains a probabilistic estimate about the truth value of the atomic propositions; it updates this estimate according to new sensory measurements that arrive from exploration of the environment. Additionally, the algorithm maintains a hypothesis reward machine, which acts as an estimate of the reward machine that encodes the task to be learned. As the agent explores the environment, the algorithm updates the hypothesis reward machine according to the obtained rewards and the estimate of the atomic propositions' truth value. Finally, the algorithm uses a Q-learning procedure for the states of the hypothesis reward machine to determine an optimal policy that accomplishes the task. We prove that the algorithm successfully infers the reward machine and asymptotically learns a policy that accomplishes the respective task.
PDF: https://www.sciencedirect.com/science/article/pii/S0004370224000821/pdfft?md5=00403f012b025daac195daf945ec2715&pid=1-s2.0-S0004370224000821-main.pdf
Citations: 0
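Illustrative sketch (not from the paper): two ingredients named in the abstract can be shown in miniature, namely a reward machine that advances on the set of true atomic propositions, and a probabilistic estimate of a proposition's truth value updated from a noisy sensor reading. The class, the "key then door" task, and the fixed sensor `accuracy` are assumptions made for illustration. The paper's inference of the machine and its Q-learning procedure are not reproduced here.

```python
class RewardMachine:
    """Finite-state machine that maps sets of true propositions to rewards."""

    def __init__(self, initial, transitions):
        self.initial = initial
        # (state, frozenset of true propositions) -> (next state, reward)
        self.transitions = transitions

    def step(self, state, true_props):
        # Inputs without a listed transition keep the state and give no reward.
        return self.transitions.get((state, frozenset(true_props)), (state, 0.0))

def bayes_update(prior, reading, accuracy):
    """Posterior that a proposition is true after one noisy binary reading."""
    like_true = accuracy if reading else 1.0 - accuracy
    like_false = 1.0 - accuracy if reading else accuracy
    return like_true * prior / (like_true * prior + like_false * (1.0 - prior))

# Hypothetical task: pick up the key, then reach the door.
rm = RewardMachine("u0", {
    ("u0", frozenset({"key"})): ("u1", 0.0),
    ("u1", frozenset({"door"})): ("u2", 1.0),
})

state, total_reward = rm.initial, 0.0
for props in [set(), {"door"}, {"key"}, set(), {"door"}]:
    state, reward = rm.step(state, props)
    total_reward += reward

# Belief that "key" holds after a positive reading from a 90%-accurate sensor.
belief = bayes_update(0.5, True, 0.9)
```

Reaching the door before holding the key earns nothing in this toy machine, which is the kind of temporal structure a reward machine encodes and a plain state-based reward cannot.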
Credulous acceptance in high-order argumentation frameworks with necessities: An incremental approach
IF 14.4 | Zone 2, Computer Science
Artificial Intelligence, vol. 333, Article 104159. Pub Date: 2024-05-22. DOI: 10.1016/j.artint.2024.104159
Gianvincenzo Alfano, Andrea Cohen, Sebastian Gottifredi, Sergio Greco, Francesco Parisi, Guillermo R. Simari
Abstract: Argumentation is an important research area in the field of AI. There is a substantial amount of work on different aspects of Dung's abstract Argumentation Framework (AF). Two relevant aspects considered separately so far are: i) extending the framework to account for recursive attacks and supports, and ii) considering dynamics, i.e., AFs evolving over time. In this paper, we jointly deal with these two aspects. We focus on High-Order Argumentation Frameworks with Necessities (HOAFNs) which allow for attack and support relations (interpreted as necessity) not only between arguments but also targeting attacks and supports at any level. We propose an approach for the incremental evaluation of the credulous acceptance problem in HOAFNs, by “incrementally” computing an extension (a set of accepted arguments, attacks and supports), if it exists, containing a given goal element in an updated HOAFN. In particular, we are interested in monitoring the credulous acceptance of a given argument, attack or support (goal) in an evolving HOAFN. Thus, our approach assumes to have a HOAFN Δ, a goal ϱ occurring in Δ, an extension E for Δ containing ϱ, and an update u establishing some changes in the original HOAFN, and uses the extension for first checking whether the update is relevant; for relevant updates, an extension of the updated HOAFN containing the goal is computed by translating the problem to the AF domain and leveraging on AF solvers. We provide formal results for our incremental approach and empirically show that it outperforms the evaluation from scratch of the credulous acceptance problem for an updated HOAFN.
Citations: 0