{"title":"The topology of surprise","authors":"Alexandru Baltag , Nick Bezhanishvili , David Fernández-Duque","doi":"10.1016/j.artint.2025.104423","DOIUrl":"10.1016/j.artint.2025.104423","url":null,"abstract":"<div><div>In this paper we present a topological epistemic logic, with modalities for knowledge (modelled as the universal modality), knowability (represented by the topological interior operator), and unknowability of the actual world. The last notion has a non-self-referential reading (modelled by Cantor derivative: the set of limit points of a given set) and a self-referential one (modelled by Cantor's perfect core of a given set: its largest subset without isolated points, where <em>x</em> is isolated iff <span><math><mo>{</mo><mi>x</mi><mo>}</mo></math></span> is open). We completely axiomatize this logic, showing that it is decidable and <span>pspace</span>-complete, and we apply it to the analysis of a famous epistemic puzzle: the Surprise Exam Paradox.</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"349 ","pages":"Article 104423"},"PeriodicalIF":4.6,"publicationDate":"2025-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145189732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learngene: Inheritable “genes” in intelligent agents","authors":"Fu Feng , Jing Wang , Xu Yang , Xin Geng","doi":"10.1016/j.artint.2025.104421","DOIUrl":"10.1016/j.artint.2025.104421","url":null,"abstract":"<div><div>Biological intelligence has driven significant progress in artificial intelligence (AI), but a critical gap remains: biological systems inherit innate abilities from genes, with brains initialized by blueprints refined over 3.5 billion years of evolution, while machines rely heavily on inefficient, data-driven learning from scratch. This gap arises from the lack of a genetic mechanism in machines to transfer and accumulate inheritable knowledge across generations. To bridge this gap, we propose learngenes, network fragments that act as inheritable “genes” for machines. Unlike conventional knowledge transfer methods, learngenes enable efficient and universal knowledge transfer by selectively encapsulating task-agnostic knowledge. To facilitate the transfer and accumulation of task-agnostic knowledge across generations, we introduce Genetic Reinforcement Learning (GRL), a framework that simulates the learning and evolution of organisms in intelligent agents following Lamarckian principles. Through GRL, we identify learngenes as network fragments within agents' policy networks, equipping newborn agents with innate abilities for rapid adaptation to novel tasks. We demonstrate the advantages of learngene-based knowledge transfer over evolution-based search and traditional pre-trained models, and show how learngenes evolve through the accumulation of task-agnostic knowledge. 
Overall, this work establishes a novel paradigm for knowledge transfer and model initialization in AI, offering new possibilities for more adaptive, efficient, and scalable learning systems.</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"348 ","pages":"Article 104421"},"PeriodicalIF":4.6,"publicationDate":"2025-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145154766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unsupervised sentence selection for creating a representative corpus in Turkish: An active learning approach","authors":"Hayri Volkan Agun","doi":"10.1016/j.artint.2025.104422","DOIUrl":"10.1016/j.artint.2025.104422","url":null,"abstract":"<div><div>In this study, active learning methods adapted for the selection of Turkish sentences are evaluated through language learning with neural models. Turkish is an agglutinative language with a complex morphology, where the linguistic properties of words are encoded in suffixes. Active learning methods based on regression, clustering, language models, distance metrics, and neural networks are applied to unlabeled sentence selection. In this respect, a sentence corpus is selected from a larger corpus, with the same number of samples for each target word in intrinsic and extrinsic evaluation tasks. The selected sentences are used to train SkipGram, CBOW, and self-attention LSTM language models, and the extracted embeddings are evaluated on semantic analogy, POS, and sentiment analysis tasks. The evaluation scores of the models trained on the samples selected by each active learning method are compared. The sentences selected by the language-model-based methods yield an improvement over random selection based on a static vocabulary. These results also show that the selection affects the quality of unsupervised word embedding extraction even if the target vocabulary is kept the same. 
In addition to accuracy, the time efficiency of the language-model-based methods is shown to be better than that of the other methods, especially those based on neural network models and distance metrics.</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"348 ","pages":"Article 104422"},"PeriodicalIF":4.6,"publicationDate":"2025-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145104222","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bridging theory and practice in bidirectional heuristic search with front-to-end consistent heuristics","authors":"Lior Siag, Shahaf S. Shperberg","doi":"10.1016/j.artint.2025.104420","DOIUrl":"10.1016/j.artint.2025.104420","url":null,"abstract":"<div><div>Recent research on bidirectional heuristic search (BiHS) has been shaped by the <em>must-expand pairs</em> (MEP) theory, which identifies the pairs of nodes that must be expanded to ensure solution optimality. Another line of research has focused on algorithms utilizing lower bounds derived from consistent heuristics during the search. This paper bridges these two approaches, offering a unified framework that demonstrates how both existing and novel algorithms can be derived from MEP theory. We introduce an extended set of bounds, encompassing both previously known and newly formulated ones. Using these bounds, we develop a range of algorithms, each employing different criteria for termination, node selection, and search direction. Finally, we empirically evaluate how these bounds and algorithms impact search efficiency.</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"348 ","pages":"Article 104420"},"PeriodicalIF":4.6,"publicationDate":"2025-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145104221","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Minimax off-policy evaluation and learning with subgaussian and differentiable importance weighting","authors":"Alberto Maria Metelli, Alessio Russo, Marcello Restelli","doi":"10.1016/j.artint.2025.104419","DOIUrl":"10.1016/j.artint.2025.104419","url":null,"abstract":"<div><div>In this work, we study the statistical properties of the <em>off-policy estimation</em> problem, i.e., estimating expectations under a target policy using samples collected from a different policy. We begin by presenting a novel minimax concentration lower bound that highlights the fundamental limits of off-policy estimation. We then analyze two well-known <em>importance weighting</em> (IW) techniques: vanilla IW and self-normalized importance weighting (SN). For both methods, we derive concentration and anti-concentration results, showing that their concentration rates are provably suboptimal compared to our lower bound. Observing that this undesired behavior arises from the <em>heavy-tailed</em> nature of the IW and SN estimators, we propose a new class of parametric estimators based on a transformation using the <em>power mean</em> (PM), which is no longer heavy-tailed. We study the theoretical properties of the PM estimator in terms of bias and variance. We show that, with suitable (possibly data-driven) tuning of its parameters, the PM estimator satisfies two key properties under certain conditions: (<em>i</em>) it achieves a <em>subgaussian</em> concentration rate that matches our lower bound and (<em>ii</em>) it maintains differentiability with respect to the target policy. 
Finally, we validate our approach through numerical simulations on both synthetic datasets and contextual bandits, comparing it against standard off-policy evaluation and learning baselines.<span><span><sup>1</sup></span></span></div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"348 ","pages":"Article 104419"},"PeriodicalIF":4.6,"publicationDate":"2025-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145094875","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the disjunctive rational closure of a conditional knowledge base","authors":"Richard Booth , Ivan Varzinczak","doi":"10.1016/j.artint.2025.104418","DOIUrl":"10.1016/j.artint.2025.104418","url":null,"abstract":"<div><div>One of the most widely investigated decision problems in symbolic AI is that of which conditional sentences of the form “if <em>α</em>, then normally <em>β</em>” should follow from a knowledge base containing statements of this type. Probably the most notable approach to this problem is the rational closure construction put forward by Lehmann and Magidor in the '90s, which has since been adapted to logical languages of various expressive powers. At the core of rational closure is the Rational Monotonicity property, which allows one to retain existing (defeasible) conclusions whenever new information cannot be negated by existing conclusions. As it turns out, Rational Monotonicity is not universally accepted, with many researchers advocating the investigation of weaker versions thereof, leading to a larger class of consequence relations. A case in point is that of the Disjunctive Rationality property, which states that if one may draw a (defeasible) conclusion from a disjunction of premises, then one should be able to draw this conclusion from at least one of the premises taken alone. While there are convincing arguments that the rational closure forms the ‘simplest’ rational consequence relation extending a given set of conditionals, the question of what the simplest disjunctive consequence relation in this setting is has not been explored in depth. In this article, we do precisely that by motivating and proposing a concrete construction of the disjunctive rational closure of a conditional knowledge base, whose properties, and the consequences of whose adoption, we also investigate in detail. 
(Previous versions of this work have been selected for presentation at the 18th International Workshop on Nonmonotonic Reasoning (NMR 2020) <span><span>[1]</span></span> and at the 35th AAAI Conference on Artificial Intelligence (AAAI 2021) <span><span>[2]</span></span>. The present submission extends and elaborates on both papers.)</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"348 ","pages":"Article 104418"},"PeriodicalIF":4.6,"publicationDate":"2025-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145094874","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rethinking visual prompt learning as masked visual token modeling","authors":"Ning Liao , Bowen Shi , Xiaopeng Zhang , Min Cao , Junchi Yan , Qi Tian","doi":"10.1016/j.artint.2025.104417","DOIUrl":"10.1016/j.artint.2025.104417","url":null,"abstract":"<div><div>Prompt learning has achieved great success in efficiently exploiting large-scale pre-trained models in natural language processing (NLP). It reformulates downstream tasks as generative pre-training ones to achieve consistency, thus stably improving performance. However, when transferring it to the vision area, current visual prompt learning methods are almost all designed on discriminative pre-trained models, and there is also a lack of careful design to unify the forms of pre-training and downstream tasks. To explore prompt learning on a generative pre-trained visual model, while keeping task consistency, we propose Visual Prompt learning as masked visual Token Modeling (VPTM), which transforms the downstream visual classification task into the pre-trained masked visual token prediction task. In addition, we develop a prototypical verbalizer for mapping the predicted visual token, with its implicit semantics, to explicit downstream labels. To the best of our knowledge, VPTM is the first visual prompt method on a generative pre-trained visual model, achieving consistency between pre-training and downstream visual classification by task reformulation. Experiments show that VPTM outperforms other visual prompt methods and achieves excellent efficiency. 
Moreover, the task consistency of VPTM contributes to its robustness with respect to prompt location, prompt length, and prototype dimension, allowing it to be deployed uniformly.</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"348 ","pages":"Article 104417"},"PeriodicalIF":4.6,"publicationDate":"2025-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145044399","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Centralized training with hybrid execution in multi-agent reinforcement learning via predictive observation imputation","authors":"Pedro P. Santos , Diogo S. Carvalho , Miguel Vasco , Alberto Sardinha , Pedro A. Santos , Ana Paiva , Francisco S. Melo","doi":"10.1016/j.artint.2025.104404","DOIUrl":"10.1016/j.artint.2025.104404","url":null,"abstract":"<div><div>We study <em>hybrid execution</em> in multi-agent reinforcement learning (MARL), a paradigm where agents aim to complete cooperative tasks with arbitrary communication levels at execution time by taking advantage of information-sharing among the agents. Under hybrid execution, the communication level can range from a setting in which no communication is allowed between agents (fully decentralized), to a setting featuring full communication (fully centralized), but the agents do not know beforehand which communication level they will encounter at execution time. We contribute MARO, an approach that makes use of an auto-regressive predictive model, trained in a centralized manner, to estimate missing agents' observations at execution time. We evaluate MARO on standard scenarios and extensions of previous benchmarks tailored to emphasize the impact of partial observability in MARL. 
Experimental results show that our method consistently outperforms relevant baselines, allowing agents to act with faulty communication while successfully exploiting shared information.</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"348 ","pages":"Article 104404"},"PeriodicalIF":4.6,"publicationDate":"2025-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145060254","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Planning for temporally extended goals in pure-past linear temporal logic","authors":"Luigi Bonassi , Giuseppe De Giacomo , Marco Favorito , Francesco Fuggitti , Alfonso Emilio Gerevini , Enrico Scala","doi":"10.1016/j.artint.2025.104409","DOIUrl":"10.1016/j.artint.2025.104409","url":null,"abstract":"<div><div>We study planning for temporally extended goals expressed in Pure-Past Linear Temporal Logic (<span>ppltl</span>) in the context of deterministic (i.e., classical) and fully observable nondeterministic (FOND) domains. <span>ppltl</span> is the variant of Linear-time Temporal Logic on finite traces (<span>ltl</span><sub><em>f</em></sub>) that refers to the past rather than the future. Although <span>ppltl</span> is as expressive as <span>ltl</span><sub><em>f</em></sub>, we show that it is computationally much more effective for planning. In particular, we show that checking the validity of a plan for a <span>ppltl</span> formula is Markovian. This is achieved by introducing a linear number of additional propositional variables that capture the validity of the entire formula in a modular fashion. The solution encoding introduces only a linear number of new fluents proportional to the size of the <span>ppltl</span> goal and does not require any additional spurious action. We implement our solution technique in a system called <span><math><mi>Plan4Past</mi></math></span>, which can be used alongside state-of-the-art classical and FOND planners. 
Our empirical analysis demonstrates the practical effectiveness of <span><math><mi>Plan4Past</mi></math></span> in both classical and FOND problems, showing that the resulting planner performs overall better than other planning approaches for <span>ltl</span><sub><em>f</em></sub> goals.</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"348 ","pages":"Article 104409"},"PeriodicalIF":4.6,"publicationDate":"2025-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145044398","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Incentives for responsiveness, instrumental control and impact","authors":"Ryan Carey , Eric Langlois , Chris van Merwijk , Shane Legg , Tom Everitt","doi":"10.1016/j.artint.2025.104408","DOIUrl":"10.1016/j.artint.2025.104408","url":null,"abstract":"<div><div>We introduce three concepts that describe an agent's incentives: response incentives indicate which variables in the environment, such as sensitive demographic information, affect the decision under the optimal policy. Instrumental control incentives indicate whether an agent's policy is chosen to manipulate part of its environment, such as the preferences or instructions of a user. Impact incentives indicate which variables an agent will affect, intentionally or otherwise. For each concept, we establish sound and complete graphical criteria, and discuss general classes of techniques that may be used to produce incentives for safe and fair agent behaviour. Finally, we outline how these notions may be generalised to multi-decision settings.</div><div>This journal paper extends our conference publication “Agent Incentives: A Causal Perspective”: the material on response incentives and instrumental control incentives is updated, while the work on impact incentives and multi-decision settings is entirely new.</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"348 ","pages":"Article 104408"},"PeriodicalIF":4.6,"publicationDate":"2025-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145018406","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}