{"title":"NT-FAN: A simple yet effective noise-tolerant few-shot adaptation network","authors":"Wenjing Yang , Haoang Chi , Yibing Zhan , Bowen Hu , Xiaoguang Ren , Dapeng Tao , Long Lan","doi":"10.1016/j.artint.2025.104363","DOIUrl":"10.1016/j.artint.2025.104363","url":null,"abstract":"<div><div><em>Few-shot domain adaptation</em> (FDA) aims to train a target model with <em>clean</em> labeled data from the source domain and <em>few</em> labeled data from the target domain. Given a limited annotation budget, source data may contain many noisy labels, which can detrimentally impact the performance of models in real-world applications. This problem setting is denoted as <em>wildly few-shot domain adaptation</em> (WFDA), simultaneously taking care of label noise and data shortage. While previous studies have achieved some success, they typically rely on multiple adaptation models to collaboratively filter noisy labels, resulting in substantial computational overhead. To address WFDA more simply and elegantly, we offer a theoretical analysis of this problem and propose a comprehensive upper bound for the excess risk on the target domain. Our theoretical result reveals that correct domain-invariant representations can be obtained even in the presence of source noise and limited target data without incurring additional costs. In response, we propose a simple yet effective WFDA method, referred to as <em>noise-tolerant few-shot adaptation network</em> (NT-FAN). Experiments demonstrate that our method significantly outperforms all the state-of-the-art competitors while maintaining a more <em>lightweight</em> architecture. Notably, NT-FAN consistently exhibits robust performance when dealing with more realistic and intractable source noise (e.g., instance-dependent label noise) and severe source noise (e.g., a 40% noise rate) in the source domain.</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"346 ","pages":"Article 104363"},"PeriodicalIF":5.1,"publicationDate":"2025-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144139326","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A semantics for probabilistic hybrid knowledge bases with function symbols","authors":"Marco Alberti , Evelina Lamma , Fabrizio Riguzzi , Riccardo Zese","doi":"10.1016/j.artint.2025.104361","DOIUrl":"10.1016/j.artint.2025.104361","url":null,"abstract":"<div><div>Hybrid Knowledge Bases (HKBs) successfully integrate Logic Programming (LP) and Description Logics (DL) under the Minimal Knowledge with Negation as Failure semantics. Both world closure assumptions (open and closed) can be used in the same HKB, a feature required in many domains, such as the legal and health-care ones. In previous work, we proposed (function-free) Probabilistic HKBs, whose semantics applied Sato's distribution semantics approach to the well-founded HKB semantics proposed by Knorr et al. and Lyu and You. This semantics relied on the fact that the grounding of a function-free Probabilistic HKB (PHKB) is finite. In this article, we extend the PHKB language to allow function symbols, obtaining PHKB<sup><em>FS</em></sup>. Because the grounding of a PHKB<sup><em>FS</em></sup> can be infinite, we propose a novel semantics which does not require the PHKB<sup><em>FS</em></sup>'s grounding to be finite. We show that the proposed semantics extends the previously proposed semantics and that, for a large class of PHKB<sup><em>FS</em></sup>, every query can be assigned a probability.</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"346 ","pages":"Article 104361"},"PeriodicalIF":5.1,"publicationDate":"2025-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144098911","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Active legibility in multiagent reinforcement learning","authors":"Yanyu Liu , Yinghui Pan , Yifeng Zeng , Biyang Ma , Prashant Doshi","doi":"10.1016/j.artint.2025.104357","DOIUrl":"10.1016/j.artint.2025.104357","url":null,"abstract":"<div><div>A multiagent sequential decision problem has been seen in many critical applications including urban transportation, autonomous driving cars, military operations, etc. Its widely known solution, namely multiagent reinforcement learning, has evolved tremendously in recent years. Among them, the solution paradigm of modeling other agents attracts our interest, which is different from traditional value decomposition or communication mechanisms. It enables agents to understand and anticipate others' behaviors and facilitates their collaboration. Inspired by recent research on the legibility that allows agents to reveal their intentions through their behavior, we propose a <em>multiagent active legibility framework</em> to improve their performance. The legibility-oriented framework drives agents to conduct legible actions so as to help others optimize their behaviors. In addition, we design a series of problem domains that emulate a common legibility-needed scenario and effectively characterize the legibility in multiagent reinforcement learning. The experimental results demonstrate that the new framework is more efficient and requires less training time compared to several multiagent reinforcement learning algorithms.</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"346 ","pages":"Article 104357"},"PeriodicalIF":5.1,"publicationDate":"2025-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144098910","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A theory of synaptic neural balance: From local to global order","authors":"Pierre Baldi, Antonios Alexos, Ian Domingo, Alireza Rahmansetayesh","doi":"10.1016/j.artint.2025.104360","DOIUrl":"10.1016/j.artint.2025.104360","url":null,"abstract":"<div><div>We develop a general theory of synaptic neural balance and how it can emerge or be enforced in neural networks. For a given additive cost function <em>R</em> (regularizer), a neuron is said to be in balance if the total cost of its input weights is equal to the total cost of its output weights. The basic example is provided by feedforward networks of ReLU units trained with <span><math><msub><mrow><mi>L</mi></mrow><mrow><mn>2</mn></mrow></msub></math></span> regularizers, which exhibit balance after proper training. The theory explains this phenomenon and extends it in several directions. The first direction is the extension to bilinear and other activation functions. The second direction is the extension to more general regularizers, including all <span><math><msub><mrow><mi>L</mi></mrow><mrow><mi>p</mi></mrow></msub></math></span> (<span><math><mi>p</mi><mo>></mo><mn>0</mn></math></span>) regularizers. The third direction is the extension to non-layered architectures, recurrent architectures, convolutional architectures, as well as architectures with mixed activation functions and to different balancing algorithms. Gradient descent on the error function alone does not converge in general to a balanced state, where every neuron is in balance, even when starting from a balanced state. However, gradient descent on the regularized error function ought to converge to a balanced state, and thus network balance can be used to assess learning progress. The theory is based on two local neuronal operations: scaling which is commutative, and balancing which is not commutative. Finally, and most importantly, given any set of weights, when local balancing operations are applied to each neuron in a stochastic manner, global order always emerges through the convergence of the stochastic balancing algorithm to the same unique set of balanced weights. The reason for this convergence is the existence of an underlying strictly convex optimization problem where the relevant variables are constrained to a linear, only architecture-dependent, manifold. Simulations show that balancing neurons prior to learning, or during learning in alternation with gradient descent steps, can improve learning speed and performance thereby expanding the arsenal of available training tools. Scaling and balancing operations are entirely local and thus physically plausible in biological and neuromorphic neural networks.</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"346 ","pages":"Article 104360"},"PeriodicalIF":5.1,"publicationDate":"2025-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144106474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RelBERT: Embedding relations with language models","authors":"Asahi Ushio, Jose Camacho-Collados, Steven Schockaert","doi":"10.1016/j.artint.2025.104359","DOIUrl":"10.1016/j.artint.2025.104359","url":null,"abstract":"<div><div>Many applications need access to background knowledge about how different concepts and entities are related. Although Large Language Models (LLM) can address this need to some extent, LLMs are inefficient and difficult to control. As an alternative, we propose to extract relation embeddings from relatively small language models. In particular, we show that masked language models such as RoBERTa can be straightforwardly fine-tuned for this purpose, using only a small amount of training data. The resulting model, which we call RelBERT, captures relational similarity in a surprisingly fine-grained way, allowing us to set a new state-of-the-art in analogy benchmarks. Crucially, RelBERT is capable of modelling relations that go well beyond what the model has seen during training. For instance, we obtained strong results on relations between named entities with a model that was only trained on lexical relations between concepts, and we observed that RelBERT can recognise morphological analogies despite not being trained on such examples. Overall, we find that RelBERT significantly outperforms strategies based on prompting language models that are several orders of magnitude larger, including recent GPT-based models and open source models.<span><span><sup>1</sup></span></span></div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"347 ","pages":"Article 104359"},"PeriodicalIF":5.1,"publicationDate":"2025-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144190068","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CBS-Budget (CBSB): A complete and bounded suboptimal search for multi-agent path finding","authors":"Jaein Lim , Panagiotis Tsiotras","doi":"10.1016/j.artint.2025.104349","DOIUrl":"10.1016/j.artint.2025.104349","url":null,"abstract":"<div><div>Multi-Agent Path Finding (MAPF) is the problem of finding a collection of conflict-free paths for a team of multiple agents while minimizing some global cost, such as the sum of the travel time of all agents, or the travel time of the last agent. Conflict Based Search (CBS) is a leading complete and optimal MAPF algorithm that lazily explores the joint agent state space, using an admissible heuristic joint plan. Such an admissible heuristic joint plan is computed by combining individual shortest paths computed without considering inter-agent conflicts, and becoming gradually more informed as constraints are added to the individual agents' path-planning problems to avoid discovered conflicts. In this paper, we seek to speed up CBS by finding a more informed heuristic joint plan that is bounded. We first propose the budgeted Class-Ordered A* (bCOA*), a novel algorithm that finds the least-cost path with the minimal number of conflicts that is upper bounded in terms of path length. Then, we propose a novel bounded-cost variant of CBS, called CBS-Budget (CBSB) by using bCOA* search at the low-level search of the CBS and by using a modified focal search at the high-level search of the CBS. We prove that CBSB is complete and bounded-suboptimal. In our numerical experiments, CBSB finds a near-optimal solution for hundreds of agents within a fraction of a second. CBSB shows state-of-the-art performance, comparable to Explicit Estimation CBS (EECBS), an enhanced recent version of CBS. On the other hand, CBSB is much easier to implement than EECBS, since only one priority queue at the low-level search is needed, as in CBS, and only two priority queues at the high-level search are needed, as in Enhanced CBS (ECBS).</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"346 ","pages":"Article 104349"},"PeriodicalIF":5.1,"publicationDate":"2025-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144067939","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient and effective budget-feasible mechanisms for submodular valuations","authors":"Kai Han , Haotian Zhang , Shuang Cui","doi":"10.1016/j.artint.2025.104348","DOIUrl":"10.1016/j.artint.2025.104348","url":null,"abstract":"<div><div>We revisit the classical problem of designing Budget-Feasible Mechanisms (BFMs) for submodular valuation functions, which has been extensively studied since the seminal paper of Singer [FOCS'10] due to their wide applications in crowdsourcing and social marketing. We propose <span><math><mi>TripleEagle</mi></math></span>, a novel algorithmic framework for designing BFMs, based on which we present several simple yet effective BFMs that achieve better approximation ratios than the state-of-the-art work. Moreover, our BFMs are the first in the literature to achieve linear query complexity under the value oracle model while ensuring obvious strategyproofness, making them more practical than the previous BFMs. We conduct extensive experiments to evaluate the empirical performance of our BFMs, and the experimental results demonstrate the superiorities of our approach in terms of efficiency and effectiveness compared to the state-of-the-art BFMs.</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"345 ","pages":"Article 104348"},"PeriodicalIF":5.1,"publicationDate":"2025-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143921715","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep optimal transport for domain adaptation on SPD manifolds","authors":"Ce Ju , Cuntai Guan","doi":"10.1016/j.artint.2025.104347","DOIUrl":"10.1016/j.artint.2025.104347","url":null,"abstract":"<div><div>Recent progress in geometric deep learning has drawn increasing attention from the machine learning community toward domain adaptation on symmetric positive definite (SPD) manifolds—especially for neuroimaging data that often suffer from distribution shifts across sessions. These data, typically represented as covariance matrices of brain signals, inherently lie on SPD manifolds due to their symmetry and positive definiteness. However, conventional domain adaptation methods often overlook this geometric structure when applied directly to covariance matrices, which can result in suboptimal performance. To address this issue, we introduce a new geometric deep learning framework that combines optimal transport theory with the geometry of SPD manifolds. Our approach aligns data distributions while respecting the manifold structure, effectively reducing both marginal and conditional discrepancies. We validate our method on three cross-session brain-computer interface datasets—KU, BNCI2014001, and BNCI2015001—where it consistently outperforms baseline approaches while maintaining the intrinsic geometry of the data. We also provide quantitative results and visualizations to better illustrate the behavior of the learned embeddings.</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"345 ","pages":"Article 104347"},"PeriodicalIF":5.1,"publicationDate":"2025-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143906599","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Disjoint projected enumeration for SAT and SMT without blocking clauses","authors":"Giuseppe Spallitta , Roberto Sebastiani , Armin Biere","doi":"10.1016/j.artint.2025.104346","DOIUrl":"10.1016/j.artint.2025.104346","url":null,"abstract":"<div><div>All-Solution Satisfiability (AllSAT) and its extension, All-Solution Satisfiability Modulo Theories (AllSMT), have become more relevant in recent years, mainly in formal verification and artificial intelligence applications. The goal of these problems is the enumeration of all satisfying assignments of a formula (for SAT and SMT problems, respectively), making them useful for test generation, model checking, and probabilistic inference. Nevertheless, traditional AllSAT algorithms face significant computational challenges due to the exponential growth of the search space and inefficiencies caused by blocking clauses, which cause memory blowups and degrade unit propagation performance in the long term. This paper presents two novel solvers: <span>TabularAllSAT</span>, a projected AllSAT solver, and <span>TabularAllSMT</span>, a projected AllSMT solver. Both solvers combine Conflict-Driven Clause Learning (CDCL) with chronological backtracking to improve efficiency while ensuring disjoint enumeration. To retrieve compact partial assignments we propose a novel aggressive implicant shrinking algorithm, compatible with chronological backtracking, to minimize the number of partial assignments, reducing overall search complexity. Furthermore, we extend the solver framework to handle projected enumeration and SMT formulas effectively and efficiently, adapting the baseline framework to integrate theory reasoning and the distinction between important and non-important variables. An extensive experimental evaluation demonstrates the superiority of our approach compared to state-of-the-art solvers, particularly in scenarios requiring projection and SMT-based reasoning.</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"345 ","pages":"Article 104346"},"PeriodicalIF":5.1,"publicationDate":"2025-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143911498","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FedHM: Efficient federated learning for heterogeneous models via low-rank factorization","authors":"Dezhong Yao , Wanning Pan , Yuexin Shi , Michael J. O'Neill , Yutong Dai , Yao Wan , Peilin Zhao , Hai Jin , Lichao Sun","doi":"10.1016/j.artint.2025.104333","DOIUrl":"10.1016/j.artint.2025.104333","url":null,"abstract":"<div><div>One underlying assumption of recent <em>Federated Learning</em> (FL) paradigms is that all local models share an identical network architecture. However, this assumption is inefficient for heterogeneous systems where devices possess varying computation and communication capabilities. The presence of such heterogeneity among devices negatively impacts the scalability of FL and slows down the training process due to the existence of stragglers. To this end, this paper proposes a novel <em>federated compression framework for heterogeneous models</em>, named FedHM, distributing the heterogeneous low-rank models to clients and then aggregating them into a full-rank global model. Furthermore, FedHM significantly reduces communication costs by utilizing low-rank models. Compared with state-of-the-art heterogeneous FL methods under various FL settings, FedHM is superior in the performance and robustness of models with different sizes. Additionally, the convergence guarantee of FL for heterogeneous devices is first theoretically analyzed.</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"344 ","pages":"Article 104333"},"PeriodicalIF":5.1,"publicationDate":"2025-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143881427","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}