{"title":"Joint learning of reward machines and policies in environments with partially known semantics","authors":"Christos K. Verginis , Cevahir Koprulu , Sandeep Chinchali , Ufuk Topcu","doi":"10.1016/j.artint.2024.104146","DOIUrl":"10.1016/j.artint.2024.104146","url":null,"abstract":"<div><p>We study the problem of reinforcement learning for a task encoded by a reward machine. The task is defined over a set of properties in the environment, called atomic propositions, and represented by Boolean variables. One unrealistic assumption commonly used in the literature is that the truth values of these propositions are accurately known. In real situations, however, these truth values are uncertain since they come from sensors that suffer from imperfections. At the same time, reward machines can be difficult to model explicitly, especially when they encode complicated tasks. We develop a reinforcement-learning algorithm that infers a reward machine that encodes the underlying task while learning how to execute it, despite the uncertainties of the propositions' truth values. In order to address such uncertainties, the algorithm maintains a probabilistic estimate about the truth value of the atomic propositions; it updates this estimate according to new sensory measurements that arrive from exploration of the environment. Additionally, the algorithm maintains a hypothesis reward machine, which acts as an estimate of the reward machine that encodes the task to be learned. As the agent explores the environment, the algorithm updates the hypothesis reward machine according to the obtained rewards and the estimate of the atomic propositions' truth value. Finally, the algorithm uses a Q-learning procedure for the states of the hypothesis reward machine to determine an optimal policy that accomplishes the task. 
We prove that the algorithm successfully infers the reward machine and asymptotically learns a policy that accomplishes the respective task.</p></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"333 ","pages":"Article 104146"},"PeriodicalIF":14.4,"publicationDate":"2024-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0004370224000821/pdfft?md5=00403f012b025daac195daf945ec2715&pid=1-s2.0-S0004370224000821-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141178018","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
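The mechanism this abstract describes — a probabilistic estimate of an atomic proposition's truth value driving transitions of a hypothesis reward machine, with Q-learning over the product of environment and machine states — can be illustrated with a minimal, hypothetical sketch. The corridor environment, sensor accuracy, belief threshold, and hyperparameters below are all illustrative assumptions, not the authors' algorithm or experimental setup (and the reward machine here is fixed rather than inferred):

```python
import random

random.seed(0)

N = 5                # corridor cells 0..4; the proposition "goal" holds only at cell 4
SENSOR_ACC = 0.9     # sensor reports the true value of "goal" with this probability
ACTIONS = (-1, +1)   # move left / right

def sense(pos):
    """Noisy reading of the atomic proposition 'goal' at the current cell."""
    truth = (pos == N - 1)
    return truth if random.random() < SENSOR_ACC else not truth

def belief_update(prior, reading):
    """Bayes update of P('goal' holds here) given one noisy sensor reading."""
    like_true = SENSOR_ACC if reading else 1 - SENSOR_ACC
    like_false = (1 - SENSOR_ACC) if reading else SENSOR_ACC
    return prior * like_true / (prior * like_true + (1 - prior) * like_false)

Q = {}  # Q-table over (environment state, reward-machine state) pairs
def q(s, a):
    return Q.get((s, a), 0.0)

alpha, gamma, eps = 0.5, 0.95, 0.2
for _ in range(2000):
    pos, rm = 0, 0        # rm state 0: task pending; rm state 1: done (terminal)
    for _ in range(50):
        s = (pos, rm)
        a = random.choice(ACTIONS) if random.random() < eps \
            else max(ACTIONS, key=lambda act: q(s, act))
        pos = min(max(pos + a, 0), N - 1)
        # maintain a probabilistic estimate of the proposition
        # instead of trusting a single reading
        belief = 0.5
        for _ in range(3):
            belief = belief_update(belief, sense(pos))
        reward, rm2 = (1.0, 1) if (rm == 0 and belief > 0.9) else (0.0, rm)
        s2 = (pos, rm2)
        target = reward + (0.0 if rm2 == 1
                           else gamma * max(q(s2, b) for b in ACTIONS))
        Q[(s, a)] = q(s, a) + alpha * (target - q(s, a))
        rm = rm2
        if rm == 1:
            break

# The greedy policy in rm state 0 should head right, toward the cell where "goal" holds.
policy = {p: max(ACTIONS, key=lambda act: q((p, 0), act)) for p in range(N)}
```

After training, `policy` maps each corridor cell to the rightward action, i.e. the cross-product Q-table encodes both where the agent is and how far the (here, fixed) reward machine has progressed.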
{"title":"Credulous acceptance in high-order argumentation frameworks with necessities: An incremental approach","authors":"Gianvincenzo Alfano , Andrea Cohen , Sebastian Gottifredi , Sergio Greco , Francesco Parisi , Guillermo R. Simari","doi":"10.1016/j.artint.2024.104159","DOIUrl":"10.1016/j.artint.2024.104159","url":null,"abstract":"<div><p>Argumentation is an important research area in the field of AI. There is a substantial amount of work on different aspects of Dung's abstract Argumentation Framework (AF). Two relevant aspects considered separately so far are: <em>i</em>) extending the framework to account for recursive attacks and supports, and <em>ii</em>) considering dynamics, <em>i.e.</em>, AFs evolving over time. In this paper, we jointly deal with these two aspects. We focus on High-Order Argumentation Frameworks with Necessities (HOAFNs), which allow for attack and support relations (interpreted as <em>necessity</em>) not only between arguments but also targeting attacks and supports at any level. We propose an approach for the incremental evaluation of the credulous acceptance problem in HOAFNs, by “incrementally” computing an extension (a set of accepted arguments, attacks and supports), if it exists, containing a given goal element in an updated HOAFN. In particular, we are interested in monitoring the credulous acceptance of a given argument, attack or support (goal) in an evolving HOAFN. Thus, our approach assumes a HOAFN Δ, a goal <em>ϱ</em> occurring in Δ, an extension <em>E</em> for Δ containing <em>ϱ</em>, and an update <em>u</em> establishing some changes in the original HOAFN, and uses the extension to first check whether the update is relevant; for relevant updates, an extension of the updated HOAFN containing the goal is computed by translating the problem to the AF domain and leveraging AF solvers. 
We provide formal results for our incremental approach and empirically show that it outperforms from-scratch evaluation of the credulous acceptance problem for an updated HOAFN.</p></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"333 ","pages":"Article 104159"},"PeriodicalIF":14.4,"publicationDate":"2024-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141136689","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimizing pathfinding for goal legibility and recognition in cooperative partially observable environments","authors":"Sara Bernardini , Fabio Fagnani , Alexandra Neacsu , Santiago Franco","doi":"10.1016/j.artint.2024.104148","DOIUrl":"10.1016/j.artint.2024.104148","url":null,"abstract":"<div><p>In this paper, we perform a joint design of goal legibility and recognition in a cooperative, multi-agent pathfinding setting with partial observability. More specifically, we consider a set of identical agents (the actors) that move in an environment only partially observable to an observer in the loop. The actors are tasked with reaching a set of locations that need to be serviced in a timely fashion. The observer monitors the actors' behavior from a distance and needs to identify each actor's destination based on the actor's observable movements. Our approach generates legible paths for the actors; namely, it constructs one path from the origin to each destination so that these paths overlap as little as possible while satisfying budget constraints. It also equips the observer with a goal-recognition mapping between unique sequences of observations and destinations, ensuring that the observer can infer an actor's destination by making the minimum number of observations (legibility delay). Our method substantially extends previous work, which is limited to an observer with full observability, showing that optimizing pathfinding for goal legibility and recognition can be performed via a reformulation into a classical minimum cost flow problem in the partially observable case when the algorithms for the fully observable case are appropriately modified. 
Our empirical evaluation shows that our techniques are as effective in partially observable settings as in fully observable ones.</p></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"333 ","pages":"Article 104148"},"PeriodicalIF":14.4,"publicationDate":"2024-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0004370224000845/pdfft?md5=66bd75617c41f8c0d650bfa7aefc5bfd&pid=1-s2.0-S0004370224000845-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141136365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Acquiring and modeling abstract commonsense knowledge via conceptualization","authors":"Mutian He, Tianqing Fang, Weiqi Wang, Yangqiu Song","doi":"10.1016/j.artint.2024.104149","DOIUrl":"10.1016/j.artint.2024.104149","url":null,"abstract":"<div><p>Conceptualization, or viewing entities and situations as instances of abstract concepts in mind and making inferences based on that, is a vital component of human intelligence for commonsense reasoning. Despite recent progress in artificial intelligence on acquiring and modeling commonsense, attributed to neural language models and commonsense knowledge graphs (CKGs), conceptualization has yet to be introduced thoroughly, leaving current approaches unable to cover knowledge about the countless diverse entities and situations in the real world. To address the problem, we systematically study the role of conceptualization in commonsense reasoning, and formulate a framework to replicate human conceptual induction by acquiring abstract knowledge about events regarding abstract concepts, as well as higher-level triples or inferences upon them. We then apply the framework to ATOMIC, a large-scale human-annotated CKG, aided by the taxonomy Probase. We annotate a dataset on the validity of contextualized conceptualizations from ATOMIC at both the event and triple levels, develop a series of heuristic rules based on linguistic features, and train a set of neural models to generate and verify abstract knowledge. Based on these components, a pipeline to acquire abstract knowledge is built. A large abstract CKG built upon ATOMIC is then induced, ready to be instantiated to infer about unseen entities or situations. 
Finally, we empirically show the benefits of augmenting CKGs with abstract knowledge in downstream tasks like commonsense inference and zero-shot commonsense QA.</p></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"333 ","pages":"Article 104149"},"PeriodicalIF":14.4,"publicationDate":"2024-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141027260","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
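The pipeline this abstract outlines — conceptualize entities in ATOMIC-style events via a Probase-like taxonomy, induce abstract triples, then instantiate them for unseen entities — can be illustrated with a deliberately tiny toy sketch. The `ISA` dictionary, event strings, and helper names below are made-up stand-ins for Probase and ATOMIC, not the paper's data, rules, or neural models:

```python
# Toy is-a taxonomy standing in for Probase (assumed example entries).
ISA = {"pizza": "food", "taco": "food", "guitar": "instrument"}

def abstract_triples(triple, isa):
    """Conceptualize the head event of an ATOMIC-style (head, relation, tail)
    triple by replacing a known entity with its abstract concept."""
    head, rel, tail = triple
    for entity, concept in isa.items():
        if entity in head:
            yield (head.replace(entity, concept), rel, tail)

def instantiate(abstract_event, entity, isa):
    """Instantiate an abstract event for a (possibly unseen) entity whose
    concept appears in the event; return None if the concept does not match."""
    concept = isa.get(entity)
    if concept and concept in abstract_event:
        return abstract_event.replace(concept, entity)
    return None

# Induce an abstract triple from a concrete one...
abstract = list(abstract_triples(("PersonX buys a pizza", "xNeed", "money"), ISA))
# ...then reuse it to infer about a different entity of the same concept.
inferred = instantiate(abstract[0][0], "taco", ISA)
```

The point of the design is visible even at this scale: one abstract triple ("PersonX buys a food" needs "money") covers every entity the taxonomy maps to "food", including entities never seen in the concrete CKG.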
{"title":"Knowledge is power: Open-world knowledge representation learning for knowledge-based visual reasoning","authors":"Wenbo Zheng , Lan Yan , Fei-Yue Wang","doi":"10.1016/j.artint.2024.104147","DOIUrl":"10.1016/j.artint.2024.104147","url":null,"abstract":"<div><p>Knowledge-based visual reasoning requires the ability to associate outside knowledge that is not present in a given image for cross-modal visual understanding. Two deficiencies of the existing approaches are that (1) they only employ or construct elementary and <em>explicit</em> but superficial knowledge graphs, lacking the complex and <em>implicit</em> but indispensable cross-modal knowledge needed for visual reasoning, and (2) they cannot reason about new/<em>unseen</em> images or questions in open environments, and their assumptions are often violated in real-world applications. How to represent and leverage tacit multimodal knowledge in open-world visual reasoning scenarios has been less studied. In this paper, we propose a novel open-world knowledge representation learning method that not only constructs implicit knowledge representations from the given images and their questions but also enables knowledge transfer from a <em>known</em> given scene to an <em>unknown</em> scene for answer prediction. Extensive experiments conducted on six benchmarks demonstrate the superiority of our approach over other state-of-the-art methods. 
We apply our approach to other visual reasoning tasks, and the experimental results show that our approach, with its good performance, can support related reasoning applications.</p></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"333 ","pages":"Article 104147"},"PeriodicalIF":14.4,"publicationDate":"2024-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140949791","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning spatio-temporal dynamics on mobility networks for adaptation to open-world events","authors":"","doi":"10.1016/j.artint.2024.104120","DOIUrl":"10.1016/j.artint.2024.104120","url":null,"abstract":"<div><p>As a decisive factor in the success of Mobility-as-a-Service (MaaS), spatio-temporal dynamics modeling on mobility networks is a challenging task, particularly in scenarios where open-world events drive mobility behavior that deviates from the routines. While tremendous progress has been made in modeling high-level spatio-temporal regularities with deep learning, most, if not all, of the existing methods are neither aware of the dynamic interactions among multiple transport modes on mobility networks, nor adaptive to the unprecedented volatility brought by potential open-world events. In this paper, we are therefore motivated to improve the canonical spatio-temporal network (ST-Net) from two perspectives: (1) design a heterogeneous mobility information network (HMIN) to explicitly represent intermodality in multimodal mobility; (2) propose a memory-augmented dynamic filter generator (MDFG) to generate sequence-specific parameters in an on-the-fly fashion for various scenarios. The enhanced <u>e</u>vent-<u>a</u>ware <u>s</u>patio-<u>t</u>emporal <u>net</u>work, namely <strong>EAST-Net</strong>, is evaluated on several real-world datasets with a wide variety and coverage of open-world events. Both quantitative and qualitative experimental results verify the superiority of our approach compared with the state-of-the-art baselines. 
Moreover, experiments show the ability of EAST-Net to generalize, performing zero-shot inference over open-world events that have not been seen before.</p></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"335 ","pages":"Article 104120"},"PeriodicalIF":5.1,"publicationDate":"2024-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141043763","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A multi-graph representation for event extraction","authors":"Hui Huang , Yanping Chen , Chuan Lin , Ruizhang Huang , Qinghua Zheng , Yongbin Qin","doi":"10.1016/j.artint.2024.104144","DOIUrl":"https://doi.org/10.1016/j.artint.2024.104144","url":null,"abstract":"<div><p>Event extraction has trended toward identifying event triggers and arguments in a unified framework, which has the advantage of avoiding the cascading failures of pipeline methods. The main problem is that joint models usually assume a one-to-one relationship between event triggers and arguments. This leads to the argument multiplexing problem, in which an argument mention can serve different roles in an event or be shared by different events. To address this problem, we propose a multigraph-based event extraction framework. It allows parallel edges between any nodes, which is effective for representing the semantic structures of an event. The framework enables the neural network to map a sentence (or sentences) into a structured semantic representation, which encodes multiple overlapping events. Evaluated on four public datasets, our method achieves state-of-the-art performance, outperforming all compared models. 
Analytical experiments show that the multigraph representation is effective in addressing the argument multiplexing problem and helps improve the discriminability of the neural network for event extraction.</p></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"332 ","pages":"Article 104144"},"PeriodicalIF":14.4,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140843426","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring the psychology of LLMs’ moral and legal reasoning","authors":"Guilherme F.C.F. Almeida , José Luiz Nunes , Neele Engelmann , Alex Wiegmann , Marcelo de Araújo","doi":"10.1016/j.artint.2024.104145","DOIUrl":"https://doi.org/10.1016/j.artint.2024.104145","url":null,"abstract":"<div><p>Large language models (LLMs) exhibit expert-level performance in tasks across a wide range of different domains. The ethical issues raised by LLMs and the need to align future versions make it important to know how state-of-the-art models reason about moral and legal issues. In this paper, we employ the methods of experimental psychology to probe into this question. We replicate eight studies from the experimental literature with instances of Google's Gemini Pro, Anthropic's Claude 2.1, OpenAI's GPT-4, and Meta's Llama 2 Chat 70b. We find that alignment with human responses shifts from one experiment to another, and that models differ amongst themselves in their overall alignment, with GPT-4 taking a clear lead over all the other models we tested. Nonetheless, even when LLM-generated responses are highly correlated with human responses, there are still systematic differences, with a tendency for models to exaggerate effects that are present among humans, in part by reducing variance. 
This warrants caution with regard to proposals to replace human participants with current state-of-the-art LLMs in psychological research, and it highlights the need for further research on the distinctive aspects of machine psychology.</p></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"333 ","pages":"Article 104145"},"PeriodicalIF":14.4,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140913989","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mitigating social biases of pre-trained language models via contrastive self-debiasing with double data augmentation","authors":"Yingji Li , Mengnan Du , Rui Song , Xin Wang , Mingchen Sun , Ying Wang","doi":"10.1016/j.artint.2024.104143","DOIUrl":"https://doi.org/10.1016/j.artint.2024.104143","url":null,"abstract":"<div><p>Pre-trained Language Models (PLMs) have been shown to inherit and even amplify the social biases contained in the training corpus, leading to undesired stereotypes in real-world applications. Existing techniques for mitigating the social biases of PLMs mainly rely on data augmentation with manually designed prior knowledge or on fine-tuning with abundant external corpora to debias. However, these methods are not only limited by manually designed experience, but also consume substantial resources to access all the parameters of the PLMs and are prone to introducing new external biases when fine-tuning with external corpora. In this paper, we propose a <u>C</u>ontrastive Self-<u>D</u>ebiasing Model with <u>D</u>ouble <u>D</u>ata Augmentation (named CD<sup>3</sup>) for mitigating the social biases of PLMs. Specifically, CD<sup>3</sup> consists of two stages: double data augmentation and contrastive self-debiasing. First, we build on counterfactual data augmentation to perform a secondary augmentation using biased prompts that are automatically searched by maximizing the differences in PLMs' encoding across demographic groups. Double data augmentation further amplifies the biases between sample pairs to break the limitations of previous debiasing models that heavily rely on prior knowledge in data augmentation. We then leverage the augmented data for contrastive learning to train a plug-and-play adapter that mitigates the social biases in PLMs' encoding without tuning the PLMs. 
Extensive experimental results on BERT, ALBERT, and RoBERTa on several real-world datasets and fairness metrics show that CD<sup>3</sup> outperforms baseline models on gender debiasing and race debiasing while retaining the language modeling capabilities of PLMs.</p></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"332 ","pages":"Article 104143"},"PeriodicalIF":14.4,"publicationDate":"2024-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140879371","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Iterative voting with partial preferences","authors":"Zoi Terzopoulou , Panagiotis Terzopoulos , Ulle Endriss","doi":"10.1016/j.artint.2024.104133","DOIUrl":"https://doi.org/10.1016/j.artint.2024.104133","url":null,"abstract":"<div><p>Voting platforms can offer participants the option to sequentially modify their preferences whenever they have a reason to do so. But such iterative voting may never converge, meaning that a state where all agents are happy with their submitted preferences may never be reached. This problem has received increasing attention within the area of computational social choice. Yet, the relevant literature hinges on the rather stringent assumption that the agents are able to rank all alternatives they are presented with, i.e., that they hold preferences that are linear orders. We relax this assumption and investigate iterative voting under partial preferences. To that end, we define and study two families of rules that extend the well-known <em>k</em>-approval rules in the standard voting framework. Although we show that for none of these rules convergence is guaranteed in general, we are also able to identify natural conditions under which such guarantees can be given. 
Finally, we conduct simulation experiments to test the practical implications of our results.</p></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"332 ","pages":"Article 104133"},"PeriodicalIF":14.4,"publicationDate":"2024-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0004370224000699/pdfft?md5=f45969a9dc2b0460f68ac8a900765bbd&pid=1-s2.0-S0004370224000699-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140639115","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
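The convergence phenomenon this abstract studies can be made concrete with a small best-response simulation. The sketch below uses plurality (1-approval) with lexicographic tie-breaking over complete linear preferences, i.e., the classical special case that the paper's k-approval extensions generalize; the profile and the dynamics are illustrative assumptions, not the paper's rules for partial preferences:

```python
def plurality_winner(ballots, cands):
    # Highest score wins; ties broken lexicographically by candidate name.
    scores = {c: sum(1 for b in ballots if b == c) for c in cands}
    return min(cands, key=lambda c: (-scores[c], c))

def iterative_voting(prefs, cands, max_rounds=100):
    """Myopic best-response dynamics: each voter in turn switches their single
    approved candidate if that yields a winner they rank strictly higher.
    Returns the stable ballots and winner, or None if no convergence occurs
    within max_rounds full passes."""
    ballots = [p[0] for p in prefs]  # start from truthful ballots
    for _ in range(max_rounds):
        changed = False
        for i, pref in enumerate(prefs):
            best_ballot = ballots[i]
            best_rank = pref.index(plurality_winner(ballots, cands))
            for c in cands:
                trial = ballots[:]
                trial[i] = c
                rank = pref.index(plurality_winner(trial, cands))
                if rank < best_rank:
                    best_rank, best_ballot = rank, c
            if best_ballot != ballots[i]:
                ballots[i] = best_ballot
                changed = True
        if not changed:  # a full pass with no improving move: converged
            return ballots, plurality_winner(ballots, cands)
    return None

# Three voters with linear preferences (best first) over three candidates.
prefs = [["a", "b", "c"], ["b", "c", "a"], ["c", "b", "a"]]
result = iterative_voting(prefs, ["a", "b", "c"])
```

In this profile the truthful ballots tie and "a" wins by tie-breaking, so the second voter deviates to "c"; after that no voter has an improving move and the dynamics stabilize with "c" as winner. The paper's setting replaces the linear orders above with partial preferences, where even identifying an "improving" move becomes subtle.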