{"title":"Learning spatio-temporal dynamics on mobility networks for adaptation to open-world events","authors":"","doi":"10.1016/j.artint.2024.104120","DOIUrl":"10.1016/j.artint.2024.104120","url":null,"abstract":"<div><p>As a decisive part in the success of Mobility-as-a-Service (MaaS), spatio-temporal dynamics modeling on mobility networks is a challenging task particularly considering scenarios where open-world events drive mobility behavior deviated from the routines. While tremendous progress has been made to model high-level spatio-temporal regularities with deep learning, most, if not all of the existing methods are neither aware of the dynamic interactions among multiple transport modes on mobility networks, nor adaptive to unprecedented volatility brought by potential open-world events. In this paper, we are therefore motivated to improve the canonical spatio-temporal network (ST-Net) from two perspectives: (1) design a heterogeneous mobility information network (HMIN) to explicitly represent intermodality in multimodal mobility; (2) propose a memory-augmented dynamic filter generator (MDFG) to generate sequence-specific parameters in an on-the-fly fashion for various scenarios. The enhanced <u>e</u>vent-<u>a</u>ware <u>s</u>patio-<u>t</u>emporal <u>net</u>work, namely <strong>EAST-Net</strong>, is evaluated on several real-world datasets with a wide variety and coverage of open-world events. Both quantitative and qualitative experimental results verify the superiority of our approach compared with the state-of-the-art baselines. What is more, experiments show generalization ability of EAST-Net to perform zero-shot inference over different open-world events that have not been seen.</p></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"335 ","pages":"Article 104120"},"PeriodicalIF":5.1,"publicationDate":"2024-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141043763","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A multi-graph representation for event extraction","authors":"Hui Huang , Yanping Chen , Chuan Lin , Ruizhang Huang , Qinghua Zheng , Yongbin Qin","doi":"10.1016/j.artint.2024.104144","DOIUrl":"https://doi.org/10.1016/j.artint.2024.104144","url":null,"abstract":"<div><p>Event extraction has a trend in identifying event triggers and arguments in a unified framework, which has the advantage of avoiding the cascading failure in pipeline methods. The main problem is that joint models usually assume a one-to-one relationship between event triggers and arguments. It leads to the argument multiplexing problem, in which an argument mention can serve different roles in an event or shared by different events. To address this problem, we propose a multigraph-based event extraction framework. It allows parallel edges between any nodes, which is effective to represent semantic structures of an event. The framework enables the neural network to map a sentence(s) into a structurized semantic representation, which encodes multi-overlapped events. After evaluated on four public datasets, our method achieves the state-of-the-art performance, outperforming all compared models. Analytical experiments show that the multigraph representation is effective to address the argument multiplexing problem and helpful to advance the discriminability of the neural network for event extraction.</p></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"332 ","pages":"Article 104144"},"PeriodicalIF":14.4,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140843426","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring the psychology of LLMs’ moral and legal reasoning","authors":"Guilherme F.C.F. Almeida , José Luiz Nunes , Neele Engelmann , Alex Wiegmann , Marcelo de Araújo","doi":"10.1016/j.artint.2024.104145","DOIUrl":"https://doi.org/10.1016/j.artint.2024.104145","url":null,"abstract":"<div><p>Large language models (LLMs) exhibit expert-level performance in tasks across a wide range of different domains. Ethical issues raised by LLMs and the need to align future versions makes it important to know how state of the art models reason about moral and legal issues. In this paper, we employ the methods of experimental psychology to probe into this question. We replicate eight studies from the experimental literature with instances of Google's Gemini Pro, Anthropic's Claude 2.1, OpenAI's GPT-4, and Meta's Llama 2 Chat 70b. We find that alignment with human responses shifts from one experiment to another, and that models differ amongst themselves as to their overall alignment, with GPT-4 taking a clear lead over all other models we tested. Nonetheless, even when LLM-generated responses are highly correlated to human responses, there are still systematic differences, with a tendency for models to exaggerate effects that are present among humans, in part by reducing variance. This recommends caution with regards to proposals of replacing human participants with current state-of-the-art LLMs in psychological research and highlights the need for further research about the distinctive aspects of machine psychology.</p></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"333 ","pages":"Article 104145"},"PeriodicalIF":14.4,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140913989","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mitigating social biases of pre-trained language models via contrastive self-debiasing with double data augmentation","authors":"Yingji Li , Mengnan Du , Rui Song , Xin Wang , Mingchen Sun , Ying Wang","doi":"10.1016/j.artint.2024.104143","DOIUrl":"https://doi.org/10.1016/j.artint.2024.104143","url":null,"abstract":"<div><p>Pre-trained Language Models (PLMs) have been shown to inherit and even amplify the social biases contained in the training corpus, leading to undesired stereotype in real-world applications. Existing techniques for mitigating the social biases of PLMs mainly rely on data augmentation with manually designed prior knowledge or fine-tuning with abundant external corpora to debias. However, these methods are not only limited by artificial experience, but also consume a lot of resources to access all the parameters of the PLMs and are prone to introduce new external biases when fine-tuning with external corpora. In this paper, we propose a <u>C</u>ontrastive Self-<u>D</u>ebiasing Model with <u>D</u>ouble <u>D</u>ata Augmentation (named CD<sup>3</sup>) for mitigating social biases of PLMs. Specifically, CD<sup>3</sup> consists of two stages: double data augmentation and contrastive self-debiasing. First, we build on counterfactual data augmentation to perform a secondary augmentation using biased prompts that are automatically searched by maximizing the differences in PLMs' encoding across demographic groups. Double data augmentation further amplifies the biases between sample pairs to break the limitations of previous debiasing models that heavily rely on prior knowledge in data augmentation. We then leverage the augmented data for contrastive learning to train a plug-and-play adapter to mitigate the social biases in PLMs' encoding without tuning the PLMs. Extensive experimental results on BERT, ALBERT, and RoBERTa on several real-world datasets and fairness metrics show that CD<sup>3</sup> outperforms baseline models on gender debiasing and race debiasing while retaining the language modeling capabilities of PLMs.</p></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"332 ","pages":"Article 104143"},"PeriodicalIF":14.4,"publicationDate":"2024-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140879371","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Iterative voting with partial preferences","authors":"Zoi Terzopoulou , Panagiotis Terzopoulos , Ulle Endriss","doi":"10.1016/j.artint.2024.104133","DOIUrl":"https://doi.org/10.1016/j.artint.2024.104133","url":null,"abstract":"<div><p>Voting platforms can offer participants the option to sequentially modify their preferences, whenever they have a reason to do so. But such iterative voting may never converge, meaning that a state where all agents are happy with their submitted preferences may never be reached. This problem has received increasing attention within the area of computational social choice. Yet, the relevant literature hinges on the rather stringent assumption that the agents are able to rank all alternatives they are presented with, i.e., that they hold preferences that are linear orders. We relax this assumption and investigate iterative voting under partial preferences. To that end, we define and study two families of rules that extend the well-known <em>k</em>-approval rules in the standard voting framework. Although we show that for none of these rules convergence is guaranteed in general, we also are able to identify natural conditions under which such guarantees can be given. Finally, we conduct simulation experiments to test the practical implications of our results.</p></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"332 ","pages":"Article 104133"},"PeriodicalIF":14.4,"publicationDate":"2024-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0004370224000699/pdfft?md5=f45969a9dc2b0460f68ac8a900765bbd&pid=1-s2.0-S0004370224000699-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140639115","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Probabilistic reach-avoid for Bayesian neural networks","authors":"Matthew Wicker , Luca Laurenti , Andrea Patane , Nicola Paoletti , Alessandro Abate , Marta Kwiatkowska","doi":"10.1016/j.artint.2024.104132","DOIUrl":"https://doi.org/10.1016/j.artint.2024.104132","url":null,"abstract":"<div><p>Model-based reinforcement learning seeks to simultaneously learn the dynamics of an unknown stochastic environment and synthesise an optimal policy for acting in it. Ensuring the safety and robustness of sequential decisions made through a policy in such an environment is a key challenge for policies intended for safety-critical scenarios. In this work, we investigate two complementary problems: first, computing reach-avoid probabilities for iterative predictions made with dynamical models, with dynamics described by Bayesian neural network (BNN); second, synthesising control policies that are optimal with respect to a given reach-avoid specification (reaching a “target” state, while avoiding a set of “unsafe” states) and a learned BNN model. Our solution leverages interval propagation and backward recursion techniques to compute lower bounds for the probability that a policy's sequence of actions leads to satisfying the reach-avoid specification. Such computed lower bounds provide safety certification for the given policy and BNN model. We then introduce control synthesis algorithms to derive policies maximizing said lower bounds on the safety probability. We demonstrate the effectiveness of our method on a series of control benchmarks characterized by learned BNN dynamics models. On our most challenging benchmark, compared to purely data-driven policies the optimal synthesis algorithm is able to provide more than a four-fold increase in the number of certifiable states and more than a three-fold increase in the average guaranteed reach-avoid probability.</p></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"334 ","pages":"Article 104132"},"PeriodicalIF":5.1,"publicationDate":"2024-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141487483","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A unified momentum-based paradigm of decentralized SGD for non-convex models and heterogeneous data","authors":"Haizhou Du, Chaoqian Cheng, Chengdong Ni","doi":"10.1016/j.artint.2024.104130","DOIUrl":"https://doi.org/10.1016/j.artint.2024.104130","url":null,"abstract":"<div><p>Emerging distributed applications recently boosted the development of decentralized machine learning, especially in IoT and edge computing fields. In real-world scenarios, the common problems of non-convexity and data heterogeneity result in inefficiency, performance degradation, and development stagnation. The bulk of studies concentrate on one of the issues mentioned above without having a more general framework that has been proven optimal. To this end, we propose a unified paradigm called UMP, which comprises two algorithms <span>D-SUM</span> and <span>GT-DSUM</span> based on the momentum technique with decentralized stochastic gradient descent (SGD). The former provides a convergence guarantee for general non-convex objectives, while the latter is extended by introducing gradient tracking, which estimates the global optimization direction to mitigate data heterogeneity (<em>i.e.</em>, distribution drift). We can cover most momentum-based variants based on the classical heavy ball or Nesterov's acceleration with different parameters in UMP. In theory, we rigorously provide the convergence analysis of these two approaches for non-convex objectives and conduct extensive experiments, demonstrating a significant improvement in model accuracy up to 57.6% compared to other methods in practice.</p></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"332 ","pages":"Article 104130"},"PeriodicalIF":14.4,"publicationDate":"2024-04-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140639127","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Discrete preference games with logic-based agents: Formal framework, complexity, and islands of tractability","authors":"Gianluigi Greco, Marco Manna","doi":"10.1016/j.artint.2024.104131","DOIUrl":"https://doi.org/10.1016/j.artint.2024.104131","url":null,"abstract":"<div><p>Analyzing and predicting the dynamics of opinion formation in the context of social environments are problems that attracted much attention in literature. While grounded in social psychology, these problems are nowadays popular within the artificial intelligence community, where opinion dynamics are often studied via <em>game-theoretic</em> models in which individuals/agents hold opinions taken from a fixed set of <em>discrete</em> alternatives, and where the goal is to find those configurations where the opinions expressed by the agents emerge as a kind of compromise between their innate opinions and the social pressure they receive from the environments. As a matter of facts, however, these studies are based on very high-level and sometimes simplistic formalizations of the social environments, where the mental state of each individual is typically encoded as a variable taking values from a Boolean domain. To overcome these limitations, the paper proposes a framework generalizing such <em>discrete preference games</em> by modeling the reasoning capabilities of agents in terms of weighted propositional logics. It is shown that the framework easily encodes different kinds of earlier approaches and fits more expressive scenarios populated by conformist and dissenter agents. Problems related to the existence and computation of stable configurations are studied, under different theoretical assumptions on the structural shape of the social interactions and on the class of logic formulas that are allowed. Remarkably, during its trip to identify some relevant tractability islands, the paper devises a novel technical machinery whose significance goes beyond the specific application to analyzing opinion formation and diffusion, since it significantly enlarges the class of Integer Linear Programs that were known to be tractable so far.</p></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"332 ","pages":"Article 104131"},"PeriodicalIF":14.4,"publicationDate":"2024-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0004370224000675/pdfft?md5=266eeea1d429a8f4b48d22c14b6d529d&pid=1-s2.0-S0004370224000675-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140555675","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Critical observations in model-based diagnosis","authors":"Cody James Christopher , Alban Grastien","doi":"10.1016/j.artint.2024.104116","DOIUrl":"https://doi.org/10.1016/j.artint.2024.104116","url":null,"abstract":"<div><p>In this paper, we address the problem of finding the part of the observations that is useful for the diagnosis. We define a <em>sub-observation</em> as an abstraction of the observations. We then argue that a sub-observation is <em>sufficient</em> if it allows a diagnoser to derive the same minimal diagnosis as the original observations; and we define <em>critical observations</em> as a maximally abstracted sufficient sub-observation. We show how to compute a critical observation, and discuss a number of algorithmic improvements that also shed light on the theory of critical observations. Finally, we illustrate this framework on both state-based and event-based observations.</p></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"331 ","pages":"Article 104116"},"PeriodicalIF":14.4,"publicationDate":"2024-03-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0004370224000523/pdfft?md5=6feac947d7424f7afe8e0b763a360ed7&pid=1-s2.0-S0004370224000523-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140350627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Polarized message-passing in graph neural networks","authors":"Tiantian He , Yang Liu , Yew-Soon Ong , Xiaohu Wu , Xin Luo","doi":"10.1016/j.artint.2024.104129","DOIUrl":"https://doi.org/10.1016/j.artint.2024.104129","url":null,"abstract":"<div><p>In this paper, we present Polarized message-passing (PMP), a novel paradigm to revolutionize the design of message-passing graph neural networks (GNNs). In contrast to existing methods, PMP captures the power of node-node similarity and dissimilarity to acquire dual sources of messages from neighbors. The messages are then coalesced to enable GNNs to learn expressive representations from sparse but strongly correlated neighbors. Three novel GNNs based on the PMP paradigm, namely PMP graph convolutional network (PMP-GCN), PMP graph attention network (PMP-GAT), and PMP graph PageRank network (PMP-GPN) are proposed to perform various downstream tasks. Theoretical analysis is also conducted to verify the high expressiveness of the proposed PMP-based GNNs. In addition, an empirical study of five learning tasks based on 12 real-world datasets is conducted to validate the performances of PMP-GCN, PMP-GAT, and PMP-GPN. The proposed PMP-GCN, PMP-GAT, and PMP-GPN outperform numerous strong message-passing GNNs across all five learning tasks, demonstrating the effectiveness of the proposed PMP paradigm.</p></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"331 ","pages":"Article 104129"},"PeriodicalIF":14.4,"publicationDate":"2024-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0004370224000651/pdfft?md5=62b63fb2137ae3e5f64fb20e2a18fdb1&pid=1-s2.0-S0004370224000651-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140350635","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}