{"title":"Out-of-distribution detection by regaining lost clues","authors":"Zhilin Zhao , Longbing Cao , Philip S. Yu","doi":"10.1016/j.artint.2024.104275","DOIUrl":"10.1016/j.artint.2024.104275","url":null,"abstract":"<div><div>Out-of-distribution (OOD) detection identifies samples in the test phase that are drawn from distributions distinct from that of training in-distribution (ID) samples for a trained network. According to the information bottleneck, networks that classify tabular data tend to extract labeling information from features with strong associations to ground-truth labels, discarding less relevant labeling cues. This behavior leads to a predicament in which OOD samples with limited labeling information receive high-confidence predictions, rendering the network incapable of distinguishing between ID and OOD samples. Hence, exploring more labeling information from ID samples, which makes it harder for an OOD sample to obtain high-confidence predictions, can address this over-confidence issue on tabular data. Accordingly, we propose a novel transformer chain (TC), which comprises a sequence of dependent transformers that iteratively regain discarded labeling information and integrate all the labeling information to enhance OOD detection. The generalization bound theoretically reveals that TC can balance ID generalization and OOD detection capabilities. 
Experimental results demonstrate that TC significantly surpasses state-of-the-art methods for OOD detection in tabular data.</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"339 ","pages":"Article 104275"},"PeriodicalIF":5.1,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142867652","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Formal verification and synthesis of mechanisms for social choice","authors":"Munyque Mittelmann , Bastien Maubert , Aniello Murano , Laurent Perrussel","doi":"10.1016/j.artint.2024.104272","DOIUrl":"10.1016/j.artint.2024.104272","url":null,"abstract":"<div><div>Mechanism Design (MD) aims at defining resource allocation protocols that satisfy a predefined set of properties, and Auction Mechanisms are of foremost importance. Core properties of mechanisms, such as strategy-proofness or budget balance, involve: (i) complex strategic concepts such as Nash equilibria, (ii) quantitative aspects such as utilities, and often (iii) imperfect information, with agents' private valuations. We demonstrate that Strategy Logic provides a formal framework fit to model mechanisms and express such properties, and we show that it can be used either to automatically check that a given mechanism satisfies some property (verification), or automatically produce a mechanism that does (synthesis). To do so, we consider a quantitative variant of Strategy Logic. We first show how to express the implementation of social choice functions. Second, we show how fundamental mechanism properties can be expressed as logical formulas, and thus evaluated by model checking. We then prove that model checking for this particular variant of Strategy Logic can be done in polynomial space. Next, we show how MD can be rephrased as a synthesis problem, where mechanisms are automatically synthesized from a partial or complete logical specification. We solve the automated synthesis of mechanisms in two cases: when the number of actions is bounded, and when agents play in turns. Finally, we provide examples of auction design for each of these two cases. 
The benefit of our approach in relation to classical MD is to provide a general framework for addressing a large spectrum of MD problems, which is not tailored to a particular setting or problem.</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"339 ","pages":"Article 104272"},"PeriodicalIF":5.1,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142821057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-rank smart reserves: A general framework for selection and matching diversity goals","authors":"Haris Aziz , Zhaohong Sun","doi":"10.1016/j.artint.2024.104274","DOIUrl":"10.1016/j.artint.2024.104274","url":null,"abstract":"<div><div>We study a problem where each school has flexible multi-ranked diversity goals, and each student may belong to multiple overlapping types, and consumes only one of the positions reserved for their types. We propose a novel choice function for a school to select students and show that it is the unique rule that satisfies three fundamental properties: maximal diversity, non-wastefulness, and justified envy-freeness. We provide a fast polynomial-time algorithm for our choice function that is based on the Dulmage-Mendelsohn Decomposition Theorem as well as new insights into the combinatorial structure of constrained rank maximal matchings. Even for the case of minimum and maximum quotas for types (that capture two ranks), ours is the first known polynomial-time approach to compute an optimally diverse choice outcome. Finally, we prove that the choice function we design for schools satisfies substitutability and hence can be directly embedded in the generalized deferred acceptance algorithm to achieve strategyproofness and stability. 
Our algorithms and results have immediate policy implications and directly apply to a variety of scenarios, such as where hiring positions or scarce medical resources need to be allocated while taking into account diversity concerns or ethical principles.</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"339 ","pages":"Article 104274"},"PeriodicalIF":5.1,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142867651","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A simple yet effective self-debiasing framework for transformer models","authors":"Xiaoyue Wang , Xin Liu , Lijie Wang , Suhang Wu , Jinsong Su , Hua Wu","doi":"10.1016/j.artint.2024.104258","DOIUrl":"10.1016/j.artint.2024.104258","url":null,"abstract":"<div><div>Current Transformer-based natural language understanding (NLU) models heavily rely on dataset biases, while failing to handle real-world out-of-distribution (OOD) instances. Many methods have been proposed to deal with this issue, but they ignore the fact that the features learned in different layers of Transformer-based NLU models are different. In this paper, we first conduct preliminary studies to obtain two conclusions: 1) both low- and high-layer sentence representations encode common biased features during training; 2) the low-layer sentence representations encode fewer unbiased features than the high-layer ones. Based on these conclusions, we propose a simple yet effective self-debiasing framework for Transformer-based NLU models. Concretely, we first stack a classifier on a selected low layer. Then, we introduce a residual connection that feeds the low-layer sentence representation to the top-layer classifier. In this way, the top-layer sentence representation will be trained to ignore the common biased features encoded by the low-layer sentence representation and focus on task-relevant unbiased features. During inference, we remove the residual connection and directly use the top-layer sentence representation to make predictions. 
Extensive experiments and in-depth analyses on NLU tasks demonstrate the superiority of our framework, achieving a new state-of-the-art (SOTA) on three OOD test sets.</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"339 ","pages":"Article 104258"},"PeriodicalIF":5.1,"publicationDate":"2025-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142788883","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved metric distortion via threshold approvals","authors":"Elliot Anshelevich , Aris Filos-Ratsikas , Christopher Jerrett , Alexandros A. Voudouris","doi":"10.1016/j.artint.2025.104295","DOIUrl":"10.1016/j.artint.2025.104295","url":null,"abstract":"<div><div>We consider a social choice setting in which agents and alternatives are represented by points in a metric space, and the cost of an agent for an alternative is the distance between the corresponding points in the space. The goal is to choose a single alternative to (approximately) minimize the social cost (cost of all agents) or the maximum cost of any agent, when only limited information about the preferences of the agents is given. Previous work has shown that the best possible distortion one can hope to achieve is 3 when access to the ordinal preferences of the agents is given, even when the distances between alternatives in the metric space are known. We improve upon this bound of 3 by designing deterministic mechanisms that exploit a bit of cardinal information. We show that it is possible to achieve distortion <span><math><mn>1</mn><mo>+</mo><msqrt><mrow><mn>2</mn></mrow></msqrt></math></span> by using the ordinal preferences of the agents, the distances between alternatives, and a threshold approval set per agent that contains all alternatives that are at distance from the agent within an appropriately chosen factor of the minimum distance of the agents from any alternative. 
We show that this bound is the best possible for any deterministic mechanism in general metric spaces, and also provide improved bounds for the fundamental case of a line metric.</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"341 ","pages":"Article 104295"},"PeriodicalIF":5.1,"publicationDate":"2025-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143072460","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TTVAE: Transformer-based generative modeling for tabular data generation","authors":"Alex X. Wang , Binh P. Nguyen","doi":"10.1016/j.artint.2025.104292","DOIUrl":"10.1016/j.artint.2025.104292","url":null,"abstract":"<div><div>Tabular data synthesis presents unique challenges, with Transformer models remaining underexplored despite the applications of Variational Autoencoders and Generative Adversarial Networks. To address this gap, we propose the Transformer-based Tabular Variational AutoEncoder (TTVAE), leveraging the attention mechanism for capturing complex data distributions. The inclusion of the attention mechanism enables our model to understand complex relationships among heterogeneous features, a task often difficult for traditional methods. TTVAE facilitates the integration of interpolation within the latent space during the data generation process. Specifically, TTVAE is trained once, establishing a low-dimensional representation of real data, and then various latent interpolation methods can efficiently generate synthetic latent points. Through extensive experiments on diverse datasets, TTVAE consistently achieves state-of-the-art performance, highlighting its adaptability across different feature types and data sizes. 
This innovative approach, empowered by the attention mechanism and the integration of interpolation, addresses the complex challenges of tabular data synthesis, establishing TTVAE as a powerful solution.</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"340 ","pages":"Article 104292"},"PeriodicalIF":5.1,"publicationDate":"2025-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143031440","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Argumentative review aggregation and dialogical explanations","authors":"Antonio Rago , Oana Cocarascu , Joel Oksanen , Francesca Toni","doi":"10.1016/j.artint.2025.104291","DOIUrl":"10.1016/j.artint.2025.104291","url":null,"abstract":"<div><div>The aggregation of online reviews is one of the dominant methods of quality control for users in various domains, from retail to entertainment. Consequently, <em>explainable aggregation of reviews</em> is increasingly sought-after. We introduce quantitative argumentation technology to this setting, towards automatically generating reasoned review aggregations equipped with dialogical explanations. To this end, we define a novel form of <em>argumentative dialogical agent</em> (ADA), using ontologies to harbour information from reviews into argumentation frameworks. These agents may then be evaluated with a quantitative argumentation semantics and used to mediate the generation of dialogical explanations for item recommendations based on the reviews. We show how to deploy ADAs in three different contexts in which argumentation frameworks are mined from text, guided by ontologies. First, for hotel recommendations, we use a human-authored ontology and exemplify the potential range of dialogical explanations afforded by ADAs. Second, for movie recommendations, we empirically evaluate an ADA based on a bespoke ontology (extracted semi-automatically, by natural language processing), by demonstrating that its quantitative evaluations, which are shown to satisfy desirable theoretical properties, are comparable with those on a well-known movie review aggregation website. 
Finally, for product recommendation in e-commerce, we use another bespoke ontology (extracted fully automatically, by natural language processing, from a website's reviews) to construct an ADA which is then empirically evaluated favourably against review aggregations from the website.</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"340 ","pages":"Article 104291"},"PeriodicalIF":5.1,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143318247","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Maximum Likelihood Evidential Reasoning","authors":"Jian-Bo Yang, Dong-Ling Xu","doi":"10.1016/j.artint.2025.104289","DOIUrl":"10.1016/j.artint.2025.104289","url":null,"abstract":"<div><div>In this paper, we aim at generalising the <em><u>e</u></em>vidential <em><u>r</u></em>easoning (<em>ER</em>) rule to establish a new <em><u>ma</u></em>ximum li<em><u>k</u></em>elihood <em><u>e</u></em>vidential <em><u>r</u></em>easoning (<em>MAKER</em>) framework for probabilistic inference from inputs to outputs in a system space, with their relationships characterised by imperfect data. The <em>MAKER</em> framework consists of three models: <em><u>s</u></em>ystem <em><u>s</u></em>tate <em><u>m</u></em>odel (<em>SSM</em>), <em><u>e</u>vidence <u>a</u></em>cquisition <em><u>m</u></em>odel (<em>EAM</em>) and <em><u>e</u></em>vidential <em><u>r</u></em>easoning <em><u>m</u></em>odel (<em>ERM</em>). <em>SSM</em> is introduced to describe system output in the form of ordinary probability distribution on singleton states of the system space to model randomness only, or more generally basic probability distribution on singleton states and their subsets, referred to as states for short, to depict both randomness and ambiguity explicitly. <em>EAM</em> is established to acquire evidence from a data source as system input in the form of basic probability distribution on the evidential elements of the data source, with each evidential element pointing to a state in the system space. 
<em>ERM</em> is created to combine pieces of acquired evidence, with each represented in the form of basic probability distribution on all the states and the powerset of the system space to facilitate an augmented probabilistic inference process where the trustworthiness of evidence is explicitly modelled alongside its randomness and ambiguity.</div><div>Within the <em>MAKER</em> framework, the trustworthiness of evidence is defined in terms of its reliability and expected weight to measure the total degree of its support for all states. Interdependence between pairs of evidence is also measured explicitly. A general conjunctive <em>MAKER</em> rule and algorithm are then established to infer system output from multiple inputs by combining multiple pieces of evidence that have weights and reliabilities and are dependent on each other in general. Several special <em>MAKER</em> rules and algorithms are deduced to facilitate inference in special situations where evidence is exclusive or independent of each other. Specific conditions are identified and proven where the <em>MAKER</em> rule reduces to the <em>ER</em> rule, Dempster's rule and Bayes’ rule. A bi-objective nonlinear pre-emptive minimax optimisation model is built to make use of observed data for optimal learning of evidence weights and reliabilities by maximising the predicted likelihood of the true state for each observation. Two numerical examples are analysed to demonstrate the three constituent models of the <em>MAKER</em> framework, the <em>MAKER</em> rules and algorithms, and the optimal learning model. 
A case study for human well-being analysis is provided where data from a panel survey are used to show t","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"340 ","pages":"Article 104289"},"PeriodicalIF":5.1,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143317733","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning a fast 3D spectral approach to object segmentation and tracking over space and time","authors":"Elena Burceanu , Marius Leordeanu","doi":"10.1016/j.artint.2024.104281","DOIUrl":"10.1016/j.artint.2024.104281","url":null,"abstract":"<div><div>We pose video object segmentation as spectral graph clustering in space and time, with one graph node for each pixel and edges forming local space-time neighborhoods. We claim that the strongest cluster in this video graph represents the salient object. We start by introducing a novel and efficient method based on 3D filtering for approximating the spectral solution, as the principal eigenvector of the graph's adjacency matrix, without explicitly building the matrix. This key property allows us to have a fast parallel implementation on GPU, orders of magnitude faster than classical approaches for computing the eigenvector. Our motivation for a spectral space-time clustering approach, unique in video semantic segmentation literature, is that such clustering is dedicated to preserving object consistency over time, which we evaluate using our novel segmentation consistency measure. Further on, we show how to efficiently learn the solution over multiple input feature channels. Finally, we extend the formulation of our approach beyond the segmentation task, into the realm of object tracking. 
In extensive experiments we show significant improvements over top methods, as well as over powerful ensembles that combine them, achieving state-of-the-art on multiple benchmarks, both for tracking and segmentation.</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"340 ","pages":"Article 104281"},"PeriodicalIF":5.1,"publicationDate":"2025-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143318248","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Beyond incompatibility: Trade-offs between mutually exclusive fairness criteria in machine learning and law","authors":"Meike Zehlike , Alex Loosley , Håkan Jonsson , Emil Wiedemann , Philipp Hacker","doi":"10.1016/j.artint.2024.104280","DOIUrl":"10.1016/j.artint.2024.104280","url":null,"abstract":"<div><div>Fair and trustworthy AI is becoming ever more important in both machine learning and legal domains. One important consequence is that decision makers must seek to guarantee a ‘fair’, i.e., non-discriminatory, algorithmic decision procedure. However, there are several competing notions of algorithmic fairness that have been shown to be mutually incompatible under realistic factual assumptions. This concerns, for example, the widely used fairness measures of ‘calibration within groups’ and ‘balance for the positive/negative class,’ which relate to accuracy, false negative and false positive rates, respectively. In this paper, we present a novel algorithm (FAir Interpolation Method: FAIM) for continuously interpolating between these three fairness criteria. Thus, an initially unfair prediction can be remedied to meet, at least partially, a desired, weighted combination of the respective fairness conditions. We demonstrate the effectiveness of our algorithm when applied to synthetic data, the COMPAS data set, and a new, real-world data set from the e-commerce sector. We provide guidance on using our algorithm in different high-stakes contexts, and we discuss to what extent FAIM can be harnessed to comply with conflicting legal obligations. 
The analysis suggests that it may operationalize duties in traditional legal fields, such as credit scoring and criminal justice proceedings, but also for the latest AI regulations put forth in the EU, like the Digital Markets Act and the recently enacted AI Act.</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"340 ","pages":"Article 104280"},"PeriodicalIF":5.1,"publicationDate":"2025-01-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142967819","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}