Outlier detection using flexible categorization and interrogative agendas

IF 6.7; Region 1: Computer Science; Q1: COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Decision Support Systems Pub Date : 2024-02-19 DOI:10.1016/j.dss.2024.114196

Marcel Boersma, Krishna Manoorkar, Alessandra Palmigiano, Mattia Panettiere, Apostolos Tzimoulis, Nachoem Wijnberg

Volume 180, Article 114196. Publication type: Journal Article. Article page: https://www.sciencedirect.com/science/article/pii/S0167923624000290

Citation count: 0

Abstract

Categorization is one of the basic tasks in machine learning and data analysis. Building on formal concept analysis (FCA), the starting point of the present work is that a given set of objects can be categorized in different ways, depending on the choice of the sets of features used to classify them, and that different sets of features may yield better or worse categorizations relative to the task at hand. In turn, the (a priori) choice of one particular set of features over another might be subjective and express a certain epistemic stance (e.g. interests, relevance, preferences) of an agent or a group of agents, namely, their interrogative agenda. In the present paper, we represent interrogative agendas as sets of features, and explore and compare different ways to categorize objects with respect to different sets of features (agendas). We first develop a simple unsupervised FCA-based algorithm for outlier detection which uses categorizations arising from different agendas. We then present a supervised meta-learning algorithm to learn suitable (fuzzy) agendas for categorization, represented as sets of features with different weights or masses. We combine this meta-learning algorithm with the unsupervised outlier detection algorithm to obtain a supervised outlier detection algorithm. We show that these algorithms perform on par with commonly used outlier detection algorithms on datasets commonly used in outlier detection. These algorithms provide both local and global explanations of their results.
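The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions, and not the authors' method. It assumes a binary formal context (each object mapped to its set of features), treats an agenda as a set of features, and scores an object by how small its FCA object concept is when only the agenda's features are considered. The names `object_closure` and `outlier_scores`, the score `1 - |closure| / |objects|`, and the hand-picked agenda weights are all hypothetical; in the paper the weights (masses) are learned by a supervised meta-learning step that is not reproduced here.

```python
def object_closure(context, obj, agenda):
    """Extent of the object concept of `obj` under `agenda`.

    `context` maps each object to its set of features. The intent is the
    set of obj's features that belong to the agenda; the extent is every
    object that has all of those features.
    """
    intent = context[obj] & agenda
    return {g for g, feats in context.items() if intent <= feats}


def outlier_scores(context, weighted_agendas):
    """Weighted average, over agendas, of 1 - |closure| / |objects|.

    `weighted_agendas` is a list of (agenda, weight) pairs. The scoring
    rule and the weights are illustrative assumptions, not taken from
    the paper.
    """
    n = len(context)
    total_weight = sum(w for _, w in weighted_agendas)
    scores = {}
    for obj in context:
        score = 0.0
        for agenda, w in weighted_agendas:
            size = len(object_closure(context, obj, agenda))
            # A small closure means few objects fall in the same category.
            score += w * (1.0 - size / n)
        scores[obj] = score / total_weight
    return scores


# Toy context: four objects, three binary features.
context = {
    "a": {"red", "round"},
    "b": {"red", "round"},
    "c": {"red", "round"},
    "d": {"red", "square"},
}

# Two agendas with hand-picked (not learned) weights.
weighted_agendas = [
    (frozenset({"red"}), 0.5),
    (frozenset({"round", "square"}), 0.5),
]

scores = outlier_scores(context, weighted_agendas)
for obj, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(obj, round(score, 3))
```

In this toy context, object `d` is indistinguishable from the others under the colour-only agenda but sits alone in its category under the shape agenda, so it receives the highest score. The per-agenda contributions also hint at how such a method can explain its results: one can read off which agenda made an object look unusual.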

Source journal

Decision Support Systems (Engineering & Technology - Computer Science: Artificial Intelligence)

CiteScore: 14.70

Self-citation rate: 6.70%

Publication volume: 119

Review time: 13 months

Journal description: The common thread of articles published in Decision Support Systems is their relevance to theoretical and technical issues in the support of enhanced decision making. The areas addressed may include foundations, functionality, interfaces, implementation, impacts, and evaluation of decision support systems (DSSs).