Interpretable optimisation-based approach for hyper-box classification.

IF 2.9 · Zone 3 (Computer Science) · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Machine Learning · Pub Date: 2025-01-01 · Epub Date: 2025-02-06 · DOI: 10.1007/s10994-024-06643-7

Georgios I Liapis, Sophia Tsoka, Lazaros G Papageorgiou

Machine Learning, volume 114, issue 3, article 51 (2025). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11861270/pdf/

Citations: 0

Abstract

Data classification is considered a fundamental research subject within the machine learning community. Researchers seek the improvement of machine learning algorithms in not only accuracy, but also interpretability. Interpretable algorithms allow humans to easily understand the decisions that a machine learning model makes, which is challenging for black box models. Mathematical programming-based classification algorithms have attracted considerable attention due to their ability to effectively compete with leading-edge algorithms in terms of both accuracy and interpretability. Meanwhile, the training of a hyper-box classifier can be mathematically formulated as a Mixed Integer Linear Programming (MILP) model and the predictions combine accuracy and interpretability. In this work, an optimisation-based approach is proposed for multi-class data classification using a hyper-box representation, thus facilitating the extraction of compact IF-THEN rules. The key novelty of our approach lies in the minimisation of the number and length of the generated rules for enhanced interpretability. Through a number of real-world datasets, it is demonstrated that the algorithm exhibits favorable performance when compared to well-known alternatives in terms of prediction accuracy and rule set simplicity.
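To make the link between hyper-boxes and IF-THEN rules concrete, here is a minimal Python sketch. It is not the authors' MILP training model; it only illustrates, under stated assumptions, how an already-fitted hyper-box (an axis-aligned box with lower and upper bounds on some features) reads as a rule, and how "number of rules" and "rule length" could be counted. The names `HyperBox`, `predict`, and `rule_set_size`, the feature names, and all bound values are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class HyperBox:
    """Axis-aligned box with a class label (hypothetical structure).

    Only features that the box actually constrains appear in `bounds`,
    so len(bounds) is the length of the corresponding rule.
    """
    label: str
    bounds: Dict[str, Tuple[float, float]]

    def contains(self, sample: Dict[str, float]) -> bool:
        # A sample is inside the box if every constrained feature is in range.
        return all(lo <= sample[f] <= hi for f, (lo, hi) in self.bounds.items())

    def to_rule(self) -> str:
        # Render the box as a human-readable IF-THEN rule.
        conds = " AND ".join(
            f"{lo} <= {f} <= {hi}" for f, (lo, hi) in self.bounds.items()
        )
        return f"IF {conds} THEN class = {self.label}"


def predict(boxes: List[HyperBox], sample: Dict[str, float]) -> Optional[str]:
    """Return the label of the first box containing the sample, else None.

    Assumption: how samples outside every box are handled is not stated
    in the abstract, so this sketch simply returns None.
    """
    for box in boxes:
        if box.contains(sample):
            return box.label
    return None


def rule_set_size(boxes: List[HyperBox]) -> Tuple[int, int]:
    """Return (number of rules, total number of conditions across rules)."""
    return len(boxes), sum(len(b.bounds) for b in boxes)


# Illustrative boxes with made-up bounds.
example_boxes = [
    HyperBox("A", {"x1": (0.0, 0.5)}),
    HyperBox("B", {"x1": (0.6, 1.0), "x2": (0.2, 0.8)}),
]

for box in example_boxes:
    print(box.to_rule())
print(predict(example_boxes, {"x1": 0.7, "x2": 0.5}))
print(rule_set_size(example_boxes))
```

In this sketch, fewer boxes and fewer constrained features per box give a shorter, easier-to-read rule set, which is the kind of simplicity the abstract says the proposed approach minimises.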


Source journal

Machine Learning (Engineering and Technology, Computer Science: Artificial Intelligence)

CiteScore: 11.00

Self-citation rate: 2.70%

Articles published: 162

Review time: 3 months

Journal description: Machine Learning serves as a global platform dedicated to computational approaches in learning. The journal reports substantial findings on diverse learning methods applied to various problems, offering support through empirical studies, theoretical analysis, or connections to psychological phenomena. It demonstrates the application of learning methods to solve significant problems and aims to enhance the conduct of machine learning research with a focus on verifiable and replicable evidence in published papers.