A unified approach to extract interpretable rules from tree ensembles via Integer Programming

IF 4.3 | Zone 2 (Engineering & Technology) | JCR Q2, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
Lorenzo Bonasera, Emilio Carrizosa
Computers & Operations Research, volume 185, Article 107283. Published 2025-09-19.
DOI: 10.1016/j.cor.2025.107283
https://www.sciencedirect.com/science/article/pii/S0305054825003120
Citations: 0

Abstract

Tree ensembles are widely used machine learning models, known for their effectiveness in supervised classification and regression tasks. Their performance derives from aggregating predictions of multiple decision trees, which are renowned for their interpretability properties. However, tree ensemble models do not reliably exhibit interpretable output. Our work aims to extract an optimized list of rules from a trained tree ensemble, providing the user with a condensed, interpretable model that retains most of the predictive power of the full model. Our approach consists of solving a set partitioning problem formulated through Integer Programming. The extracted list of rules is unweighted and defines a partition of the training data, assigning each instance to exactly one rule, and thereby simplifying the explanation process. The proposed method works with tabular or time series data, for both classification and regression tasks, and its flexible formulation can include any arbitrary loss or regularization functions. Our computational experiments offer statistically significant evidence that our method performs comparably to several rule extraction methods in terms of predictive performance and fidelity towards the tree ensemble. Moreover, we empirically show that the proposed method effectively extracts interpretable rules from tree ensembles that are designed for time series data.
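
To illustrate the set partitioning idea described in the abstract, the following minimal sketch selects a subset of candidate rules so that every training instance is covered by exactly one selected rule, while minimizing a loss plus a per-rule regularization term. It is not the authors' formulation: the function name `select_rules`, the toy coverage sets, the cost values, and the penalty are all hypothetical, and the problem is solved by exhaustive enumeration rather than by an Integer Programming solver, which is only practical for a handful of rules.

```python
from itertools import combinations


def select_rules(cover, cost, n_instances, penalty=0.0):
    """Exhaustively solve a tiny set partitioning problem.

    cover[j] : set of training-instance indices covered by candidate rule j
    cost[j]  : loss incurred if rule j is selected (hypothetical values)
    penalty  : regularization term added once per selected rule
    Returns (best_objective, selected_rule_indices), or (None, None)
    if no subset of rules partitions the instances.
    """
    everything = set(range(n_instances))
    best_obj, best_sel = None, None
    for k in range(1, len(cover) + 1):
        for sel in combinations(range(len(cover)), k):
            covered = [i for j in sel for i in cover[j]]
            # Partition constraint: each instance is covered exactly once,
            # so there are no duplicates and nothing is left uncovered.
            if len(covered) != n_instances or set(covered) != everything:
                continue
            obj = sum(cost[j] for j in sel) + penalty * k
            if best_obj is None or obj < best_obj:
                best_obj, best_sel = obj, sel
    return best_obj, best_sel


# Toy data (assumed): 4 training instances and 5 candidate rules that are
# taken to have been extracted from the trained ensemble beforehand.
cover = [{0, 1}, {2, 3}, {0, 1, 2, 3}, {0}, {1, 2, 3}]
cost = [1.0, 1.0, 3.0, 0.5, 1.0]

best_obj, best_sel = select_rules(cover, cost, n_instances=4, penalty=0.1)
print("selected rules:", best_sel, "objective:", best_obj)
```

In this toy instance three subsets form a valid partition (rules 0 and 1, rule 2 alone, rules 3 and 4), and the sketch returns the cheapest of them. A realistic instance would replace the enumeration with binary decision variables and equality constraints handed to an Integer Programming solver.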
Source journal
Computers & Operations Research (Engineering & Technology - Engineering: Industrial)
CiteScore: 8.60
Self-citation rate: 8.70%
Articles published: 292
Review time: 8.5 months
Journal description: Operations research and computers meet in a large number of scientific fields, many of which are of vital current concern to our troubled society. These include, among others, ecology, transportation, safety, reliability, urban planning, economics, inventory control, investment strategy and logistics (including reverse logistics). Computers & Operations Research provides an international forum for the application of computers and operations research techniques to problems in these and related fields.