{"title":"Formal concept views for explainable boosting: A lattice-theoretic framework for Extreme Gradient Boosting and Gradient Boosting Models","authors":"Sherif Eneye Shuaib , Pakwan Riyapan , Jirapond Muangprathub","doi":"10.1016/j.iswa.2025.200569","DOIUrl":null,"url":null,"abstract":"<div><div>Tree-based ensemble methods, such as Extreme Gradient Boosting (XGBoost) and Gradient Boosting models (GBM), are widely used for supervised learning due to their strong predictive capabilities. However, their complex architectures often hinder interpretability. This paper extends a lattice-theoretic framework originally developed for Random Forests to boosting algorithms, enabling a structured analysis of their internal logic via formal concept analysis (FCA).</div><div>We formally adapt four conceptual views: leaf, tree, tree predicate, and interordinal predicate to account for the sequential learning and optimization processes unique to boosting. Using the binary-class version of the car evaluation dataset from the OpenML CC18 benchmark suite, we conduct a systematic parameter study to examine how hyperparameters, such as tree depth and the number of trees, affect both model performance and conceptual complexity. Random Forest results from prior literature are used as a comparative baseline.</div><div>The results show that XGBoost yields the highest test accuracy, while GBM demonstrates greater stability in generalization error. Conceptually, boosting methods generate more compact and interpretable leaf views but preserve rich structural information in higher-level views. In contrast, Random Forests tend to produce denser and more redundant concept lattices. These trade-offs highlight how boosting methods, when interpreted through FCA, can strike a balance between performance and transparency.</div><div>Overall, this work contributes to explainable AI by demonstrating how lattice-based conceptual views can be systematically extended to complex boosting models, offering interpretable insights without sacrificing predictive power.</div></div>","PeriodicalId":100684,"journal":{"name":"Intelligent Systems with Applications","volume":"27 ","pages":"Article 200569"},"PeriodicalIF":4.3000,"publicationDate":"2025-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Intelligent Systems with Applications","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S266730532500095X","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Tree-based ensemble methods, such as Extreme Gradient Boosting (XGBoost) and Gradient Boosting models (GBM), are widely used for supervised learning due to their strong predictive capabilities. However, their complex architectures often hinder interpretability. This paper extends a lattice-theoretic framework originally developed for Random Forests to boosting algorithms, enabling a structured analysis of their internal logic via formal concept analysis (FCA).
We formally adapt four conceptual views (leaf, tree, tree predicate, and interordinal predicate) to account for the sequential learning and optimization processes unique to boosting. Using the binary-class version of the car evaluation dataset from the OpenML CC18 benchmark suite, we conduct a systematic parameter study to examine how hyperparameters, such as tree depth and the number of trees, affect both model performance and conceptual complexity. Random Forest results from prior literature serve as a comparative baseline.
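As a rough illustration of this experimental setup (not the authors' code), the sketch below trains XGBoost and scikit-learn's GradientBoostingClassifier on the OpenML car evaluation data over a small grid of tree depths and ensemble sizes. The target binarization ("unacc" vs. the rest) and the particular grid values are assumptions made for illustration; the paper's binary-class version of the dataset may differ.

```python
# Minimal sketch of a tree-depth / number-of-trees parameter sweep on the
# OpenML car evaluation data, binarized for illustration.
from itertools import product

import pandas as pd
from sklearn.datasets import fetch_openml
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

car = fetch_openml(name="car", version=1, as_frame=True)
X = pd.get_dummies(car.data, dtype=int)      # one-hot encode the categorical features
y = (car.target == "unacc").astype(int)      # assumed binarization; the paper's split may differ

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0, stratify=y)

results = []
for depth, n_trees in product([2, 4, 6], [50, 100, 200]):
    for name, model in [
        ("XGBoost", XGBClassifier(max_depth=depth, n_estimators=n_trees)),
        ("GBM", GradientBoostingClassifier(max_depth=depth, n_estimators=n_trees)),
    ]:
        model.fit(X_tr, y_tr)
        results.append({"model": name, "depth": depth, "trees": n_trees,
                        "test_acc": model.score(X_te, y_te)})

print(pd.DataFrame(results).sort_values("test_acc", ascending=False).head())
```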
The results show that XGBoost yields the highest test accuracy, while GBM demonstrates greater stability in generalization error. Conceptually, boosting methods generate more compact and interpretable leaf views but preserve rich structural information in higher-level views. In contrast, Random Forests tend to produce denser and more redundant concept lattices. These trade-offs highlight how boosting methods, when interpreted through FCA, can strike a balance between performance and transparency.
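To make the leaf view concrete, the following sketch (our reading of the construction, not code from the paper) builds a binary formal context from a fitted GBM in which the objects are samples, the attributes are (tree, leaf) membership predicates, and a sample has an attribute exactly when it lands in that leaf. The synthetic data and the reference to the `concepts` package in the closing comments are assumptions for illustration.

```python
# Sketch of a "leaf view" formal context for a trained gradient boosting model.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=200, n_features=8, random_state=0)
gbm = GradientBoostingClassifier(n_estimators=20, max_depth=3, random_state=0).fit(X, y)

leaves = gbm.apply(X)[:, :, 0]                 # (n_samples, n_trees): leaf index per tree
objects = [f"x{i}" for i in range(X.shape[0])]
attributes, columns = [], []
for t in range(leaves.shape[1]):
    for leaf_id in np.unique(leaves[:, t]):
        attributes.append(f"tree{t}_leaf{int(leaf_id)}")
        columns.append(leaves[:, t] == leaf_id)
incidence = np.column_stack(columns)           # binary object-by-attribute context

# The concept lattice of this context could then be computed with an FCA tool,
# e.g. the `concepts` package (assumed available):
# from concepts import Context
# ctx = Context(objects, attributes, [tuple(row) for row in incidence])
# print(len(ctx.lattice))                      # number of formal concepts in the leaf view
```

Comparing the number of formal concepts across models and hyperparameter settings is one way to quantify the conceptual complexity trade-offs described above.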
Overall, this work contributes to explainable AI by demonstrating how lattice-based conceptual views can be systematically extended to complex boosting models, offering interpretable insights without sacrificing predictive power.