A Boosting Approach to Constructing an Ensemble Stack

European Conference on Genetic Programming. Pub Date: 2022-11-28. DOI: 10.48550/arXiv.2211.15621

Zhi-feng Zhou, Ziyu Qiu, Bradley Niblett, A. Johnston, J. Schwartzentruber, Nur Zincir-Heywood, M. Heywood

Citations: 0

Abstract

An approach to evolutionary ensemble learning for classification is proposed in which boosting is used to construct a stack of programs. Each application of boosting identifies a single champion and a residual dataset, i.e., the training records that have not yet been correctly classified. The next program is trained only against the residual, and the process iterates until a maximum ensemble size is reached or no residual remains. Training against a residual dataset reduces the cost of training, since each successive program sees fewer records. Deploying the ensemble as a stack also means that a single classifier may be sufficient to make a prediction, which improves interpretability. Benchmarking studies illustrate that prediction accuracy is competitive with current state-of-the-art evolutionary ensemble learning algorithms, while the solutions are orders of magnitude simpler. Further benchmarking with a high-cardinality dataset indicates that the proposed method is also more accurate and efficient than XGBoost.
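
The loop described in the abstract (pick a champion, keep only the residual, repeat, then deploy the champions as a stack) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the champion here is a hypothetical stand-in (an exhaustively searched single-feature threshold rule) rather than an evolved program, and the rule by which a stack member "claims" a record is assumed, since the abstract does not specify it. The names `Layer`, `fit_champion`, `build_stack`, and `predict` are illustrative only.

```python
from typing import List, NamedTuple, Optional, Sequence, Tuple

Record = Tuple[Sequence[float], int]  # (feature vector, class label)


class Layer(NamedTuple):
    """One stack member: a threshold rule on a single feature (stand-in for a program)."""
    feature: int
    threshold: float
    below: bool  # True: claims x[feature] <= threshold; False: claims x[feature] > threshold
    label: int

    def claims(self, x: Sequence[float]) -> bool:
        return (x[self.feature] <= self.threshold) == self.below


def fit_champion(residual: List[Record]) -> Optional[Layer]:
    """Stand-in for one boosting run: return the rule that claims the most
    residual records while labelling every claimed record correctly."""
    best, best_cover = None, 0
    for f in range(len(residual[0][0])):
        for t in sorted({x[f] for x, _ in residual}):
            for below in (True, False):
                labels = [y for x, y in residual if (x[f] <= t) == below]
                if labels and len(set(labels)) == 1 and len(labels) > best_cover:
                    best, best_cover = Layer(f, t, below, labels[0]), len(labels)
    return best


def build_stack(data: List[Record], max_size: int = 10) -> Tuple[List[Layer], List[Record]]:
    """Iterate until the maximum ensemble size is reached or no residual remains."""
    stack: List[Layer] = []
    residual = list(data)
    while residual and len(stack) < max_size:
        champion = fit_champion(residual)
        if champion is None:  # no error-free rule exists for what is left
            break
        stack.append(champion)
        # The next champion is trained only on records not yet handled.
        residual = [(x, y) for x, y in residual if not champion.claims(x)]
    return stack, residual


def predict(stack: List[Layer], x: Sequence[float], default: int) -> Tuple[int, int]:
    """Walk the stack top-down; the first layer that claims x answers.
    Returns (label, number of layers consulted)."""
    for depth, layer in enumerate(stack, start=1):
        if layer.claims(x):
            return layer.label, depth
    return default, len(stack)


if __name__ == "__main__":
    toy = [([0.0], 0), ([1.0], 0), ([2.0], 1), ([3.0], 1), ([4.0], 0)]
    stack, leftover = build_stack(toy)
    print(len(stack), "layers;", len(leftover), "records left in the residual")
    print(predict(stack, [0.5], default=0))
```

In this sketch each later layer is fitted on a strictly smaller dataset, and a prediction stops at the first layer that claims the record, so many records are answered by a single rule. Both properties mirror the cost and interpretability arguments in the abstract, though the real method's champions and stopping rules may differ.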

