Tiago Lima Marinho, Diego Carvalho do Nascimento, Bruno Almeida Pimentel
Expert Systems, 41(9). Published 2024-04-25. DOI: 10.1111/exsy.13611. https://onlinelibrary.wiley.com/doi/10.1111/exsy.13611
Optimization on selecting XGBoost hyperparameters using meta-learning
As computing power has grown, machine learning algorithms have multiplied and become more complex and robust. This makes it harder to find, quickly and practically, the hyperparameters that configure each algorithm. This article uses meta-learning as a practical way to recommend hyperparameters: similar datasets are identified through their meta-feature structures, and the XGBoost hyperparameters already tuned on them are then adopted for a new dataset. This reduces computational cost and aims to make real-time decision-making feasible, or to spare companies extra costs when new information arrives. Experiments on 198 datasets attested to the success of this heuristic, which uses meta-learning to compare dataset structures. First, the datasets were characterized by combining three groups of meta-features (general, statistical, and information-theoretic), providing a way to measure the similarity between datasets and thus apply meta-learning to recommend hyperparameters. Next, the appropriate number of datasets (k) used to inform the XGBoost tuning was tested. The results were promising: XGBoost accuracy improved for k = 4 to 6 when the hyperparameter values were averaged, and the meta-learning methodology outperformed the standard grid-search hyperparameters set by default in 78.28% of the datasets. This study therefore shows that meta-learning is a competitive alternative for generalizing the XGBoost model, with better expected statistical performance (accuracy, etc.) than adjusting a single, particular model.
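The recommendation step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not list the individual meta-features, the distance measure, or the hyperparameters tuned, so the six toy meta-features, the standardized Euclidean distance, and the function names `meta_features` and `recommend` are all hypothetical. Only the overall idea (characterize datasets, find the k most similar ones, average their hyperparameters) comes from the text.

```python
import numpy as np


def meta_features(X, y):
    """Toy meta-feature vector covering the three groups named in the abstract.

    The specific features are assumptions chosen for illustration.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, d = X.shape
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    general = [np.log(n), np.log(d), len(counts)]   # size and number of classes
    statistical = [X.mean(), X.std(axis=0).mean()]  # simple summary statistics
    info_theory = [-(p * np.log2(p)).sum()]         # class entropy in bits
    return np.array(general + statistical + info_theory)


def recommend(new_mf, base_mfs, base_params, k=5):
    """Average the hyperparameters of the k datasets closest to the new one.

    base_mfs:    one meta-feature vector per known dataset
    base_params: one dict of already tuned hyperparameters per known dataset
    Distance is Euclidean on standardized meta-features (an assumption).
    """
    base_mfs = np.asarray(base_mfs, dtype=float)
    new_mf = np.asarray(new_mf, dtype=float)
    mu = base_mfs.mean(axis=0)
    sigma = base_mfs.std(axis=0)
    sigma[sigma == 0] = 1.0                         # avoid division by zero
    dist = np.linalg.norm((base_mfs - mu) / sigma - (new_mf - mu) / sigma, axis=1)
    nearest = np.argsort(dist)[:k]                  # indices of the k most similar datasets
    # Means are returned as floats; integer-valued hyperparameters
    # (for example a tree depth) would need rounding before use.
    return {
        name: float(np.mean([base_params[i][name] for i in nearest]))
        for name in base_params[0]
    }
```

In use, `meta_features` would be computed once for each dataset in the meta-base and once for the new dataset, and the dictionary returned by `recommend` would then configure the model without a fresh search. The default `k=5` sits inside the 4 to 6 range the abstract reports as working well.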
About the journal:
Expert Systems: The Journal of Knowledge Engineering publishes papers dealing with all aspects of knowledge engineering, including individual methods and techniques in knowledge acquisition and representation, and their application in the construction of systems – including expert systems – based thereon. Detailed scientific evaluation is an essential part of any paper.
As well as traditional application areas, such as Software and Requirements Engineering, Human-Computer Interaction, and Artificial Intelligence, we are aiming at the new and growing markets for these technologies, such as Business, Economy, Market Research, and Medical and Health Care. The shift towards this new focus will be marked by a series of special issues covering hot and emergent topics.