Random Forest as a Predictive Analytics Alternative to Regression in Institutional Research.

Q2 Social Sciences

Practical Assessment, Research and Evaluation Pub Date : 2018-01-01 DOI:10.7275/1WPR-M024

Lingjun He, R. Levine, J. Fan, Joshua Beemer, Jeanne Stronach


Citations: 36

Abstract


Random Forest as a Predictive Analytics Alternative to Regression in Institutional Research.

In institutional research, modern data mining approaches are seldom considered to address predictive analytics problems. The goal of this paper is to highlight the advantages of tree-based machine learning algorithms over classic (logistic) regression methods for data-informed decision making in higher education problems, and stress the success of random forest in circumstances where the regression assumptions are often violated in big data applications. Random forest is a model averaging procedure where each tree is constructed based on a bootstrap sample of the data set. In particular, we emphasize the ease of application, low computational cost, high predictive accuracy, flexibility, and interpretability of random forest machinery. Our overall recommendation is that institutional researchers look beyond classical regression and single decision tree analytics tools, and consider random forest as the predominant method for prediction tasks. The proposed points of view are detailed and illustrated through a simulation experiment and analyses of data from real institutional research projects.
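The abstract's description of random forest as a model averaging procedure over trees built on bootstrap samples can be illustrated with a minimal sketch. It is not the authors' code or data. Several things here are assumptions made for illustration: the toy student records (GPA, credits, retained) are hypothetical, each "tree" is a one-split stump rather than a fully grown tree, and each stump is restricted to a random subset of features, a step taken from the standard random forest algorithm that the abstract does not mention.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement: one bootstrap sample."""
    return [rng.choice(rows) for _ in rows]


def fit_stump(rows, feature_indices):
    """Fit a one-split tree: choose the (feature, threshold) with fewest errors."""
    best = None
    for j in feature_indices:
        for x, _ in rows:
            t = x[j]  # candidate threshold taken from the sample itself
            left = [y for xi, y in rows if xi[j] <= t]
            right = [y for xi, y in rows if xi[j] > t]
            left_label = Counter(left).most_common(1)[0][0]
            # If nothing falls to the right, reuse the left label.
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, j, t, left_label, right_label)
    _, j, t, left_label, right_label = best
    return lambda x: left_label if x[j] <= t else right_label


def fit_forest(rows, n_trees=25, n_features_per_tree=1, seed=0):
    """Fit each stump on its own bootstrap sample and random feature subset."""
    rng = random.Random(seed)
    n_features = len(rows[0][0])
    trees = []
    for _ in range(n_trees):
        sample = bootstrap_sample(rows, rng)
        features = rng.sample(range(n_features), n_features_per_tree)
        trees.append(fit_stump(sample, features))
    return trees


def predict(trees, x):
    """Average the models by majority vote over the individual trees."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]


# Hypothetical records: ((high-school GPA, first-term credits), retained?)
rows = [
    ((2.0, 6), 0), ((2.3, 9), 0), ((2.5, 8), 0), ((2.8, 10), 0),
    ((3.1, 13), 1), ((3.3, 15), 1), ((3.6, 14), 1), ((3.9, 16), 1),
]

forest = fit_forest(rows)
print(predict(forest, (1.9, 5)), predict(forest, (4.0, 17)))
```

In practice one would reach for an established implementation (for example the `randomForest` package in R or `RandomForestClassifier` in scikit-learn) rather than hand-rolled stumps; the sketch only makes the bootstrap-then-average structure visible.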


Source journal

Practical Assessment, Research and Evaluation Social Sciences-Education

CiteScore

2.60

Self-citation rate

0.00%

Articles published