Nikoleta Anesti, Eleni Kalamara, George Kapetanios
Econometrics and Statistics. Published 2024-09-06. DOI: 10.1016/j.ecosta.2024.08.003 (https://doi.org/10.1016/j.ecosta.2024.08.003)
Forecasting with Machine Learning methods and multiple large datasets
The usefulness of machine learning techniques for forecasting macroeconomic variables using multiple large datasets is considered. The predictive content of surveys is compared with that of text-based indicators from newspaper articles and a standard macroeconomic dataset, extending the evidence on the contribution of each dataset to predicting economic activity. Among the linear models, Ridge regression and Partial Least Squares report the largest gains consistently for most forecasting horizons. Among the non-linear machine learning models, Support Vector Regression performs better at shorter horizons, whereas Neural Networks and Random Forest yield more accurate forecasts up to two years ahead. Text-based indicators have similar informational content to surveys, although combining the two datasets provides more accurate forecasts for most forecast horizons. The largest forecasting gains are overwhelmingly concentrated at the shorter horizons for the majority of models and datasets, and they decrease significantly after one year. Non-linear machine learning models appear to be mostly useful during the Great Financial Crisis and perform similarly to their linear counterparts in more normal periods.
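As a reading aid, the notion of a "forecasting gain" can be sketched as the ratio of a model's root mean squared error (RMSE) to that of a benchmark forecast, with values below 1 indicating a gain. The abstract does not state which loss function or benchmark the authors use, so the relative-RMSE measure, the function names `rmse` and `relative_rmse`, and all numbers below are assumptions chosen purely for illustration.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = ((a - p) ** 2 for a, p in zip(actual, predicted))
    return sqrt(sum(squared_errors) / len(actual))


def relative_rmse(actual, model_forecast, benchmark_forecast):
    """RMSE of the model divided by RMSE of the benchmark.

    A value below 1 means the model is more accurate than the benchmark.
    """
    return rmse(actual, model_forecast) / rmse(actual, benchmark_forecast)


# Hypothetical numbers for illustration only (not taken from the paper).
actual = [1.0, 2.0, 3.0, 4.0]
benchmark = [2.0, 3.0, 4.0, 5.0]  # every benchmark forecast is off by 1.0
model = [1.5, 2.5, 3.5, 4.5]      # every model forecast is off by 0.5

ratio = relative_rmse(actual, model, benchmark)
print(ratio)  # 0.5
```

Computing such a ratio separately for each forecast horizon is one way to see whether gains concentrate at shorter horizons, as the abstract reports.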
About the journal:
Econometrics and Statistics is the official journal of the networks Computational and Financial Econometrics and Computational and Methodological Statistics. It publishes research papers in all aspects of econometrics and statistics and comprises two sections, Part A: Econometrics and Part B: Statistics.