A Triangulation Meta-Learning Framework for Imputing Missing Values in Weather Time Series
Vinícius H. A. Alves, Marconi de Arruda Pereira
Revista Brasileira de Cartografia, published 2021-10-18
DOI: https://doi.org/10.14393/rbcv73n4-59795
Abstract
Machine learning and statistical methods can help model meteorological phenomena, especially when many variables are involved. However, it is not unusual for measurements of those variables to fail, leaving gaps in the data and compromising analysis of the historical record. This paper presents a framework for imputing such gaps. It combines the predictions of three machine learning methods (decision trees, artificial neural networks, and support vector machines) with values calculated by five triangulation methods (arithmetic average, inverse distance weighted, optimized inverse distance weighted, optimized normal ratio, and regional weight). Each machine learning algorithm generates eight regression models: one makes predictions based only on the date, and each of the remaining seven makes predictions based on the date plus one weather parameter (maximum temperature, minimum temperature, insolation, among others). The triangulation methods use climatic data from three neighboring cities to estimate the parameter for the target city. The resulting dataset of base-level outputs is subsequently optimized by meta-learning algorithms. The results show that the additional information provided by the new machine learning models and the triangulation methods significantly increased the accuracy of the imputed data. Moreover, the statistical analysis and the coefficient of determination (R²) showed that the meta-learning model based on regression trees successfully combined the base-level outputs to produce the values that best fill in the missing values of the time series studied in this paper.
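The triangulation step can be illustrated with a minimal Python sketch of three estimators in their textbook form: arithmetic average, inverse distance weighted, and normal ratio. The abstract does not describe how the "optimized" variants or the regional weight method are computed, so they are not reproduced here. The function names, the distance exponent `power=2.0`, and all station values below are assumptions made for illustration, not data or code from the paper.

```python
def arithmetic_average(values):
    """Estimate the target reading as the plain mean of the neighbor readings."""
    return sum(values) / len(values)


def inverse_distance_weighted(values, distances, power=2.0):
    """Weight each neighbor reading by 1 / distance**power (power is an assumed default)."""
    weights = [1.0 / d ** power for d in distances]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def normal_ratio(values, neighbor_normals, target_normal):
    """Scale each neighbor reading by target_normal / neighbor_normal, then average."""
    scaled = [v * target_normal / n for v, n in zip(values, neighbor_normals)]
    return sum(scaled) / len(scaled)


# Hypothetical maximum temperatures (degrees C) at three neighboring cities
# on the day that is missing at the target city.
neighbor_values = [30.0, 28.0, 26.0]
# Hypothetical distances (km) from each neighbor to the target city.
neighbor_distances_km = [10.0, 20.0, 40.0]
# Hypothetical long-term mean maximum temperatures ("normals") for each city.
neighbor_normals = [31.0, 29.0, 27.0]
target_normal = 29.0

estimates = {
    "arithmetic_average": arithmetic_average(neighbor_values),
    "inverse_distance_weighted": inverse_distance_weighted(
        neighbor_values, neighbor_distances_km
    ),
    "normal_ratio": normal_ratio(neighbor_values, neighbor_normals, target_normal),
}

for name, value in estimates.items():
    print(f"{name}: {value:.2f}")
```

In a setup like the one the abstract describes, each such estimate would become one column of the base-level dataset, alongside the predictions of the machine learning models, and the meta-learner would be trained on those columns.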