Portability analysis of data mining models for fog events forecasting
G. Zazzaro
Statistical Analysis and Data Mining: The ASA Data Science Journal, published 2021-12-30
DOI: 10.1002/sam.11568 (https://doi.org/10.1002/sam.11568)
Citation count: 0
Abstract
This article describes an analytical method for comparing geographical sites and for transferring fog forecasting models, trained with data mining techniques at one fixed site, to other Italian airports. The portability method relies on an intersite similarity measure: the Euclidean distance between the performance vectors associated with each airport site. A performance vector characterizes a geographical site, and its components are the performance metrics of an ensemble descriptive model. In the tests carried out, the comparison method gave very promising results, and the forecast model showed only a small decrease in performance when applied and evaluated on a new compatible site. The portability schema provides a meta-learning methodology for applying predictive models to new sites where a model cannot be trained from scratch, owing to the class imbalance problem or to a lack of data for site-specific learning. The methodology also offers a measure for clustering geographical sites and for extending weather knowledge from one site to another.
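
The intersite similarity measure described in the abstract can be illustrated with a minimal Python sketch. The abstract states only that the measure is a Euclidean distance between per-site performance vectors. It does not say which metrics make up a vector or how a "compatible" site is selected, so the site names, the three-component vectors, all numeric values, and the nearest-site rule below are hypothetical and serve only as an illustration.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two performance vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("performance vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def most_similar_site(target, performance_vectors):
    """Return the site whose performance vector is closest to the target's.

    Assumption: "compatible" is read here as "nearest in Euclidean
    distance"; the abstract does not give the actual selection rule.
    """
    distances = {
        site: euclidean_distance(performance_vectors[target], vector)
        for site, vector in performance_vectors.items()
        if site != target
    }
    return min(distances, key=distances.get)


# Hypothetical performance vectors. The components stand in for the
# performance metrics of an ensemble descriptive model; the values are
# invented and are not taken from the paper.
performance_vectors = {
    "site_A": [0.90, 0.70, 0.65],
    "site_B": [0.88, 0.72, 0.60],
    "site_C": [0.60, 0.40, 0.35],
}

closest = most_similar_site("site_A", performance_vectors)
```

Under these invented values, `site_B` lies much closer to `site_A` than `site_C` does, so a model trained at `site_A` would be considered for transfer to `site_B` first. The same pairwise distances could also feed a clustering step, in line with the abstract's remark that the measure supports clustering geographical sites.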