Quoc-Thang Phan, Yuan-Kang Wu, Q. Phan, Hsin-Yen Lo
Published in: 2022 8th International Conference on Applied System Innovation (ICASI), 2022-04-22. DOI: 10.1109/ICASI55125.2022.9774453 (https://doi.org/10.1109/ICASI55125.2022.9774453)
A Study on Missing Data Imputation Methods for Improving Hourly Solar Dataset
In the era of big data, long periods of missing data are a common problem that degrades data quality and final forecasting results if not handled properly. Filling missing data is therefore important, since most real-time datasets contain a large number of missing values. This paper first gives a comprehensive overview of imputation methods for filling missing data. It then proposes a technique based on the popular Multivariate Imputation by Chained Equations (MICE) to fill numeric data in a PV dataset. Finally, it analyses the impact of this technique and compares its performance with other imputation algorithms. As a case study, this work uses historical measured PV generation from the North PV site of Taiwan, together with Numerical Weather Prediction (NWP) data consisting of solar irradiance, temperature, sea level pressure, humidity, rainfall, and wind speed. The NWP dataset, called Deterministic Weather Research and Forecasting (WRFD), is provided by the Taiwan Central Weather Bureau (CWB). Experimental results show that the proposed imputation algorithm can improve short-term PV generation forecasting accuracy, as measured by RMSE.
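The chained-equations idea behind MICE can be illustrated with a small sketch. This is not the authors' implementation: it is a simplified, deterministic single-imputation variant (full MICE draws several imputed datasets with random variation), it runs on synthetic data, and the column layout (irradiance, temperature, PV output), the coefficients, and all function names are assumptions made for illustration. Each column with gaps is regressed in turn on the other columns, and the cycle is repeated so that the imputed values refine one another. Mean imputation serves as the baseline for an RMSE comparison.

```python
import math
import random

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    return solve_linear(XtX, Xty)

def chained_impute(data, n_iter=10):
    """Fill None entries by cycling per-column linear regressions."""
    n_rows, n_cols = len(data), len(data[0])
    missing = [[v is None for v in row] for row in data]
    filled = [row[:] for row in data]
    # Step 1: start from column means of the observed values.
    for j in range(n_cols):
        obs = [row[j] for row in data if row[j] is not None]
        mean = sum(obs) / len(obs)
        for row in filled:
            if row[j] is None:
                row[j] = mean
    # Step 2: re-predict each incomplete column from the other columns.
    for _ in range(n_iter):
        for j in range(n_cols):
            if not any(m[j] for m in missing):
                continue
            others = [k for k in range(n_cols) if k != j]
            X_obs = [[filled[i][k] for k in others]
                     for i in range(n_rows) if not missing[i][j]]
            y_obs = [filled[i][j] for i in range(n_rows) if not missing[i][j]]
            beta = fit_ols(X_obs, y_obs)
            for i in range(n_rows):
                if missing[i][j]:
                    x = [filled[i][k] for k in others]
                    filled[i][j] = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    return filled

def rmse(pred, true):
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(pred))

# Synthetic stand-in for an hourly dataset: [irradiance, temperature, pv].
random.seed(0)
truth = []
for _ in range(200):
    irr = random.uniform(0, 1000)
    temp = random.uniform(15, 35)
    pv = 0.8 * irr - 2.0 * temp + random.gauss(0, 5)  # hypothetical relation
    truth.append([irr, temp, pv])

# Remove 40 PV values and 20 temperature values on distinct rows.
holes = random.sample(range(200), 60)
pv_holes, temp_holes = holes[:40], holes[40:]
data = [row[:] for row in truth]
for i in pv_holes:
    data[i][2] = None
for i in temp_holes:
    data[i][1] = None

imputed = chained_impute(data)

# Baseline: replace every missing PV value with the observed PV mean.
obs_pv = [row[2] for row in data if row[2] is not None]
mean_pv = sum(obs_pv) / len(obs_pv)
true_pv = [truth[i][2] for i in pv_holes]
rmse_chained = rmse([imputed[i][2] for i in pv_holes], true_pv)
rmse_mean = rmse([mean_pv] * len(pv_holes), true_pv)
print("RMSE chained:", rmse_chained, "RMSE mean:", rmse_mean)
```

On real data the regression step would typically be replaced by a model suited to each variable, and a maintained library implementation would normally be preferred over hand-written least squares.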