Authors: A. Eisenmann, T. Streubel, K. Rudion
Venue: The 9th Renewable Power Generation Conference (RPG Dublin Online 2021)
DOI: 10.1049/icp.2021.1392
An Investigation on Feature Extraction and Feature Selection for Power Quality Classification with High Resolution and RMS Data
This paper applies several state-of-the-art machine learning methods for structured data to the classification of power quality data sets. k-Nearest Neighbor, Support Vector Machine, Random Forest, XGBoost and LightGBM are compared on the classification of high-resolution and root mean square (RMS) data. Discrete wavelet transform and TsFresh are used to pre-process the high-resolution data and extract features. For feature selection, mutual information filtering, feature importance and sequential feature selection are tested. A distinguishing aspect of this investigation is the use of both high-resolution waveform data (sample rate 5 kS/s) and root mean square data (20 ms). The input data were synthesized from mathematical equations. The highest score is achieved by the XGBoost classifier in combination with the LightGBM feature selector, with an accuracy of 97.71% for root mean square data and 98.96% for the high-resolution data. Furthermore, the results illustrate how the classification result depends on the data structure, feature extraction, feature selection and the classifier.
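To make the two data representations concrete, the following minimal sketch synthesizes a voltage sag at 5 kS/s and reduces it to a 20 ms RMS series. It is an illustration only, not the authors' signal model: the 50 Hz fundamental, the sag depth and timing, the non-overlapping windows, and the names `synth_sag` and `windowed_rms` are all assumptions that do not appear in the abstract.

```python
import math

FS = 5000          # sample rate in samples/s (5 kS/s, from the abstract)
F0 = 50.0          # ASSUMED fundamental frequency in Hz (not stated in the abstract)
WINDOW_S = 0.020   # RMS window length in seconds (20 ms, from the abstract)


def synth_sag(duration_s=0.2, sag_start_s=0.06, sag_end_s=0.14, depth=0.5):
    """Return a sine of peak 1.0 p.u. whose amplitude is reduced by `depth` during the sag.

    Hypothetical helper: the sag parameters are illustrative defaults.
    """
    n = round(duration_s * FS)
    first = round(sag_start_s * FS)   # first sample index inside the sag
    last = round(sag_end_s * FS)      # first sample index after the sag
    samples = []
    for i in range(n):
        amp = 1.0 - depth if first <= i < last else 1.0
        samples.append(amp * math.sin(2 * math.pi * F0 * i / FS))
    return samples


def windowed_rms(samples, window_len):
    """RMS over consecutive non-overlapping windows; a trailing partial window is dropped."""
    out = []
    for start in range(0, len(samples) - window_len + 1, window_len):
        window = samples[start:start + window_len]
        out.append(math.sqrt(sum(x * x for x in window) / window_len))
    return out


waveform = synth_sag()                                   # high-resolution representation
rms_series = windowed_rms(waveform, round(WINDOW_S * FS))  # 20 ms RMS representation
```

Under these assumptions each 20 ms window covers exactly one 50 Hz cycle (100 samples), so the 1000-sample waveform collapses to ten RMS values. The sag remains visible in the RMS series as a drop in level, while waveform detail within a cycle is lost, which is one reason the two representations can call for different feature extraction.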