Multiple Fault Diagnosis in a Wind Turbine Gearbox with Autoencoder Data Augmentation and KPCA Dimension Reduction
Leonardo Oldani Felix, Dionísio Henrique Carvalho de Sá Só Martins, Ulisses Admar Barbosa Vicente Monteiro, Luiz Antonio Vaz Pinto, Luís Tarrataca, Carlos Alfredo Orfão Martins
Journal of Nondestructive Evaluation, 43(4), published 2024-10-07
DOI: 10.1007/s10921-024-01131-3
https://link.springer.com/article/10.1007/s10921-024-01131-3
Citations: 0
Abstract
Gearboxes are critical components that often operate under demanding conditions, with constant exposure to variable loads and speeds. In condition monitoring, datasets consist mostly of data from normal operating conditions, with significantly fewer instances of faulty conditions, which makes them imbalanced. To address this disparity, researchers have proposed various solutions aimed at improving the performance of classification models. One such solution is to balance the dataset before the training phase through oversampling. In this study, we used a Sparse Autoencoder for data augmentation and Support Vector Machine (SVM) and Random Forest (RF) for classification. We conducted four experiments to evaluate the impact of data imbalance on classifier performance: (1) using the original dataset without data augmentation, (2) employing partial data augmentation, (3) applying full data augmentation, and (4) balancing the dataset while using Kernel Principal Component Analysis (KPCA) for dimensionality reduction. Both algorithms achieved accuracies exceeding 90% even on the original non-augmented data. With partial data augmentation, both algorithms achieved accuracies above 98%. Full data augmentation yielded slightly better results than partial augmentation. After KPCA reduced the number of dimensions from 18 to 11, both classifiers maintained robust performance. SVM achieved an overall accuracy of 98.72%, while RF achieved 96.06%.
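The difference between the non-augmented, partially augmented, and fully augmented experiments can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, class labels, sample counts, and the 0.5 "partial" fraction are all hypothetical, and the Gaussian-jitter generator is only a stand-in for the Sparse Autoencoder used in the study. The sketch shows how many synthetic samples each class would need to reach a chosen fraction of the majority-class count, and where a trained generator would supply them.

```python
import random
from collections import Counter


def augmentation_plan(labels, fraction):
    """Return how many synthetic samples each class needs.

    fraction=0.0 leaves the data untouched, fraction=1.0 brings every class
    up to the majority-class count, and values in between give partial
    augmentation. The fraction values are illustrative assumptions.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    counts = Counter(labels)
    target = round(fraction * max(counts.values()))
    return {cls: max(0, target - n) for cls, n in counts.items()}


def jitter_oversample(samples, labels, plan, noise=0.01, seed=0):
    """Stand-in generator: copy random real samples and add Gaussian noise.

    This is NOT a sparse autoencoder; it only marks the place where a
    trained generator would supply the synthetic feature vectors.
    """
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    out_x, out_y = list(samples), list(labels)
    for cls, n_new in plan.items():
        for _ in range(n_new):
            base = rng.choice(by_class[cls])
            out_x.append([v + rng.gauss(0.0, noise) for v in base])
            out_y.append(cls)
    return out_x, out_y


# Toy imbalanced data with 18 features per sample (the dimensionality the
# abstract reports before KPCA); class names and counts are made up.
data_rng = random.Random(42)
labels = ["normal"] * 100 + ["fault_a"] * 20 + ["fault_b"] * 10
samples = [[data_rng.random() for _ in range(18)] for _ in labels]

partial_plan = augmentation_plan(labels, 0.5)  # hypothetical partial setting
full_plan = augmentation_plan(labels, 1.0)     # fully balanced
balanced_x, balanced_y = jitter_oversample(samples, labels, full_plan)

print(partial_plan)
print(full_plan)
print(Counter(balanced_y))
```

In a fuller pipeline, the balanced feature matrix would then go to the dimensionality-reduction and classification stages (KPCA followed by SVM or RF in the study); those stages are left out here to keep the sketch dependency-free.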
About the journal:
Journal of Nondestructive Evaluation provides a forum for the broad range of scientific and engineering activities involved in developing a quantitative nondestructive evaluation (NDE) capability. This interdisciplinary journal publishes papers on the development of new equipment, analyses, and approaches to nondestructive measurements.