Li-xiang Duan, Mengyun Xie, Tangbo Bai, Jinjiang Wang
2016 IEEE International Conference on Prognostics and Health Management (ICPHM), published 2016-06-20. DOI: 10.1109/ICPHM.2016.7542846
Support vector data description for machinery multi-fault classification with unbalanced datasets
In mechanical fault diagnosis, fault samples are often difficult to obtain, so they are far outnumbered by normal samples, which produces unbalanced datasets. To address multi-class classification on such datasets, a novel model is proposed that combines SVDD (Support Vector Data Description) with a binary tree (BT) built on the Mahalanobis distance. The idea is to divide the original samples into a series of subsets with the binary tree and then build the classifiers by describing the boundary of the target class via SVDD. The method focuses on two points: 1) A separability measure based on the Mahalanobis distance. It expresses the degree of separability by taking into account both the degree of imbalance and the distance between classes, and because the Mahalanobis distance is defined over the relations among all features of the dataset, it helps determine the structure of the binary tree. 2) Training the classifiers with SVDD, where the target class at each step is chosen according to the order given by the binary tree. The method is applicable to multi-class problems with unbalanced datasets. To validate it, samples from an unbalanced rotor are used in an experiment. A comparison with other methods shows that the proposed method performs better and achieves higher classification accuracy on multi-class problems with unbalanced datasets.
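The abstract does not give the separability formula, so the following is only a minimal sketch of how a Mahalanobis-based ordering for the binary tree might look. The score used here (distance from a class mean to the nearest other class mean under a count-weighted pooled covariance), the function names, and the synthetic data are all assumptions made for illustration; they do not reproduce the authors' measure, and the SVDD training step is not shown.

```python
import numpy as np


def mahalanobis(u, v, cov):
    """Mahalanobis distance between vectors u and v under covariance cov."""
    diff = np.asarray(u, dtype=float) - np.asarray(v, dtype=float)
    # Solve cov @ z = diff rather than forming the inverse explicitly.
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))


def separability(samples, label):
    """Hypothetical score: distance from one class mean to the nearest other class mean."""
    x = samples[label]
    dists = []
    for other, y in samples.items():
        if other == label:
            continue
        # Assumption: pooled covariance weighted by each class's sample count.
        pooled = ((len(x) - 1) * np.cov(x, rowvar=False)
                  + (len(y) - 1) * np.cov(y, rowvar=False)) / (len(x) + len(y) - 2)
        dists.append(mahalanobis(x.mean(axis=0), y.mean(axis=0), pooled))
    return min(dists)


def tree_order(samples):
    """Order in which classes are split off the tree: most separable first."""
    remaining = dict(samples)
    order = []
    while len(remaining) > 1:
        best = max(remaining, key=lambda k: separability(remaining, k))
        order.append(best)
        del remaining[best]
    order.extend(remaining)  # the last class needs no further split
    return order


# Synthetic, unbalanced toy data (not from the paper): many normal samples, few fault samples.
rng = np.random.default_rng(0)
samples = {
    "normal": rng.normal([0, 0], 1.0, size=(200, 2)),
    "fault_a": rng.normal([3, 0], 1.0, size=(20, 2)),
    "fault_b": rng.normal([20, 20], 1.0, size=(15, 2)),
}
order = tree_order(samples)
print(order)  # the distant fault_b class is split off first
```

Each class in the resulting order would then serve as the target class of one SVDD classifier, trained on that class alone, with the remaining classes passed down to the next node of the tree.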