A comparative investigation of the robustness of unsupervised clustering techniques for rotating machine fault diagnosis with poorly-separated data
Authors: Tapana Mekaroonkamon, S. Wongsa
Published in: 2016 Eighth International Conference on Advanced Computational Intelligence (ICACI), 2016-02-01
DOI: 10.1109/ICACI.2016.7449821 (https://doi.org/10.1109/ICACI.2016.7449821)
Data recorded in industry for rotating machine health monitoring are often voluminous and unlabelled, and labelling them manually is impractical. Unsupervised algorithms have traditionally been applied to address this challenge. However, when irrelevant features are included or the features are not selected properly, the clusters can be poorly separated and clustering performance deteriorates. It is therefore of interest to investigate how clustering techniques perform in these circumstances. This paper presents a comparative study of three well-known clustering techniques, namely the k-means, hierarchical and expectation-maximisation (EM) clustering algorithms, each combined with the Calinski-Harabasz index, Davies-Bouldin index, Gap value index and Silhouette index to determine the number of clusters, for both well- and poorly-separated clusters. Experimental results on two real bearing datasets show that the EM clustering algorithm combined with the Gap value index is the most efficient and robust method for determining the optimal number of clusters in the dataset and classifying the unlabelled data.
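The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn, uses synthetic well-separated blobs in place of the bearing datasets, takes a Gaussian mixture as the EM clusterer and Ward linkage as the hierarchical one, and computes the Gap value with a simplified gap statistic (uniform reference data drawn from the bounding box of the features). The function names `cluster_labels`, `select_k`, `log_wk` and `gap_select_k` are hypothetical.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.mixture import GaussianMixture


def cluster_labels(method, X, k, seed=0):
    """Partition X into k clusters with one of the three techniques."""
    if method == "kmeans":
        return KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
    if method == "hierarchical":
        return AgglomerativeClustering(n_clusters=k).fit_predict(X)  # Ward linkage
    if method == "em":
        return GaussianMixture(n_components=k, random_state=seed).fit_predict(X)
    raise ValueError(f"unknown method: {method}")


def select_k(method, X, k_values=range(2, 7)):
    """Return the number of clusters preferred by each internal index."""
    ch, db, sil = {}, {}, {}
    for k in k_values:
        labels = cluster_labels(method, X, k)
        if len(np.unique(labels)) < 2:
            continue  # the indices are undefined for a single cluster
        ch[k] = calinski_harabasz_score(X, labels)
        db[k] = davies_bouldin_score(X, labels)
        sil[k] = silhouette_score(X, labels)
    return {
        "calinski_harabasz": max(ch, key=ch.get),  # higher is better
        "davies_bouldin": min(db, key=db.get),     # lower is better
        "silhouette": max(sil, key=sil.get),       # higher is better
    }


def log_wk(X, labels):
    """Log of the pooled within-cluster sum of squared distances."""
    total = 0.0
    for c in np.unique(labels):
        members = X[labels == c]
        total += ((members - members.mean(axis=0)) ** 2).sum()
    return np.log(total)


def gap_select_k(method, X, k_max=6, n_refs=10, seed=0):
    """Smallest k with Gap(k) >= Gap(k+1) - s(k+1) (simplified gap statistic)."""
    rng = np.random.default_rng(seed)
    low, high = X.min(axis=0), X.max(axis=0)
    refs = [rng.uniform(low, high, size=X.shape) for _ in range(n_refs)]
    gaps, errs = [], []
    for k in range(1, k_max + 1):
        ref_logs = np.array([log_wk(r, cluster_labels(method, r, k)) for r in refs])
        gaps.append(ref_logs.mean() - log_wk(X, cluster_labels(method, X, k)))
        errs.append(ref_logs.std() * np.sqrt(1.0 + 1.0 / n_refs))
    for k in range(1, k_max):
        if gaps[k - 1] >= gaps[k] - errs[k]:
            return k
    return k_max


# Synthetic stand-in for extracted features: three well-separated clusters.
X, _ = make_blobs(
    n_samples=300,
    centers=[[0.0, 0.0], [10.0, 0.0], [5.0, 9.0]],
    cluster_std=0.5,
    random_state=0,
)
for method in ("kmeans", "hierarchical", "em"):
    picks = select_k(method, X)
    picks["gap"] = gap_select_k(method, X)
    print(method, picks)
```

To mimic the poorly-separated case, one could raise `cluster_std` or append noisy, irrelevant feature columns to `X` and observe which combinations of clusterer and index still recover the intended number of clusters.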