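Fault Diagnosis of Semi-Supervised Electromechanical Transmission Systems Under Imbalanced Unlabeled Sample Class Information Screening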

IF 2 | Zone 3 (Physics and Astrophysics) | JCR Q2 | PHYSICS, MULTIDISCIPLINARY

Entropy Pub Date : 2025-02-06 DOI:10.3390/e27020175

Chaoge Wang, Pengpeng Jia, Xinyu Tian, Xiaojing Tang, Xiong Hu, Hongkun Li

Entropy, volume 27, issue 2. Publication type: Journal Article. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11854703/pdf/

Citations: 0

Abstract


Fault Diagnosis of Semi-Supervised Electromechanical Transmission Systems Under Imbalanced Unlabeled Sample Class Information Screening.

In health monitoring of electromechanical transmission systems, the collected state data typically contain only a small amount of labeled data, while the vast majority remain unlabeled. Deep learning-based diagnostic models therefore face scarce labeled data alongside abundant unlabeled data. Traditional semi-supervised deep learning methods based on pseudo-label self-training alleviate the scarcity of labeled data to some extent, but they neglect the reliability of pseudo-label information, the accuracy of feature extraction from unlabeled data, and the imbalance in sample selection. To address these issues, this paper proposes a novel semi-supervised fault diagnosis method under imbalanced unlabeled sample class information screening. First, an active learning-based information screening mechanism for unlabeled data is established. The mechanism discriminates on the variability of intrinsic feature information in fault samples and screens out unlabeled samples that lie near decision boundaries and are difficult to separate clearly. Label information is then assigned to the screened samples by combining their maximum membership degree in the classification space of the supervised model with interaction with the active learning expert system. Second, a cost-sensitive function driven by data imbalance is constructed to address class imbalance in unlabeled sample screening; it adaptively adjusts the weights of samples from different classes during training to guide the supervised model. Finally, through dynamic optimization of the supervised model and the feature extraction capability for unlabeled samples, the diagnostic model's ability to recognize unlabeled samples is significantly enhanced.
Validation on two datasets covering 12 experimental scenarios shows that, when only a small amount of labeled data is available, the proposed method improves diagnostic accuracy by more than 10% over existing typical methods, supporting its effectiveness and superiority in practical applications.
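
The abstract does not give the screening criterion or the cost-sensitive function in closed form, so the following is only a minimal sketch of the general ideas it names, not the authors' method. It assumes margin sampling (the gap between the two largest class probabilities) as a stand-in for "samples near decision boundaries", the argmax class probability as the "maximum membership degree", and inverse-frequency class weights as a stand-in for the cost-sensitive function. All function names are hypothetical.

```python
import math
from collections import Counter


def softmax(logits):
    """Convert raw classifier scores into class probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def margin(probs):
    """Gap between the two largest probabilities; small means near a boundary."""
    top = sorted(probs, reverse=True)
    return top[0] - top[1]


def screen_boundary_samples(prob_rows, budget):
    """Indices of the `budget` unlabeled samples with the smallest margin.

    Assumption: margin sampling stands in for the paper's screening rule.
    """
    order = sorted(range(len(prob_rows)), key=lambda i: margin(prob_rows[i]))
    return order[:budget]


def max_membership_label(probs):
    """Candidate label and its probability (the 'maximum membership degree')."""
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs[label]


def inverse_frequency_weights(labels, num_classes):
    """Class weights proportional to 1 / count, scaled to average 1
    over the classes that actually occur.

    Assumption: a simple stand-in for the paper's cost-sensitive function.
    """
    if not labels:
        raise ValueError("at least one label is required")
    counts = Counter(labels)
    raw = [1.0 / counts[c] if counts[c] else 0.0 for c in range(num_classes)]
    present = sum(1 for w in raw if w > 0)
    scale = present / sum(raw)
    return [w * scale for w in raw]


def weighted_cross_entropy(prob_rows, labels, weights):
    """Class-weighted mean negative log-likelihood of the true labels."""
    loss = sum(-weights[y] * math.log(p[y]) for p, y in zip(prob_rows, labels))
    return loss / sum(weights[y] for y in labels)
```

As a usage illustration, with probabilities from `softmax([2.0, 0.1, 0.0])`, `softmax([0.5, 0.4, 0.0])` and `softmax([0.0, 3.0, 0.1])`, `screen_boundary_samples(..., 1)` selects index 1, the sample whose two leading classes are nearly tied. For labels `[0, 0, 0, 1]`, `inverse_frequency_weights` returns approximately `[0.5, 1.5]`, so the minority class contributes more to the loss. In an active learning loop, the screened samples would go to an expert for confirmation before being added to the labeled set.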


Source journal

Entropy (PHYSICS, MULTIDISCIPLINARY)

CiteScore: 4.90
Self-citation rate: 11.10%
Articles published: 1580
Review time: 21.05 days

About the journal: Entropy (ISSN 1099-4300), an international and interdisciplinary journal of entropy and information studies, publishes reviews, regular research papers and short notes. Its aim is to encourage scientists to publish their theoretical and experimental details as fully as possible. There is no restriction on the length of papers. If the work involves computation or experiments, the details must be provided so that the results can be reproduced.