Using the Kullback-Leibler Divergence and Kolmogorov-Smirnov Test to Select Input Sizes to the Fault Diagnosis Problem Based on a CNN Model

R. Monteiro, C. Bastos-Filho, M. Cerrada, Diego Cabrera, René-Vinicio Sánchez
Learning and Nonlinear Models, published 2021-06-30. DOI: 10.21528/LNLM-VOL18-NO2-ART2 (https://doi.org/10.21528/LNLM-VOL18-NO2-ART2)

Abstract

Choosing a suitable size for signal representations, e.g., frequency spectra, in a given machine learning problem is not a trivial task, and it may strongly affect the performance of the trained models. Many solutions have been proposed for this problem; most rely on designing an optimized input or selecting the most suitable input through an exhaustive search. In this work, we used the Kullback-Leibler Divergence and the Kolmogorov-Smirnov Test to measure the dissimilarity among signal representations belonging to the same and different classes, i.e., we measured the intraclass and interclass dissimilarities. Moreover, we analyzed how this information relates to classifier performance. The results suggest that both the interclass and intraclass dissimilarities are related to model accuracy, since they indicate how easily a model can learn discriminative information from the input data. The highest ratios of average interclass to average intraclass dissimilarity corresponded to the most accurate classifiers. This information can be used to select a suitable input size for training the classification model. The approach was tested on two data sets related to the fault diagnosis of reciprocating compressors.
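The selection criterion described in the abstract, i.e. the ratio of average interclass to average intraclass dissimilarity, can be sketched roughly as below. This is a minimal illustration, not the paper's implementation: the synthetic spectra, class shapes, and sizes are hypothetical stand-ins for the compressor data, and the symmetrized KL divergence is one reasonable way to handle KL's asymmetry.

```python
import numpy as np
from scipy.stats import ks_2samp


def kl_divergence(p, q, eps=1e-12):
    """KL divergence between two spectra normalized to probability vectors."""
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log((p + eps) / (q + eps))))


def sym_kl(a, b):
    # Symmetrized KL, since plain KL divergence is not symmetric.
    return 0.5 * (kl_divergence(a, b) + kl_divergence(b, a))


def mean_intraclass(X):
    # Average dissimilarity over all distinct pairs within one class.
    n = len(X)
    return float(np.mean([sym_kl(X[i], X[j])
                          for i in range(n) for j in range(i + 1, n)]))


def mean_interclass(X, Y):
    # Average dissimilarity over all cross-class pairs.
    return float(np.mean([sym_kl(x, y) for x in X for y in Y]))


# Hypothetical magnitude spectra: class A concentrates energy at low
# frequencies, class B at high frequencies (stand-ins for real fault classes).
rng = np.random.default_rng(0)
freqs = np.arange(64)
class_a = np.exp(-freqs / 10.0) + np.abs(0.02 * rng.normal(size=(3, 64)))
class_b = np.exp(-(63 - freqs) / 10.0) + np.abs(0.02 * rng.normal(size=(3, 64)))

intra = 0.5 * (mean_intraclass(class_a) + mean_intraclass(class_b))
inter = mean_interclass(class_a, class_b)
ratio = inter / intra  # higher ratio -> classes are easier to separate

# The KS statistic gives an alternative, distribution-free dissimilarity:
ks_stat = ks_2samp(class_a[0], class_b[0]).statistic
```

Under the paper's criterion, one would compute this ratio for each candidate input size (e.g., several spectrum lengths) and favor the size with the largest ratio, since well-separated classes with compact intraclass spread are the easiest for a CNN to discriminate.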