Applicability of Algorithm Evaluation Metrics for Predictive Maintenance in Production Systems
Hendrik Engbers, A. Alla, Markus Kreutz, M. Freitag
2020 6th IEEE Congress on Information Science and Technology (CiSt)
DOI: 10.1109/CiSt49399.2021.9357277
Publication date: 2020-06-05
Citations: 1
Abstract
Algorithm evaluation metrics are used to measure and compare the performance of diagnostic and prognostic algorithms. However, there is no consensus on which of the various metrics is most suitable for evaluating classifier performance. This paper examines the applicability of common evaluation metrics for predictive maintenance applications in production systems. It is intended to clarify (1) which metrics are well suited for this use case, and (2) whether they are sufficient as a sole decision criterion for algorithm selection. Further, (3) the significance of evaluation metrics with respect to their practical impact on the production system's performance is investigated. Moreover, we analyze (4) how increasing the production system's complexity affects the correlation between algorithm evaluation metrics and performance. We conducted 960 simulation runs of a flexible flow shop production system to examine the impact of machine breakdowns and different failure distributions with varying confusion matrices. In addition, two configurations of the production system, which differ in system complexity, were investigated. We then determined the product-moment correlation coefficient between common evaluation metrics and the production system's performance. The simulation results reveal that metrics that are sensitive to false negatives (FN), such as the False Omission Rate (FOR), are very well suited. Nevertheless, a detailed analysis shows that even the FOR can provide reliable statements only when combined with other metrics. We found that as the system becomes more complex, the informative value of the metrics in terms of their impact on the production system's performance decreases. Finally, we propose parameters that could be relevant for developing new metrics that consider the current system configuration.
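The two quantities at the heart of the abstract can be sketched in a few lines: the False Omission Rate computed from a binary confusion matrix, and the Pearson product-moment correlation between a metric and a production-performance measure. This is an illustrative sketch, not code from the paper; the function names and the example values are hypothetical.

```python
def false_omission_rate(fn, tn):
    """FOR = FN / (FN + TN): the share of predicted negatives that are
    actually positive (missed failures in a predictive-maintenance setting)."""
    return fn / (fn + tn)

def pearson_r(xs, ys):
    """Product-moment correlation coefficient between two samples."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical data: FOR values of four classifiers and a corresponding
# production-performance measure (e.g. throughput in parts per shift).
for_values = [0.02, 0.05, 0.10, 0.20]
throughput = [980, 950, 900, 800]

print(round(false_omission_rate(fn=5, tn=95), 3))   # 0.05
print(round(pearson_r(for_values, throughput), 3))  # -1.0 (perfectly linear toy data)
```

In the toy data the throughput is an exact linear function of the FOR, so the coefficient is -1.0; the paper's point is precisely that in realistic, more complex production systems this correlation weakens.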