Hard Disk Drive Failure Prediction Challenges in Machine Learning for Multi-variate Time Series
Jie Yu
Proceedings of the 2019 3rd International Conference on Advances in Image Processing, 2019-11-08
https://doi.org/10.1145/3373419.3373437
Hard disk drive failure prediction (HDDFP) is an active area of machine learning applications. Recent work based on SMART attributes reports very promising results, with high failure recall (95%) and precision, but challenges remain that call for improvements to the machine learning pipeline. This paper begins with an introduction to the topic and a summary of recent work. Some challenges that apply to existing solutions are then illustrated with an example using the Backblaze dataset and its HDDFP rule. A main result of the paper is a rigorous formulation of HDDFP as a multiple-input multiple-output (MIMO) dynamic system problem that addresses these challenges. It is also shown that this general formulation can help the existing classification method by enhancing the prediction lead-time requirement. Although presented in the context of HDDFP, the findings and thought process apply to failure prediction for other dynamic systems and, to some degree, to IoT and time-series analytics in general.