VM Failure Prediction with Log Analysis using BERT-CNN Model
Authors: Sukhyun Nam, Jae-Hyoung Yoo, J. W. Hong
Venue: 2022 18th International Conference on Network and Service Management (CNSM)
Published: 2022-10-31
DOI: https://doi.org/10.23919/CNSM55787.2022.9965187
Citations: 2
Abstract
We study failure prediction for VMs and VNFs in an NFV environment. In a proof of concept, we designed a machine learning model that predicts failures from log analysis, and we observed cases where the failure-related logs appear not in the failed VM but in the server or in other VMs running on the same server. We therefore propose a model that analyzes the logs of the server and all related VMs and predicts the probability that any VM running on that server will fail. To reduce the large volume of logs collected from the server and VMs, we propose a pre-processing and tagging method that can improve the performance of the model. The model combines a CNN with BERT, which has achieved state-of-the-art results on a range of NLP tasks; it takes logs as input and outputs failure probabilities for the next 30 minutes. To validate the model, we collected failure-related logs and normal logs from an OpenStack testbed. The experimental results show that the model predicts failures of VMs running on the server with an F1 score of 0.74.
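The prediction target described in the abstract (will any VM on the server fail within the next 30 minutes, given the logs of the server and all its VMs) can be illustrated with a small labeling sketch. This is not the authors' pipeline. The `LogEntry` type, the `build_labeled_windows` function, the 10-minute window length, and the sample log messages are all hypothetical, chosen only for illustration. Only the 30-minute horizon and the pooling of server and VM logs come from the abstract.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class LogEntry:
    timestamp: datetime
    source: str   # hypothetical source name, e.g. "server" or "vm-1"
    message: str  # made-up log text, for illustration only


def build_labeled_windows(entries, failure_times, start, end,
                          window=timedelta(minutes=10),   # assumed window length
                          horizon=timedelta(minutes=30)):  # horizon from the abstract
    """Pool server and VM logs into fixed windows and label each window.

    A window is labeled 1 if any VM failure occurs within `horizon`
    after the window ends, otherwise 0.
    """
    entries = sorted(entries, key=lambda e: e.timestamp)
    samples = []
    t = start
    while t + window <= end:
        w_end = t + window
        # Logs from every source on the server go into the same sample.
        messages = [f"[{e.source}] {e.message}"
                    for e in entries if t <= e.timestamp < w_end]
        label = int(any(w_end <= f < w_end + horizon for f in failure_times))
        samples.append((t, messages, label))
        t = w_end
    return samples


# Toy data: one hour of logs and a single VM failure at 00:38.
start = datetime(2022, 1, 1, 0, 0)
end = start + timedelta(minutes=60)
entries = [
    LogEntry(start + timedelta(minutes=5), "server", "example warning A"),
    LogEntry(start + timedelta(minutes=12), "vm-2", "example warning B"),
    LogEntry(start + timedelta(minutes=45), "vm-1", "example info C"),
]
failure_times = [start + timedelta(minutes=38)]

samples = build_labeled_windows(entries, failure_times, start, end)
labels = [label for _, _, label in samples]
```

In this toy run the first three windows are labeled positive, because the failure at 00:38 falls within 30 minutes of their end, even though no log in those windows comes from a failed VM. Each sample's pooled messages would then be passed to a text classifier such as the BERT-CNN model the abstract describes.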