A Parallel Approach of Weighted Edit Distance Calculation for Log Parsing
Xingyuan Ren, Lin Zhang, Kunpeng Xie, Qiankun Dong
2019 IEEE 2nd International Conference on Computer and Communication Engineering Technology (CCET), August 2019
DOI: 10.1109/CCET48361.2019.8989069
Citations: 1
Abstract
For modern software systems, massive numbers of log messages are generated every day. By analyzing these log messages, which carry vital information such as exception reports, developers can manage and monitor software systems efficiently. Each log message in a log file consists of a fixed part (the template) and a variable part; the fixed parts of log messages of the same event type are identical, while the variable parts differ. LKE (Log Key Extraction), a widely used log parser, finds the fixed parts efficiently thanks to a clustering strategy based on the weighted edit distance between log messages. However, calculating the weighted edit distance for large-scale log files is time-consuming. In this paper, we propose a parallel approach that uses a unique hierarchical index structure to calculate the weighted edit distance on the GPU (Graphics Processing Unit). The GPU offers high parallelism and is well suited to computation-intensive workloads, so this approach reduces the time required to process large-scale logs. Experiments show that the LKE parser using the GPU to calculate the weighted edit distance achieves high efficiency and accuracy on the HDFS data set and a marine information data set.
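For readers unfamiliar with the core quantity, the following is a minimal CPU-side sketch of a weighted edit distance between two tokenized log messages, the measure that LKE's clustering step relies on. The weighting function (token_weight) is a hypothetical placeholder rather than LKE's actual weighting scheme, and the paper's hierarchical index structure and GPU kernel are not reproduced here.

```python
# Minimal sketch (CPU, not the paper's GPU implementation): weighted edit
# distance between two tokenized log messages. token_weight is an assumed,
# illustrative weighting; LKE defines its own weights.

def token_weight(token: str) -> float:
    # Assumption: variable-looking tokens (containing digits or path
    # separators) get a lower weight than fixed template words.
    return 0.2 if any(ch.isdigit() for ch in token) or "/" in token else 1.0

def weighted_edit_distance(msg_a: str, msg_b: str) -> float:
    a, b = msg_a.split(), msg_b.split()
    n, m = len(a), len(b)
    # dp[i][j] = weighted distance between the first i tokens of a
    # and the first j tokens of b.
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + token_weight(a[i - 1])            # deletions
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + token_weight(b[j - 1])            # insertions
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0.0 if a[i - 1] == b[j - 1] else max(
                token_weight(a[i - 1]), token_weight(b[j - 1]))
            dp[i][j] = min(dp[i - 1][j] + token_weight(a[i - 1]),   # delete
                           dp[i][j - 1] + token_weight(b[j - 1]),   # insert
                           dp[i - 1][j - 1] + sub)                  # substitute
    return dp[n][m]

if __name__ == "__main__":
    # Two messages of the same event type: templates match, variables differ,
    # so the down-weighted variable tokens keep the distance small.
    print(weighted_edit_distance("Received block blk_1 of size 67108864",
                                 "Received block blk_2 of size 67108864"))
```

Because every cell of the dynamic-programming table on an anti-diagonal is independent, this computation is a natural candidate for the GPU parallelization the paper describes.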