Towards an NLP-based log template generation algorithm for system log analysis
Satoru Kobayashi, K. Fukuda, H. Esaki
International Conference of Future Internet, 2014-06-18
DOI: 10.1145/2619287.2619290
System logs from network equipment are among the most important sources of information for network management. Sophisticated log message mining can help operators investigate huge numbers of log messages when troubleshooting, especially in today's complicated network structures (e.g., virtualized networks). However, generating accurate log templates (i.e., the meta format) from real log messages (instances) remains a difficult problem. In this paper we propose a Natural Language Processing (NLP) approach to generating log templates from log messages produced by network equipment in order to overcome this problem. The key idea of the work is to use Conditional Random Fields (CRF), a well-studied supervised natural language processing technique. As a preliminary evaluation, using one month of network equipment logs from a Japanese academic network, we show that our CRF-based algorithm improves the accuracy of the generated log templates within reasonable processing time, compared with a traditional method.
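
To make the idea of template generation by sequence labeling concrete, the following is a minimal sketch and not the authors' implementation. It assumes each word of a log message receives one of two hypothetical labels, D (fixed description word) or V (variable field), and that the template is obtained by replacing V words with a wildcard. The feature functions and all weights are hand-set assumptions for illustration; a real CRF would learn its weights from labeled log messages. Decoding uses the standard Viterbi algorithm for a linear-chain model.

```python
# Sketch only: labels, features, and weights are hypothetical and hand-set.
# A trained CRF would learn these weights from labeled log messages.
LABELS = ("D", "V")  # D = fixed description word, V = variable field


def emission(token, label):
    """Score how well a label fits a token, using toy surface features."""
    has_digit = any(ch.isdigit() for ch in token)
    if label == "V":
        return 2.0 if has_digit else -1.0
    return 1.0 if token.isalpha() else -1.0


# Toy transition scores between adjacent labels (previous, current).
TRANSITION = {
    ("D", "D"): 0.5,
    ("D", "V"): 0.0,
    ("V", "D"): 0.0,
    ("V", "V"): 0.2,
}


def viterbi(tokens):
    """Return the highest-scoring label sequence for a linear-chain model."""
    if not tokens:
        return []
    scores = [{lab: emission(tokens[0], lab) for lab in LABELS}]
    back = []  # back[i][lab] = best previous label for token i + 1
    for tok in tokens[1:]:
        prev = scores[-1]
        cur, ptr = {}, {}
        for lab in LABELS:
            best = max(LABELS, key=lambda p: prev[p] + TRANSITION[(p, lab)])
            cur[lab] = prev[best] + TRANSITION[(best, lab)] + emission(tok, lab)
            ptr[lab] = best
        scores.append(cur)
        back.append(ptr)
    # Trace back from the best final label.
    path = [max(LABELS, key=lambda lab: scores[-1][lab])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


def to_template(message):
    """Replace words labeled as variables with a wildcard."""
    tokens = message.split()
    labels = viterbi(tokens)
    return " ".join("*" if lab == "V" else tok for tok, lab in zip(tokens, labels))


if __name__ == "__main__":
    print(to_template("Interface ge-0/0/1 changed state to down"))
    print(to_template("Login failed for user admin from 192.168.0.1"))
```

With these toy features, the user name in the second message is kept as a fixed word because it is purely alphabetic. That kind of error is what learned, context-aware features in a trained model are meant to reduce.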