GenLog: Accurate Log Template Discovery for Stripped X86 Binaries
Maosheng Zhang, Ying Zhao, Zengmingyu He
2017 IEEE 41st Annual Computer Software and Applications Conference (COMPSAC), pp. 337-346, July 2017
DOI: 10.1109/COMPSAC.2017.137
Log analysis plays an important role in diagnosing computer failures. With the ever-increasing size and complexity of logs, analyzing them manually has become cumbersome. For this reason, recent research has focused on automatic analysis techniques for large log files. However, log messages are text with specific formats, and it is very challenging for automatic analysis to understand their semantics. Current state-of-the-art approaches depend on the quality of the observed log messages or on the source code that produces them. In this paper, we propose GenLog, a method that extracts log templates from stripped executables (neither source code nor debugging information needs to be available). GenLog finds all log-related functions in a binary through a combined bottom-up and top-down slicing method, reconstructs the memory buffers where log messages were constructed, and identifies the components of log messages using data flow analysis and taint propagation analysis. GenLog can be used to analyze large binaries and is suitable for commercial off-the-shelf (COTS) software or dynamic libraries. We evaluated GenLog on four X86 executables, one of which is Nginx. The experiments show that GenLog can identify the templates of log messages in the test log files with a precision of 99.9%.
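To make the notion of a log template concrete, the following is a minimal Python sketch. It is not GenLog and does not perform any binary analysis. It assumes a template is already available as a printf-style format string, and shows how such a template could be matched against lines of a log file to separate the constant text from the variable fields. The names `template_to_regex` and `match_line`, the small set of supported conversion specifiers, and the sample format string are all illustrative assumptions, not taken from the paper.

```python
import re

# Hypothetical mapping from a few printf conversion specifiers to regex
# fragments. Real format strings also allow flags, widths, and precisions,
# which this sketch deliberately ignores.
SPEC_PATTERNS = {
    "d": r"[-+]?\d+",       # signed decimal integer
    "u": r"\d+",            # unsigned decimal integer
    "x": r"[0-9a-fA-F]+",   # hexadecimal integer
    "s": r".+?",            # string, matched lazily
}

# Matches either a literal "%%" or one of the supported specifiers.
SPEC_RE = re.compile(r"%(%|[duxs])")


def template_to_regex(fmt):
    """Compile a printf-style format string into an anchored regex.

    Constant text is escaped and must match literally; each supported
    specifier becomes one capture group.
    """
    parts = []
    pos = 0
    for m in SPEC_RE.finditer(fmt):
        parts.append(re.escape(fmt[pos:m.start()]))  # constant text
        spec = m.group(1)
        if spec == "%":
            parts.append("%")                        # "%%" is a literal percent sign
        else:
            parts.append("(" + SPEC_PATTERNS[spec] + ")")
        pos = m.end()
    parts.append(re.escape(fmt[pos:]))               # trailing constant text
    return re.compile("^" + "".join(parts) + "$")


def match_line(fmt, line):
    """Return the variable fields of `line` if it fits `fmt`, else None."""
    m = template_to_regex(fmt).match(line)
    return list(m.groups()) if m else None


# Usage with a made-up template and log line:
fields = match_line("connect to %s failed, errno=%d",
                    "connect to 10.0.0.1:80 failed, errno=111")
print(fields)  # ['10.0.0.1:80', '111']
```

Anchoring the pattern at both ends keeps a template from matching a line that merely contains the same constant text, which matters when measuring how precisely templates identify messages.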