{"title":"The Class Overlap Model for System Log Anomaly Detection Based on Ensemble Learning","authors":"Yitong Ren, Zhaojun Gu, Lanlan Pan, Chunbo Liu","doi":"10.1109/DSC50466.2020.00064","DOIUrl":null,"url":null,"abstract":"Using machine learning to detect system log data is essential. It is prone to the phenomenon of class overlap because of too many similar system log data. The occurrence of this phenomenon will have a serious impact on the anomaly detection of the system logs. In order to solve the problem of class overlap in system logs, this paper proposes an anomaly detection model for class overlap on system logs. We first calculate the relationship between the sample data and the membership of different classes, normal or anomaly, and use the fuzziness to separate the sample data of the overlapping parts of the classes from the data of the other parts. AdaBoost, an ensemble learning approach, is used to detect overlapping data. Compared with machine learning algorithms, ensemble learning can better classify the data of the overlapping parts, so as to achieve the purpose of detecting the anomalies of the system logs. Experimental results show that our model can be effectively applied in a variety of basic algorithms, and the results of each measure have been improved.","PeriodicalId":423182,"journal":{"name":"2020 IEEE Fifth International Conference on Data Science in Cyberspace (DSC)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE Fifth International Conference on Data Science in Cyberspace (DSC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DSC50466.2020.00064","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1
Abstract
Using machine learning to detect system log data is essential. It is prone to the phenomenon of class overlap because of too many similar system log data. The occurrence of this phenomenon will have a serious impact on the anomaly detection of the system logs. In order to solve the problem of class overlap in system logs, this paper proposes an anomaly detection model for class overlap on system logs. We first calculate the relationship between the sample data and the membership of different classes, normal or anomaly, and use the fuzziness to separate the sample data of the overlapping parts of the classes from the data of the other parts. AdaBoost, an ensemble learning approach, is used to detect overlapping data. Compared with machine learning algorithms, ensemble learning can better classify the data of the overlapping parts, so as to achieve the purpose of detecting the anomalies of the system logs. Experimental results show that our model can be effectively applied in a variety of basic algorithms, and the results of each measure have been improved.