Random forest and change point detection for root cause localization in large scale systems
Authors: Dhan V. Sagar, P. Sivakumar, R. V. Anand
Published in: 2014 IEEE International Conference on Computational Intelligence and Computing Research, 2014-12-01
DOI: https://doi.org/10.1109/ICCIC.2014.7238442
Identifying the root causes of a performance problem is very difficult in a large-scale IT environment. Such complex scenarios require a model that is both scalable and reasonably accurate. This paper proposes a hybrid model for root cause localization that combines random forest with statistical change point detection. Based on an impurity measure and the change in error rates, the random forest identifies the features that are potential causes of the problem. Because it is a tree-based approach, it does not require any prior information about the measured features. To reduce the number of false classifications, a second level of selection is performed using change point analysis. The ability of random forest to work well on very large datasets makes the solution scalable and accurate. The proposed model is applied and verified by identifying the root causes of Service Level Objective violations in enterprise IT systems.
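The two-stage selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the "forest" here is a toy ensemble of bootstrapped depth-1 trees scored by Gini impurity decrease only (the error-rate criterion is not shown), the change point test is a simple mean-shift scan chosen for illustration, and every name (`forest_importance`, `change_point`, `localize`, `db_latency`), the thresholds, and the synthetic metrics are assumptions.

```python
import random
from statistics import mean, pstdev

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_stump_gain(xs, ys):
    """Largest Gini impurity decrease achievable with one threshold on xs."""
    parent, n, best = gini(ys), len(ys), 0.0
    for thr in sorted(set(xs))[:-1]:          # skip max so the right side is non-empty
        left = [y for x, y in zip(xs, ys) if x <= thr]
        right = [y for x, y in zip(xs, ys) if x > thr]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best

def forest_importance(data, labels, n_trees=50, n_try=2, seed=0):
    """Stage 1 (toy forest): sum impurity decreases over bootstrapped stumps."""
    rng = random.Random(seed)
    names, n = sorted(data), len(labels)
    score = {name: 0.0 for name in names}
    for _ in range(n_trees):
        rows = [rng.randrange(n) for _ in range(n)]       # bootstrap sample
        ys = [labels[i] for i in rows]
        gains = {name: best_stump_gain([data[name][i] for i in rows], ys)
                 for name in rng.sample(names, n_try)}    # random feature subset
        winner = max(gains, key=gains.get)
        score[winner] += gains[winner]
    return score

def change_point(series, min_seg=20):
    """Stage 2: return (index, effect size) of the strongest mean shift."""
    best_k, best_effect = None, 0.0
    for k in range(min_seg, len(series) - min_seg + 1):
        left, right = series[:k], series[k:]
        spread = (pstdev(left) + pstdev(right)) / 2 or 1e-9
        effect = abs(mean(right) - mean(left)) / spread
        if effect > best_effect:
            best_k, best_effect = k, effect
    return best_k, best_effect

def localize(data, labels, top_k=2, min_effect=3.0):
    """Keep top-ranked features whose time series also shows a clear shift."""
    score = forest_importance(data, labels)
    candidates = sorted(score, key=score.get, reverse=True)[:top_k]
    return [c for c in candidates if change_point(data[c])[1] >= min_effect]

# Hypothetical metrics: only db_latency shifts when SLO violations begin.
T, SHIFT = 200, 120
data = {
    "cpu": [50 + (7 * t) % 11 for t in range(T)],
    "mem": [40 + (5 * t) % 13 for t in range(T)],
    "db_latency": [(30 if t >= SHIFT else 10) + 0.5 * ((3 * t) % 5)
                   for t in range(T)],
}
labels = [1 if t >= SHIFT else 0 for t in range(T)]   # 1 = SLO violation
root_causes = localize(data, labels)
print(root_causes)
```

In this sketch the first stage deliberately passes two candidates so that the second stage has something to reject: a feature that ranks highly but shows no shift in its own time series is dropped, which mirrors the stated purpose of the change point step, reducing false classifications.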