{"title":"A survey of methods and tools used for interpreting Random Forest","authors":"Maissae Haddouchi, A. Berrado","doi":"10.1109/ICSSD47982.2019.9002770","DOIUrl":null,"url":null,"abstract":"Interpretability of highly performant Machine Learning [ML] methods, such as Random Forest [RF], is a key tool that attracts a great interest in datamining research. In the state of the art, RF is well-known as an efficient ensemble learning (in terms of predictive accuracy, flexibility and straightforwardness). Moreover, it is recognized as an intuitive and intelligible approach regarding to its building process. However it is also regarded as a Black Box model because of its hundreds of deep decision trees. This can be crucial for several fields of study, such as healthcare, biology and security, where the lack of interpretability could be a real disadvantage. Indeed, the interpretability of the RF models is, generally, necessary in such fields of applications because of different motivations. In fact, the more the ML users grasp what is going on inside a ML system (process and resulting model), the more they can trust it and take actions based on the knowledge extracted from it. Furthermore, ML models are increasingly constrained by new laws that require regulation and interpretation of the knowledge they provide.Several papers have tackled the interpretation of RF resulting models. It had been associated with different aspects depending on the specificity of the issue studied as well as the users concerned with explanations. Therefore, this paper aims to provide a survey of tools and methods used in literature in order to uncover insights in the RF resulting models. These tools are classified depending on different aspects characterizing the interpretability. This should guide, in practice, in the choice of the most useful tools for interpretation and deep analysis of the RF model depending on the interpretability aspect sought. This should also be valuable for researchers who aim to focus their work on the interpretability of RF, or ML in general.","PeriodicalId":342806,"journal":{"name":"2019 1st International Conference on Smart Systems and Data Science (ICSSD)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 1st International Conference on Smart Systems and Data Science (ICSSD)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSSD47982.2019.9002770","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 14
Abstract
Interpretability of high-performing machine learning (ML) methods, such as Random Forest (RF), is a topic of great interest in data mining research. In the state of the art, RF is well known as an efficient ensemble learning method in terms of predictive accuracy, flexibility, and ease of use. Moreover, its building process is recognized as intuitive and intelligible. However, it is also regarded as a black-box model because it aggregates hundreds of deep decision trees. This can be crucial in fields such as healthcare, biology, and security, where a lack of interpretability is a real disadvantage. Indeed, interpretability of RF models is generally necessary in such application fields, for several reasons. The more users grasp what is going on inside an ML system (both the process and the resulting model), the more they can trust it and act on the knowledge extracted from it. Furthermore, ML models are increasingly constrained by new laws that require the knowledge they provide to be regulated and explained. Several papers have tackled the interpretation of RF models. Interpretation has been approached from different angles, depending on the specifics of the issue studied and on the users the explanations are intended for. This paper therefore surveys the tools and methods used in the literature to uncover insights from RF models. These tools are classified according to the different aspects that characterize interpretability. In practice, this classification should guide the choice of the most useful tools for interpreting and analyzing an RF model, depending on the interpretability aspect sought. It should also be valuable for researchers focusing on the interpretability of RF, or of ML in general.
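To make the kind of tools the survey covers concrete, here is a minimal, illustrative sketch (not taken from the paper itself) of two widely used RF interpretation techniques, shown with scikit-learn: impurity-based feature importance, computed directly from the trees, and permutation importance, a model-agnostic check on held-out data. The dataset and model settings below are assumptions chosen only for demonstration.

# Illustrative sketch: two common Random Forest interpretation tools.
# Dataset and hyperparameters are arbitrary choices for demonstration.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_train, y_train)

# Tool 1: impurity-based importance, derived from the forest's own splits.
impurity_ranking = sorted(
    zip(X.columns, rf.feature_importances_), key=lambda t: t[1], reverse=True
)

# Tool 2: permutation importance, measured as the drop in test accuracy
# when each feature's values are randomly shuffled.
perm = permutation_importance(rf, X_test, y_test, n_repeats=10, random_state=0)

for name, score in impurity_ranking[:5]:
    print(f"{name}: {score:.3f}")

Both techniques summarize the forest globally; the survey also covers other interpretability aspects (e.g., local explanations of individual predictions), for which different tools apply.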