{"title":"以基因预测为例理解离散分类器","authors":"M. Subianto, A. Siebes","doi":"10.1109/ICDM.2007.40","DOIUrl":null,"url":null,"abstract":"The requirement that the models resulting from data mining should be understandable is an uncontroversial requirement. In the data mining literature, however, it plays hardly any role, if at all. In practice, though, understandability is often even more important than, e.g., accuracy. Understandability does not mean that models should be simple. It means that one should be able to understand the predictions of models. In this paper we introduce tools to understand arbitrary classifiers defined on discrete data. More in particular, we introduce Explanations that provide insight at a local level. They explain why a classifier classifies a data point as it does. For global insight, we introduce attribute weights. The higher the weight of an attribute, the more often it is decisive in the classification of a data point. To illustrate our tools, we describe a case study in the prediction of small genes. This is a notoriously hard problem in bioinformatics.","PeriodicalId":233758,"journal":{"name":"Seventh IEEE International Conference on Data Mining (ICDM 2007)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2007-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":"{\"title\":\"Understanding Discrete Classifiers with a Case Study in Gene Prediction\",\"authors\":\"M. Subianto, A. Siebes\",\"doi\":\"10.1109/ICDM.2007.40\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The requirement that the models resulting from data mining should be understandable is an uncontroversial requirement. In the data mining literature, however, it plays hardly any role, if at all. In practice, though, understandability is often even more important than, e.g., accuracy. Understandability does not mean that models should be simple. It means that one should be able to understand the predictions of models. In this paper we introduce tools to understand arbitrary classifiers defined on discrete data. More in particular, we introduce Explanations that provide insight at a local level. They explain why a classifier classifies a data point as it does. For global insight, we introduce attribute weights. The higher the weight of an attribute, the more often it is decisive in the classification of a data point. To illustrate our tools, we describe a case study in the prediction of small genes. 
This is a notoriously hard problem in bioinformatics.\",\"PeriodicalId\":233758,\"journal\":{\"name\":\"Seventh IEEE International Conference on Data Mining (ICDM 2007)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2007-10-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"8\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Seventh IEEE International Conference on Data Mining (ICDM 2007)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICDM.2007.40\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Seventh IEEE International Conference on Data Mining (ICDM 2007)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDM.2007.40","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Understanding Discrete Classifiers with a Case Study in Gene Prediction
The requirement that models resulting from data mining should be understandable is uncontroversial. In the data mining literature, however, it plays hardly any role, if any. In practice, though, understandability is often even more important than, say, accuracy. Understandability does not mean that models should be simple; it means that one should be able to understand the predictions a model makes. In this paper we introduce tools to understand arbitrary classifiers defined on discrete data. In particular, we introduce Explanations, which provide insight at the local level: they explain why a classifier classifies a data point as it does. For global insight, we introduce attribute weights: the higher the weight of an attribute, the more often it is decisive in the classification of a data point. To illustrate our tools, we describe a case study in the prediction of small genes, a notoriously hard problem in bioinformatics.
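To make the two notions concrete, here is a minimal Python sketch of the simplest case, in which an attribute is "decisive" for a data point when changing that attribute's value alone flips the classifier's prediction; the paper's full treatment considers minimal sets of attribute-value changes, which this sketch does not implement. All names here (decisive_attributes, attribute_weights, the toy classifier) are illustrative assumptions, not an API from the paper.

```python
from itertools import product

def decisive_attributes(clf, x, domains):
    """Attributes decisive for clf's prediction on data point x, in the
    simplified single-attribute sense: changing that attribute's value
    alone can flip the predicted class."""
    base = clf(x)
    decisive = []
    for attr, domain in domains.items():
        for value in domain:
            if value == x[attr]:
                continue
            changed = dict(x)         # copy x, alter one attribute
            changed[attr] = value
            if clf(changed) != base:  # prediction flipped: attr is decisive
                decisive.append(attr)
                break
    return decisive

def attribute_weights(clf, data, domains):
    """Weight of an attribute = fraction of data points for which it is
    decisive; higher weight means the attribute more often drives the
    classification."""
    counts = {attr: 0 for attr in domains}
    for x in data:
        for attr in decisive_attributes(clf, x, domains):
            counts[attr] += 1
    n = len(data)
    return {attr: c / n for attr, c in counts.items()}

# Toy example: a classifier on two binary attributes that predicts 1
# iff a == b. Flipping either attribute always changes the prediction,
# so both attributes get weight 1.0.
clf = lambda x: int(x["a"] == x["b"])
domains = {"a": [0, 1], "b": [0, 1]}
data = [{"a": i, "b": j} for i, j in product([0, 1], repeat=2)]
print(attribute_weights(clf, data, domains))  # {'a': 1.0, 'b': 1.0}
```

Note that the sketch treats the classifier as a black box, querying it only through predictions; this matches the abstract's claim that the tools apply to arbitrary classifiers on discrete data, not to any particular model family.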