Model Agnostic Contrastive Explanations for Classification Models

Authors: Amit Dhurandhar; Tejaswini Pedapati; Avinash Balakrishnan; Pin-Yu Chen; Karthikeyan Shanmugam; Ruchir Puri
DOI: 10.1109/JETCAS.2024.3486114
Journal: IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 14, no. 4, pp. 789-798
Published: 2024-10-24
URL: https://ieeexplore.ieee.org/document/10734168/
Citations: 0
Abstract
Extensive surveys on explanations that are suitable for humans claim that being contrastive is one of an explanation's most important traits. A few methods have been proposed to generate contrastive explanations for differentiable models such as deep neural networks, where one has complete access to the model. In this work, we propose a method, the Model Agnostic Contrastive Explanations Method (MACEM), that can generate contrastive explanations for any classification model where one is only able to query the class probabilities for a desired input. This allows us to generate contrastive explanations not only for neural networks, but also for models such as random forests, boosted trees, and even arbitrary ensembles, which remain among the state of the art when learning on tabular data. Our method is also applicable to scenarios where only black-box access to the model is provided, meaning that we can obtain only the predictions and prediction probabilities. With the advent of larger models, it is increasingly common to work in this black-box setting, where the user does not necessarily have access to the model weights or parameters and can interact with the model only through an API. To obtain meaningful explanations in this setting, we propose a principled and scalable approach to handling real and categorical features, leading to novel formulations for computing the pertinent positives and pertinent negatives that form the essence of a contrastive explanation. A detailed treatment of this nature, focused on scalability and on handling different data types, was not performed in previous work, which assumed all features to be positive real valued, with zero indicating the least interesting value. We part with this strong implicit assumption and generalize these methods so as to be applicable across a much wider range of problem settings. We validate our approach quantitatively as well as qualitatively on public datasets covering diverse domains.
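To make the black-box setting concrete, the sketch below treats a scikit-learn random forest as a model reachable only through its class-probability output and searches for a pertinent negative, i.e. a small feature change that flips the prediction. This is a hypothetical illustration of the problem the abstract describes, not the authors' MACEM algorithm; the random search, the helper find_pertinent_negative, and parameters such as n_trials and step are assumptions made for this sketch.

```python
# Minimal sketch of the black-box setting described above, NOT the paper's
# actual MACEM implementation: the explainer may only query class
# probabilities, and we look for a "pertinent negative", i.e. a sparse
# change to the input that flips the predicted class. The random search is
# a crude stand-in for the paper's optimization; all names and
# hyperparameters here are invented for illustration.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(random_state=0).fit(X, y)

def predict_proba(x):
    # The only access the explainer has: class probabilities for one input.
    return model.predict_proba(x.reshape(1, -1))[0]

def find_pertinent_negative(x, n_trials=2000, step=0.5):
    # Keep the sparsest (smallest L1) perturbation that changes the class.
    base_class = int(np.argmax(predict_proba(x)))
    best_delta, best_l1 = None, np.inf
    for _ in range(n_trials):
        # Perturb a random subset of features to encourage sparse changes.
        mask = rng.random(x.shape) < 0.3
        delta = mask * rng.normal(scale=step, size=x.shape)
        if int(np.argmax(predict_proba(x + delta))) != base_class:
            l1 = np.abs(delta).sum()
            if l1 < best_l1:
                best_delta, best_l1 = delta, l1
    return best_delta

x0 = X[0]
delta = find_pertinent_negative(x0)
print("original class:", np.argmax(predict_proba(x0)))
if delta is not None:
    print("pertinent negative (feature changes):", np.round(delta, 2))
    print("class after change:", np.argmax(predict_proba(x0 + delta)))
else:
    print("no class-changing perturbation found")
```

In the paper's actual method, this naive search would be replaced by a principled, scalable optimization that also handles categorical features; the sketch only conveys the query-access constraint under which MACEM operates.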
Journal overview:
The IEEE Journal on Emerging and Selected Topics in Circuits and Systems is published quarterly and solicits, with particular emphasis on emerging areas, special issues on topics that cover the entire scope of the IEEE Circuits and Systems (CAS) Society, namely the theory, analysis, design, tools, and implementation of circuits and systems, spanning their theoretical foundations, applications, and architectures for signal and information processing.