DARE: Towards Robust Text Explanations in Biomedical and Healthcare Applications

Annual Meeting of the Association for Computational Linguistics. Pub Date: 2023-07-05. DOI: 10.48550/arXiv.2307.02094

Adam Ivankay, Mattia Rigotti, P. Frossard

Citation count: 1

Abstract

As deep neural networks have been successfully deployed in a growing number of application domains, the need to unravel their black-box nature has grown significantly. Several methods have been introduced to provide insight into the inference process of deep neural networks. However, most of these explainability methods have been shown to be brittle under adversarial perturbations of their inputs in the image and generic textual domains. In this work we show that this phenomenon extends to specific and important high-stakes domains such as biomedical datasets. In particular, we observe that the robustness of explanations should be characterized in terms of two properties: faithfulness, the accuracy of the explanation in linking a model's inputs to its decisions, and plausibility, the relevance of the explanation from the perspective of domain experts. This is crucial to prevent explanations that are inaccurate but still look convincing in the context of the domain at hand. To this end, we show how to adapt current attribution robustness estimation methods to a given domain so that they take domain-specific plausibility into account. This results in our DomainAdaptiveAREstimator (DARE) attribution robustness estimator, which allows us to properly characterize the domain-specific robustness of faithful explanations. Next, we provide two methods, adversarial training and FAR training, to mitigate the brittleness characterized by DARE, allowing us to train networks that display robust attributions. Finally, we empirically validate our methods with extensive experiments on three established biomedical benchmarks.
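
The abstract does not spell out how attribution robustness is measured, so the following is only a minimal sketch of the general idea, not the DARE estimator itself. It assumes a hypothetical toy linear bag-of-words classifier, a hand-picked synonym substitution as the perturbation, and cosine similarity as the comparison between attribution vectors. All names, weights, and tokens are illustrative assumptions.

```python
import math

# Hypothetical toy model (an assumption, not from the paper):
# a linear bag-of-words classifier with hand-set weights.
WEIGHTS = {"fever": 1.2, "pyrexia": 0.2, "cough": 0.8, "mild": -0.5, "patient": 0.1}
BIAS = -0.5


def predict(tokens):
    """Return class 1 if the linear score is positive, else class 0."""
    score = BIAS + sum(WEIGHTS.get(t, 0.0) for t in tokens)
    return 1 if score > 0 else 0


def attributions(tokens):
    """Per-token attribution; for this linear model it is simply the token's weight."""
    return [WEIGHTS.get(t, 0.0) for t in tokens]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def attribution_shift(original, perturbed):
    """Return (prediction_unchanged, similarity of the two attribution vectors)."""
    unchanged = predict(original) == predict(perturbed)
    return unchanged, cosine(attributions(original), attributions(perturbed))


# A domain-plausible substitution: "pyrexia" is a clinical synonym for "fever".
original = ["patient", "fever", "cough"]
perturbed = ["patient", "pyrexia", "cough"]
unchanged, similarity = attribution_shift(original, perturbed)
print(unchanged, round(similarity, 3))
```

In this toy setting, an explanation would count as brittle when the prediction stays the same but the similarity drops noticeably after a substitution that a domain expert would consider meaning-preserving.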
