Assessing the use of attention weights to interpret BERT-based stance classification

Carlos Abel Córdova Sáenz, Karin Becker
Proceedings. IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, 2021-12-14. DOI: 10.1145/3486622.3493966

BERT models are currently state-of-the-art solutions for various tasks, including stance classification. However, these models are a black box for their users. Some proposals have leveraged the weights assigned by the internal attention mechanisms of these models for interpretability purposes. However, whether attention weights actually help interpret a model is still a matter of debate, with positions both in favor and against. This work proposes an attention-based interpretability mechanism to identify the words most influential on the stances predicted by BERT-based models. We target stances expressed on Twitter in Portuguese, and assess the proposed mechanism through a case study on stances about COVID-19 vaccination in the Brazilian context. The interpretation mechanism traces token attentions back to words, assigning a newly proposed metric referred to as absolute word attention. Through this metric, we assess several aspects to determine whether we can find words that are important for the classification and meaningful for the domain. We developed a broad experimental setting involving three datasets of tweets in Brazilian Portuguese and three BERT models that support this language. Our results are encouraging: we were able to identify 52-82% of the words with high absolute attention as contributing positively to stance classification. The interpretability mechanism proved helpful for understanding the influence of words on the classification, and it revealed intrinsic properties of the domain and representative arguments of the stances.
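The abstract describes tracing token-level attention back to words, since BERT's WordPiece tokenizer splits words into subword pieces. The paper's exact definition of absolute word attention is not given here, so the sketch below only illustrates one plausible aggregation under an assumption: the attention mass received by each subword piece (e.g. from the [CLS] row of an attention head) is summed over the pieces of a word to yield a word-level score. The function name and the example weights are hypothetical.

```python
# Hedged sketch: aggregate WordPiece-level attention into word-level scores.
# Assumption (not from the paper): a word's score is the sum of the attention
# weights of its subword pieces; '##'-prefixed tokens continue the previous word.

def word_attention(tokens, attentions):
    """Merge '##' continuation pieces into words, summing their attention."""
    words, scores = [], []
    for tok, att in zip(tokens, attentions):
        if tok.startswith("##") and words:
            # Continuation piece: append its text and attention to the current word.
            words[-1] += tok[2:]
            scores[-1] += att
        else:
            # New word starts here.
            words.append(tok)
            scores.append(att)
    return list(zip(words, scores))

# "lives" was split by the tokenizer into "li" + "##ves";
# its two pieces are recombined and their attention summed.
tokens = ["vaccines", "save", "li", "##ves"]
attentions = [0.40, 0.15, 0.25, 0.25]
print(word_attention(tokens, attentions))
# [('vaccines', 0.4), ('save', 0.15), ('lives', 0.5)]
```

In a real pipeline the per-token weights would come from the attention tensors of a BERT model run with attentions enabled; ranking words by the resulting score is then what lets one ask, as the paper does, whether high-attention words are meaningful for the stance.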