Explainability Enhanced Object Detection Transformer With Feature Disentanglement

Wenlong Yu;Ruonan Liu;Dongyue Chen;Qinghua Hu
DOI: 10.1109/TIP.2024.3492733
Journal: IEEE Transactions on Image Processing, vol. 33, pp. 6439-6454
Published: 2024-11-12
URL: https://ieeexplore.ieee.org/document/10751766/
Citations: 0

Abstract

Explainability is a pivotal factor in determining whether a deep learning model can be authorized for critical applications. To enhance the explainability of end-to-end DEtection TRansformer (DETR) models, we introduce a disentanglement method that constrains the feature learning process, following a divide-and-conquer decoupling paradigm similar to how people understand complex real-world problems. We first demonstrate the entangled property of the features between the extractor and detector and find that the regression function is a key factor contributing to the deterioration of disentangled feature activation. These highly entangled features always activate local characteristics, making it difficult to cover the semantic information of an object, which also reduces the interpretability of single-backbone object detection models. Thus, an Explainability Enhanced object detection Transformer with feature Disentanglement (DETD) model is proposed, in which Tensor Singular Value Decomposition (T-SVD) is used to produce feature bases and the Batch-averaged Feature Spectral Penalization (BFSP) loss is introduced to constrain the disentanglement of the features and balance the semantic activation. The proposed method is applied across three prominent backbones, two DETR variants, and a CNN-based model. By combining two optimization techniques, extensive experiments on two datasets consistently demonstrate that the DETD model outperforms its counterparts in both object detection performance and feature disentanglement. Grad-CAM visualizations demonstrate the enhanced explainability of feature learning from the disentanglement view.
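The abstract names Tensor Singular Value Decomposition (T-SVD) as the tool that produces the feature bases. The paper's DETD-specific details and the exact BFSP loss are not reproduced here; the sketch below shows only the standard t-SVD (a matrix SVD of each frontal slice in the Fourier domain along the third mode), whose per-slice singular values form the kind of feature spectrum a spectral penalty could act on. The names `t_product` and `t_svd` are illustrative, not taken from the paper.

```python
import numpy as np

def t_product(A, B):
    """t-product of 3-way tensors: slice-wise matrix product in the
    Fourier domain along the third mode (returns a complex tensor)."""
    A_f = np.fft.fft(A, axis=2)
    B_f = np.fft.fft(B, axis=2)
    C_f = np.einsum('ijk,jlk->ilk', A_f, B_f)
    return np.fft.ifft(C_f, axis=2)

def t_svd(A):
    """Tensor SVD: factor A = U * S * Vh under the t-product by taking a
    matrix SVD of every frontal slice of fft(A, axis=2)."""
    n1, n2, n3 = A.shape
    r = min(n1, n2)
    U_f = np.zeros((n1, n1, n3), dtype=complex)
    S_f = np.zeros((n1, n2, n3), dtype=complex)
    Vh_f = np.zeros((n2, n2, n3), dtype=complex)
    A_f = np.fft.fft(A, axis=2)
    for k in range(n3):
        u, s, vh = np.linalg.svd(A_f[:, :, k])
        U_f[:, :, k], Vh_f[:, :, k] = u, vh
        S_f[:r, :r, k] = np.diag(s)  # per-slice singular-value spectrum
    # Back to the spatial domain; the factors stay complex in general,
    # so take the real part only after the final reconstruction.
    ifft = lambda X: np.fft.ifft(X, axis=2)
    return ifft(U_f), ifft(S_f), ifft(Vh_f)

# Usage: decompose a random feature tensor and check the reconstruction.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 5, 3))
U, S, Vh = t_svd(A)
A_rec = t_product(t_product(U, S), Vh).real
assert np.allclose(A, A_rec)
```

A loss in the spirit of the BFSP described above would then operate on the diagonal entries of the `S` slices (the feature spectrum), e.g. penalizing its concentration, but the precise batch-averaged form is defined in the paper itself.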