Enhancing object recognition: The role of object knowledge decomposition and component-labeled datasets

Impact factor 5.5; Zone 2 (Computer Science); JCR Q1 (Computer Science, Artificial Intelligence)
Nuoye Xiong, Ning Wang, Hongsheng Li, Guangming Zhu, Liang Zhang, Syed Afaq Ali Shah, Mohammed Bennamoun
Neurocomputing, Volume 617, Article 128969. Published 2024-11-22. DOI: 10.1016/j.neucom.2024.128969
https://www.sciencedirect.com/science/article/pii/S0925231224017405
Citations: 0

Abstract

Deep learning models’ decision-making processes can be opaque, which often raises concerns about their reliability. To address this, we introduce the Object Knowledge Decomposition and Components Label Dataset (OKD-CL), designed to improve the interpretability and accuracy of object recognition models. The dataset covers 99 categories from PartImageNet, each with a clear physical structure that aligns with human visual concepts. It is organized hierarchically: every category is described by Abstract Component Knowledge (ACK) descriptions, and each image instance comes with Explicit Visual Knowledge (EVK) masks that highlight the appearance of its visual components. Evaluating multiple deep neural networks guided by both ACK and EVK (the dual-knowledge-guidance approach), we observed higher accuracy and a higher Foreground Reasoning Ratio (FRR), which supports the effectiveness of the knowledge-guided method. On the Hard-ImageNet dataset, this approach reduced the model’s reliance on incorrect feature assumptions without sacrificing classification accuracy. The hierarchical comprehension encouraged by OKD-CL is crucial for minimizing incorrect feature associations and strengthening model robustness. The code and dataset are available at https://github.com/XiGuaBo/OKD-CL.
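The abstract does not define how the Foreground Reasoning Ratio is computed, so the following is only an illustrative sketch of one plausible reading: the share of a model's attribution (saliency) mass that falls on pixels covered by a foreground component mask. The function name `foreground_ratio`, the nested-list inputs, and the use of absolute attribution values are all assumptions made for illustration, not the authors' definition or code.

```python
def foreground_ratio(attribution, mask):
    """Share of total absolute attribution that lies on foreground pixels.

    Hypothetical helper: `attribution` is a 2D list of per-pixel saliency
    scores and `mask` is a 2D list of 0/1 foreground flags of the same shape.
    This is an assumed stand-in for an FRR-style metric, not the paper's.
    """
    if len(attribution) != len(mask):
        raise ValueError("attribution and mask must have the same height")

    total = 0.0   # attribution mass over the whole image
    inside = 0.0  # attribution mass restricted to foreground pixels
    for attr_row, mask_row in zip(attribution, mask):
        if len(attr_row) != len(mask_row):
            raise ValueError("attribution and mask must have the same width")
        for score, is_foreground in zip(attr_row, mask_row):
            weight = abs(score)
            total += weight
            if is_foreground:
                inside += weight

    # An all-zero attribution map carries no evidence either way.
    return inside / total if total > 0 else 0.0


# Toy 2x2 example: one of four units of attribution mass is on the object.
attribution = [[0.0, 1.0],
               [3.0, 0.0]]
mask = [[0, 1],
        [0, 0]]
ratio = foreground_ratio(attribution, mask)  # 1.0 / 4.0 = 0.25
```

Under this reading, a higher ratio means the model's evidence is concentrated on the object rather than on the background, which matches the direction of improvement the abstract reports.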
Source journal: Neurocomputing
Category: Engineering and Technology, Computer Science: Artificial Intelligence
CiteScore: 13.10
Self-citation rate: 10.00%
Articles published: 1382
Review time: 70 days
About the journal: Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Its essential topics are neurocomputing theory, practice, and applications.