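IIBNet: Inter- and Intra-Information balance network with Self-Knowledge distillation and region attention for RGB-D indoor scene parsing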

IF 7.5 | Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Expert Systems with Applications | Pub Date: 2025-04-17 | DOI: 10.1016/j.eswa.2025.127670

Yuming Zhang, Fangfang Qiang, Wujie Zhou, Weiqing Yan, Lv Ye

Volume 282, Article 127670. Full text: https://www.sciencedirect.com/science/article/pii/S0957417425012928

Citations: 0

Abstract


IIBNet: Inter- and Intra-Information balance network with Self-Knowledge distillation and region attention for RGB-D indoor scene parsing

Red–green–blue-depth (RGB-D) indoor scene parsing is a vital research topic in computer vision. Here, features acquired from different layers of the backbone are further processed to obtain a better prediction image. These images have different sizes and contain different information, but the labels used to supervise them are the same, resulting in large inconsistencies. Moreover, in training images, several categories account for only a relatively small proportion of the total, resulting in poor training results. To address these problems, this article proposes an inter- and intra-information balance network (IIBNet). First, to offset the category imbalance within features, a region balance module using a region attention module is employed to merge four pairs of features obtained from the RGB and depth backbone networks, adjusting the proportion of different categories allocated in the feature. Second, to address the problem of feature-information imbalance across layers, information is transferred between two branches, reducing the diversity of information across different layers. The first branch is a channel-wise information-interaction branch, which employs self-knowledge distillation (Self-KD) as a tool for information transfer. Self-KD, in which the student and teacher networks are the same, allows features to learn from each other. The second branch is a spatial-wise information-interaction branch, which transfers the lowest-level feature information to the higher-level features. Based on extensive testing on two large indoor-scene-parsing datasets, IIBNet is observed to outperform state-of-the-art methods on three metrics. The source code and results are available at https://github.com/kolaloaver/IIBNet.
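The abstract gives no formulas for the channel-wise Self-KD branch, so the following is only a minimal sketch of one common way to let features from different layers "learn from each other". It is not the authors' implementation (the linked repository is the reference for that). Everything in it is an assumption made for illustration: the function names `channel_distribution`, `kl_divergence`, and `channel_self_kd_loss`, the use of global average pooling, the `temperature` parameter, the symmetric KL divergence, and the premise that all feature maps have already been projected to the same number of channels.

```python
import math

def channel_distribution(feature, temperature=1.0):
    """Turn a feature map into a probability distribution over channels.

    feature: list of channels, each channel a 2-D list (H x W) of floats.
    Assumption: global average pooling followed by a softmax over channels.
    """
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature]
    scaled = [p / temperature for p in pooled]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def channel_self_kd_loss(features, temperature=1.0):
    """Mean symmetric KL between the channel distributions of every layer pair.

    Spatial sizes may differ between layers because pooling removes them;
    channel counts must match (hypothetical pre-projection step).
    """
    dists = [channel_distribution(f, temperature) for f in features]
    loss, pairs = 0.0, 0
    for i in range(len(dists)):
        for j in range(i + 1, len(dists)):
            loss += kl_divergence(dists[i], dists[j]) + kl_divergence(dists[j], dists[i])
            pairs += 1
    return loss / pairs if pairs else 0.0

# Two toy "layers" with 2 channels each and different spatial sizes.
low_level = [[[1.0, 1.0], [1.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]]]  # 2 x 2 x 2
high_level = [[[0.0]], [[3.0]]]                                    # 2 x 1 x 1
print(channel_self_kd_loss([low_level, high_level]))
```

The loss is zero when all layers agree on their per-channel statistics and grows as they diverge, which is one way to reduce the "diversity of information across different layers" that the abstract mentions. In an actual training loop, the distribution acting as teacher would typically be excluded from gradient computation.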


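Source journal

Expert Systems with Applications (Engineering & Technology; Engineering: Electrical & Electronic)

CiteScore: 13.80
Self-citation rate: 10.60%
Articles published: 2045
Review time: 8.7 months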

About the journal: Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.