Yuming Zhang, Fangfang Qiang, Wujie Zhou, Weiqing Yan, Lv Ye
Expert Systems with Applications, vol. 282, Article 127670. Published 2025-04-17. DOI: 10.1016/j.eswa.2025.127670. https://www.sciencedirect.com/science/article/pii/S0957417425012928
IIBNet: Inter- and Intra-Information balance network with Self-Knowledge distillation and region attention for RGB-D indoor scene parsing
Red–green–blue-depth (RGB-D) indoor scene parsing is an important research topic in computer vision. In this task, features taken from different layers of the backbone are further processed to produce prediction maps. These maps differ in size and in the information they contain, yet they are all supervised with the same labels, which leads to large inconsistencies. Moreover, in the training images, several categories make up only a relatively small proportion of the data, which degrades training results. To address these problems, this article proposes an inter- and intra-information balance network (IIBNet). First, to offset the category imbalance within features, a region balance module built on a region attention module merges four pairs of features obtained from the RGB and depth backbone networks, adjusting the proportion of the feature allocated to each category. Second, to address the imbalance of feature information across layers, information is transferred through two branches, reducing the differences in information between layers. The first branch is a channel-wise information-interaction branch, which uses self-knowledge distillation (Self-KD) as the tool for information transfer. Self-KD, in which the student and teacher networks are the same network, allows features to learn from each other. The second branch is a spatial-wise information-interaction branch, which transfers the lowest-level feature information to the higher-level features. In extensive tests on two large indoor-scene-parsing datasets, IIBNet outperforms state-of-the-art methods on three metrics. The source code and results are available at https://github.com/kolaloaver/IIBNet.
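To make the Self-KD idea more concrete, the following is a minimal sketch of a generic channel-wise distillation loss between two feature levels of the same network. It is not taken from the paper or the IIBNet repository: the function names, the use of per-channel descriptors (for example, globally averaged activations), the temperature value, and the KL-divergence form are all assumptions chosen for illustration.

```python
import math


def softmax(values, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def channel_self_kd_loss(student_channels, teacher_channels, temperature=2.0):
    """Hypothetical channel-wise self-distillation loss (illustrative only).

    Both inputs are per-channel descriptors taken from two layers of the
    same network, so no separate teacher model is involved. In a real
    framework the teacher side would be treated as a fixed target.
    """
    p_teacher = softmax(teacher_channels, temperature)
    p_student = softmax(student_channels, temperature)
    # The temperature**2 factor is the usual scaling in distillation losses.
    return temperature ** 2 * kl_divergence(p_teacher, p_student)


# Assumed toy descriptors for a shallow and a deep feature level.
shallow = [0.2, 1.5, -0.3, 0.8]
deep = [0.1, 1.7, -0.5, 0.9]
print(channel_self_kd_loss(shallow, deep))
```

The loss is zero when the two channel distributions match and grows as they diverge, which is one simple way to push features from different layers toward each other.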
About the journal:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.