Deeper and Broader Multimodal Fusion: Cascaded Forest-of-Experts for Land Cover Classification
Authors: Guangxia Wang; Kuiliang Gao; Xiong You
Journal: IEEE Geoscience and Remote Sensing Letters, vol. 22, pp. 1-5
Publication date: 2024-12-17
DOI: 10.1109/LGRS.2024.3516854
URL: https://ieeexplore.ieee.org/document/10806532/
Citation count: 0
Abstract
Multimodal land cover classification (LCC) using optical and SAR images has become a research hotspot. However, two problems remain unsolved: the lack of a deep fusion mechanism and the neglect of the diversity of multimodal features. Inspired by ensemble learning, this letter proposes the cascaded multimodal forest-of-experts (CM2FEs) for deeper and broader fusion to further improve LCC performance. The proposed method first builds an expert tree, then combines multiple trees at the same level into a forest, and finally forms a cascaded forest across different levels. The design has three novel components: 1) a multimodal expert tree, built on linear projection and dynamic routing with multiple layers of experts, which acquires more discriminative multimodal features through deeper fusion; 2) a cascaded forest, formed by combining expert trees within the same level and across different levels, which ensembles the knowledge learned by different trees and generates more diverse multimodal features through broader fusion; and 3) two expert exchange strategies, which transfer knowledge between trees and further optimize feature fusion. Experiments show that the proposed method outperforms existing methods, improving mean IoU (mIoU) by at least 1.60%–3.25%.
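The abstract does not give the routing rule, layer sizes, or ensembling operator, so the following is only a rough sketch of the general idea: an expert tree as a stack of gated expert layers, and a forest as an ensemble of such trees. It substitutes a generic softmax-gated mixture of linear experts for the letter's dynamic routing and plain averaging for its forest combination; both are assumptions, as are all names (`ExpertLayer`, `ExpertTree`, `forest_forward`), the feature values, and the concatenation of optical and SAR features. The cascade across levels and the expert exchange strategies are not modeled.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a matrix (stored as a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def random_matrix(rows, cols, rng):
    """Random placeholder weights; a trained model would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class ExpertLayer:
    """One layer of linear experts with a softmax router (assumed design)."""

    def __init__(self, dim, num_experts, rng):
        self.router = random_matrix(num_experts, dim, rng)
        self.experts = [random_matrix(dim, dim, rng) for _ in range(num_experts)]

    def gates(self, x):
        # Input-dependent routing weights, one per expert, summing to 1.
        return softmax(matvec(self.router, x))

    def forward(self, x):
        weights = self.gates(x)
        outputs = [matvec(expert, x) for expert in self.experts]
        # Gate-weighted sum of the expert projections, squashed by tanh.
        return [
            math.tanh(sum(g * out[i] for g, out in zip(weights, outputs)))
            for i in range(len(x))
        ]


class ExpertTree:
    """Several expert layers applied in sequence (the 'deeper' direction)."""

    def __init__(self, dim, depth, num_experts, rng):
        self.layers = [ExpertLayer(dim, num_experts, rng) for _ in range(depth)]

    def forward(self, x):
        for layer in self.layers:
            x = layer.forward(x)
        return x


def forest_forward(trees, x):
    """Average the outputs of several trees (the 'broader' direction)."""
    outputs = [tree.forward(x) for tree in trees]
    return [sum(values) / len(trees) for values in zip(*outputs)]


rng = random.Random(0)
optical = [0.2, 0.7, 0.1]    # hypothetical optical feature vector
sar = [0.9, 0.4]             # hypothetical SAR feature vector
fused_input = optical + sar  # simple concatenation, an assumption

trees = [ExpertTree(len(fused_input), depth=2, num_experts=3, rng=rng)
         for _ in range(4)]
fused = forest_forward(trees, fused_input)
print(fused)
```

In this sketch, `depth` controls how many expert layers each tree stacks and the number of trees controls the width of the ensemble, which loosely mirrors the deeper and broader fusion directions named in the abstract.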