Multi-modal Remote Sensing Image Description Based on Word Embedding and Self-Attention Mechanism

2019 3rd International Symposium on Autonomous Systems (ISAS). Pub Date: 2019-05-01. DOI: 10.1109/ISASS.2019.8757726

Yuan Wang, Kuerban Alifu, Hongbing Ma, Junli Li, U. Halik, Yalong Lv


Citations: 1

Abstract



Traditional multi-modal models describe complex image content poorly when describing and identifying target objects in microwave images, and the sentences they generate are relatively simple. This paper proposes a multi-modal remote sensing semantic description and recognition method that combines a self-attention mechanism with the Ngram2vec word embedding technique. First, Ngram2vec is used to mine the semantic information and context features between the pixels to be identified in the neighborhood window and their adjacent pixels. Second, a self-attention mechanism is introduced to further learn the internal structure information of all pixels in the neighborhood window and generate a multidimensional representation. Finally, to avoid losing information as it passes between layers, densely connected networks (DenseNets) are used to integrate the information flow, and a multi-layer independent recurrent neural network is added between densely connected modules to address the vanishing gradient problem. Experimental results show that the method outperforms traditional deep learning methods in image description and recognition.
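The abstract does not give the exact attention formulation, so the following is only a minimal sketch of the general idea of self-attention over the pixels of a neighborhood window. It uses standard scaled dot-product attention, which may differ from the authors' variant. The function names, the 5x5 window, the 8-dimensional pixel features, and the random projection matrices are all illustrative assumptions, not details from the paper.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def window_self_attention(window, w_q, w_k, w_v):
    """Scaled dot-product self-attention over the pixels of one window.

    window: (k, k, c) patch of per-pixel feature vectors (assumed layout).
    w_q, w_k, w_v: (c, d) projections (random here, learned in practice).
    Returns the (k*k, d) attended features and (k*k, k*k) attention weights.
    """
    k1, k2, c = window.shape
    pixels = window.reshape(k1 * k2, c)         # flatten window into a pixel sequence
    q, key, v = pixels @ w_q, pixels @ w_k, pixels @ w_v
    scores = q @ key.T / np.sqrt(q.shape[-1])   # pairwise pixel-to-pixel similarity
    weights = softmax(scores, axis=-1)          # each row sums to 1
    return weights @ v, weights


rng = np.random.default_rng(0)
window = rng.normal(size=(5, 5, 8))             # hypothetical 5x5 window, 8 features per pixel
w_q, w_k, w_v = (rng.normal(size=(8, 16)) for _ in range(3))
attended, weights = window_self_attention(window, w_q, w_k, w_v)
center = attended[(5 * 5) // 2]                 # representation of the central pixel
```

Here every pixel in the window attends to every other pixel, so the vector for the central pixel (the one to be identified) mixes in information from its whole neighborhood. In a full model, the per-pixel input features would come from the embedding stage rather than from random values.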
