Multi-modal Remote Sensing Image Description Based on Word Embedding and Self-Attention Mechanism

Yuan Wang, Kuerban Alifu, Hongbing Ma, Junli Li, U. Halik, Yalong Lv
Published in: 2019 3rd International Symposium on Autonomous Systems (ISAS), 2019-05-01
DOI: 10.1109/ISASS.2019.8757726 (https://doi.org/10.1109/ISASS.2019.8757726)
Citations: 1

Abstract

Traditional multi-modal models describe complex image content poorly when describing and identifying target objects in microwave images, and the sentences they generate are relatively simple. This paper proposes a multi-modal remote sensing semantic description and recognition method that combines a self-attention mechanism with the Ngram2vec word embedding technique. First, Ngram2vec is used to mine the semantic information and context features relating each pixel to be identified in the neighborhood window to its adjacent pixels. Second, a self-attention mechanism is introduced to further learn the internal structure information of all pixels in the neighborhood window and to generate a multidimensional representation. Finally, to avoid losing information passed between layers, densely connected networks (DenseNets) are used to integrate the information flow, and a multi-layer independent recurrent neural network is added between the densely connected modules to address the vanishing gradient problem. Experimental results show that this method outperforms traditional deep learning methods in image description and recognition.
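
The self-attention step over a neighborhood window can be illustrated with a small sketch. The abstract does not say which form of attention the authors use, so the code below assumes the common scaled dot-product form. The function names, the 3x3 window, the 16-dimensional pixel embeddings, and the random projection matrices are all hypothetical choices made for illustration, not the paper's configuration. The sketch uses only the Python standard library so that it stays dependency-free.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(s - peak) for s in row]
    total = sum(exps)
    return [e / total for e in exps]


def window_self_attention(features, w_q, w_k, w_v):
    """Scaled dot-product self-attention over the pixels of one window.

    features: one embedding (list of floats) per pixel in the window.
    w_q, w_k, w_v: projection matrices (assumed here, not from the paper).
    Returns the attended pixel representations and the attention weights.
    """
    q = matmul(features, w_q)
    k = matmul(features, w_k)
    v = matmul(features, w_v)
    scale = math.sqrt(len(k[0]))
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scores = matmul(q, k_t)
    # Each pixel attends to every pixel in the window, itself included.
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned embeddings or projection weights."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
window = random_matrix(9, 16, rng)  # 3x3 window, 16-dim embedding per pixel
w_q = random_matrix(16, 8, rng)
w_k = random_matrix(16, 8, rng)
w_v = random_matrix(16, 8, rng)

context, weights = window_self_attention(window, w_q, w_k, w_v)
print(len(context), len(context[0]))
```

Each row of `weights` sums to 1 and records how strongly one pixel draws on every other pixel in the window, which is one way to read the abstract's "internal structure information". In a trained model the projection matrices would be learned rather than sampled at random.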