Graph-Based Spatio-Temporal Semantic Reasoning Model for Anti-Occlusion Infrared Aerial Target Recognition
Xi Yang; Shaoyi Li; Saisai Niu; Binbin Yan; Zhongjie Meng
IEEE Transactions on Multimedia, vol. 26, pp. 10530-10544. Publication date: 2024-03-31.
DOI: 10.1109/TMM.2024.3408051
URL: https://ieeexplore.ieee.org/document/10543172/
Impact factor: 8.4 (JCR Q1, Computer Science, Information Systems)
Citations: 0
Abstract
Infrared target recognition with anti-interference capability in complex battlefield environments is one of the key technologies enabling the precision strike capability of aircraft. Infrared-guided aircraft currently face complex interference from natural backgrounds and artificial decoys, which degrades infrared target recognition performance. A particular challenge is the strong interference that arises when target maneuvering is combined with the dense, continuous, and coordinated deployment of infrared decoys. To address extreme cases in which occlusion causes complete loss of target feature information and makes the target unidentifiable, we develop an anti-interference recognition method based on a visually inspired Spatio-Temporal Semantic Reasoning Model (STSRM). First, inspired by the functional characteristics of visual semantic reasoning, the STSRM reduces reasoning about relationships among multiple regions to modeling relationships between the corresponding region node features in a graph-based module. Second, we construct an anti-occlusion target recognition model based on STSRM, which introduces a reasoning graph module that connects the region nodes to infer semantic information and predict targets between regions. Tests on the simulated infrared dataset established in this paper indicate that the proposed model makes accurate target predictions under large-scale or full occlusion, improving mAP and mIoU scores by 13.9% and 3.1%, respectively, relative to a current advanced method.
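The abstract describes reasoning over region node features in a graph-based module but gives no equations. As a rough illustration only, the sketch below shows one generic graph-convolution-style update over region nodes. It is not the authors' STSRM: the dot-product affinity, the row-wise softmax, the residual connection, and the names `reason_over_regions` and `weight` are all assumptions made for this example, and the temporal part of the model is not represented.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def reason_over_regions(features, weight):
    """One hypothetical graph-reasoning step over region nodes.

    features: list of N region feature vectors, each of length d
    weight:   d x d matrix standing in for a learned parameter (assumption)
    Returns (updated_features, adjacency).
    """
    n = len(features)
    d = len(features[0])

    # Pairwise affinity between regions (assumed: plain dot product).
    affinity = [[dot(features[i], features[j]) for j in range(n)]
                for i in range(n)]

    # Normalize each row so every node's edge weights sum to 1.
    adjacency = [softmax(row) for row in affinity]

    # Aggregate neighbor features along the graph edges.
    aggregated = [[sum(adjacency[i][j] * features[j][k] for j in range(n))
                   for k in range(d)]
                  for i in range(n)]

    # Linear transform, ReLU, then a residual connection to the input.
    updated = []
    for agg, original in zip(aggregated, features):
        transformed = [sum(agg[i] * weight[i][k] for i in range(d))
                       for k in range(d)]
        updated.append([max(0.0, t) + o
                        for t, o in zip(transformed, original)])
    return updated, adjacency


if __name__ == "__main__":
    # Two visible regions and one fully occluded region (all-zero features).
    features = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
    identity = [[1.0, 0.0], [0.0, 1.0]]
    updated, adjacency = reason_over_regions(features, identity)
    print(updated[2])  # the occluded node now carries neighbor information
```

In this toy setup the occluded node has zero affinity to every node, so its softmax row is uniform and it receives the average of all region features. This is the basic intuition behind inferring a hidden region from its context, under the assumptions stated above.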
About the journal:
The IEEE Transactions on Multimedia delves into diverse aspects of multimedia technology and applications, covering circuits, networking, signal processing, systems, software, and systems integration. The scope aligns with the Fields of Interest of the sponsors, ensuring a comprehensive exploration of research in multimedia.