Object detection techniques based on deep learning for aerial remote sensing images: a survey
Shi Zhenghao, Wu Chenwei, Li Chengjian, You Zhenzhen, Wang Quan, Ma Chengcheng
中国图象图形学报 (Journal of Image and Graphics), published 2023-01-01, JCR Q3 (Computer Science)
DOI: 10.11834/jig.221085 (https://doi.org/10.11834/jig.221085)
Citations: 0
Abstract
Object detection in aerial remote sensing images aims to locate and identify targets of interest in remote sensing images. It is a key technology for the intelligent interpretation of aerial remote sensing images and has important applications in intelligence reconnaissance, disaster rescue, and resource exploration. However, aerial remote sensing images are large, their targets are small and densely packed, appear at arbitrary orientations, are easily occluded, and are unevenly distributed across categories, and their backgrounds are complex, so object detection in such images remains a highly challenging task. Methods based on deep convolutional neural networks have attracted growing attention because of their high accuracy and fast processing speed. To advance deep-learning-based object detection for aerial remote sensing images, this paper systematically reviews and summarizes current mainstream methods, particularly those proposed from 2020 to 2022. It first traces the evolution of deep-learning-based object detection, then analyzes and summarizes representative algorithms among detectors based on convolutional neural networks and on Transformers. It next categorizes the improvement strategies developed for different remote sensing application scenarios, analyzes the ideas and characteristics of typical algorithms, introduces the publicly available datasets for aerial remote sensing object detection, and presents experimental comparisons of typical algorithms. Finally, it identifies open problems in current research and discusses future research and development trends.

With the successful development of aerospace technology, high-resolution remote sensing images have come into routine use in research. Earlier low-resolution images limited how much information researchers could interpret from them. By comparison, today's high-resolution remote sensing images contain rich geographic and object detail, as well as rich spatial structure and semantic information, and can therefore greatly promote research in this field. Object detection in aerial remote sensing images aims to provide the category and location of targets of interest and to supply evidence for further interpretation and reasoning. This technology is crucial for aerial remote sensing image interpretation and has important applications in intelligence reconnaissance, target surveillance, and disaster rescue.

Early remote sensing object detection relied mainly on manual interpretation. The results were strongly affected by subjective factors, such as the experience and energy of the interpreters, and timeliness was low. As machine learning developed, various machine-learning-based detection methods were proposed. Traditional machine-learning-based techniques generally generate sliding windows, use manually designed models to extract features such as spectrum, gray value, texture, and shape, and then feed those features into classifiers such as the support vector machine (SVM) and adaptive boosting (AdaBoost). Because the feature extraction models are designed for specific targets, these methods are highly interpretable but suffer from weak feature expression capability, poor generalization, time-consuming computation, and low accuracy. These weaknesses make it difficult to meet the needs of accurate and efficient detection in complex and variable application scenarios. In recent years, deep learning techniques such as deep convolutional neural networks and generative adversarial networks have been widely applied to natural image object detection, classification, and recognition, and have performed excellently in large-scale natural scene object detection. As a result, the application of deep learning to remote sensing image processing has received considerable attention and become a research hotspot, and many excellent works have emerged.

Object detection in aerial remote sensing images faces several main challenges: large, high-resolution images, interference from complex backgrounds, diverse target orientations, dense targets, dramatic scale changes, and small targets. Corresponding model improvements exist for each of these challenges.

For large, high-resolution images, the scale of the targets is widely distributed, and the detail of small targets must be preserved. The most commonly used approach is therefore to split the image during data preprocessing: the large image is cut into tiles of a regular size, which are sent to the object detection algorithm one by one. In post-processing, the detection results of all tiles are stitched together and restored to the full image to complete detection on the whole image.

Ultrahigh-resolution aerial remote sensing images also have complex backgrounds. The targets to be detected are easily confused with various similar objects, and targets of the same kind can look different from one another, so false detections occur easily. The usual methods for handling complex background interference fall into two types: extracting contextual information from the image and improving the attention mechanism.

Because aerial remote sensing images are all taken from a top-down view, the targets in them appear in many orientations. Moreover, their aspect ratios vary more widely than those of targets in natural images, and in dense scenes the targets interfere with one another seriously, which affects the accuracy of the final localization and classification. Three practical improvement ideas are currently available for the problems of orientation diversity and dense arrangement: image rotation augmentation, the design of rotation-invariant modules, and the design of accurate position regression methods.

To cope with drastic changes in target scale, the model needs good scale invariance; that is, it must retain high recognition ability even when multiple targets vary drastically across multiple scales. The common improvement scheme is multiscale feature fusion. For small target detection, current algorithms are mainly improved through feature enhancement, detection on multilevel feature maps, and the design of precise positioning strategies.

In summary, the challenges and difficulties of object detection in aerial remote sensing imagery do not exist independently. For example, the large size and high resolution of aerial remote sensing images inevitably lead to complex backgrounds and a sharp increase in the categories and number of small targets to be detected. Moreover, most small targets are susceptible to strong interference from the complex background, which lowers localization and classification accuracy. In addition, improvements made for one challenge also help with others; for example, improvements in multiscale target feature enhancement benefit almost all of the challenges. The problems in the field must therefore be analyzed and addressed from a global perspective.

Based on a thorough study of the latest reviews and related research, this study systematically compares and summarizes deep learning object detection algorithms for aerial remote sensing images, particularly methods proposed in China and internationally over the past three years. It aims to provide a reference for research on object detection in aerial remote sensing images and to help scholars comprehensively understand the latest progress in this area. First, the study introduces deep-learning-based image object detection models. Then, it systematically reviews deep-learning-based detection methods for aerial remote sensing images, introduces the publicly available datasets for aerial remote sensing image object detection, and compares the performance of typical methods through experiments. Finally, it presents the problems in current research on aerial remote sensing image object detection and discusses future research and development trends.
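The split-and-stitch preprocessing described in the abstract can be sketched as follows. This is only an illustrative sketch, not the procedure of any particular method covered by the survey: the function names (tile_origins, iou, merge_tile_detections), the overlap between neighbouring tiles, the greedy non-maximum suppression used to merge duplicates, and the 0.5 IoU threshold are all assumptions introduced here. It also uses axis-aligned boxes for brevity, whereas oriented targets would need a rotated-box IoU.

```python
from typing import List, Tuple

# (x1, y1, x2, y2, score); axis-aligned boxes are an assumption of this sketch.
Box = Tuple[float, float, float, float, float]


def tile_origins(length: int, tile: int, overlap: int) -> List[int]:
    """Start offsets of tiles of size `tile` that cover [0, length)."""
    if overlap >= tile:
        raise ValueError("overlap must be smaller than the tile size")
    if length <= tile:
        return [0]
    stride = tile - overlap
    origins = list(range(0, length - tile, stride))
    origins.append(length - tile)  # last tile sits flush with the image border
    return origins


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def merge_tile_detections(
    per_tile: List[Tuple[Tuple[int, int], List[Box]]], iou_thr: float = 0.5
) -> List[Box]:
    """Shift tile-local boxes to full-image coordinates, then drop duplicates.

    `per_tile` holds ((x_offset, y_offset), boxes_in_tile_coordinates) pairs.
    Duplicates from overlapping tiles are removed with greedy NMS.
    """
    shifted = [
        (x1 + ox, y1 + oy, x2 + ox, y2 + oy, score)
        for (ox, oy), boxes in per_tile
        for (x1, y1, x2, y2, score) in boxes
    ]
    shifted.sort(key=lambda box: box[4], reverse=True)  # highest score first
    kept: List[Box] = []
    for box in shifted:
        if all(iou(box, other) < iou_thr for other in kept):
            kept.append(box)
    return kept
```

In use, tile_origins would be called once per image axis to get the crop offsets, each crop would be passed to whatever detector is being evaluated, and merge_tile_detections would map the per-tile outputs back onto the full image. Overlapping the tiles is one common way to avoid cutting a target in half at a tile boundary, at the cost of duplicate detections that then have to be suppressed.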
Journal of Image and Graphics (中国图象图形学报)
Computer Science: Computer Graphics and Computer-Aided Design
CiteScore: 1.20
Self-citation rate: 0.00%
Articles published: 6776
About the journal:
Journal of Image and Graphics (ISSN 1006-8961, CN 11-3758/TB, CODEN ZTTXFZ) is an authoritative academic journal supervised by the Chinese Academy of Sciences and co-sponsored by the Institute of Space and Astronautical Information Innovation of the Chinese Academy of Sciences (ISIAS), the Chinese Society of Image and Graphics (CSIG), and the Beijing Institute of Applied Physics and Computational Mathematics (BIAPM). The journal integrates high-tech theories, technical methods and industrialisation of applied research results in computer image graphics, and mainly publishes innovative and high-level scientific research papers on basic and applied research in image graphics science and its closely related fields. The form of papers includes reviews, technical reports, project progress, academic news, new technology reviews, new product introduction and industrialisation research. The content covers a wide range of fields such as image analysis and recognition, image understanding and computer vision, computer graphics, virtual reality and augmented reality, system simulation, animation, etc., and theme columns are opened according to the research hotspots and cutting-edge topics.
Journal of Image and Graphics reaches a wide range of readers, including scientific and technical personnel, enterprise managers, and postgraduate and undergraduate students working in the fields of national defence, military, aviation, aerospace, communications, electronics, automotive, agriculture, meteorology, environmental protection, remote sensing, mapping, oil fields, construction, transportation, finance, telecommunications, education, medical care, film and television, and art.
Journal of Image and Graphics is indexed in many important domestic and international scientific literature databases, including the EBSCO database in the United States, the JST database in Japan, the Scopus database in the Netherlands, China Science and Technology Thesis Statistics and Analysis (Annual Research Report), China Science Citation Database (CSCD), China Academic Journal Network Publishing Database (CAJD), China Academic Journal Abstracts, Chinese Science Abstracts (Series A), China Electronic Science Abstracts, Chinese Core Journals Abstracts, Chinese Academic Journals on CD-ROM, and China Academic Journals Comprehensive Evaluation Database.