{"title":"Learning Orientation-Estimation Convolutional Neural Network for Building Detection in Optical Remote Sensing Image","authors":"Yongliang Chen, W. Gong, Chaoyue Chen, Weihong Li","doi":"10.1109/DICTA.2018.8615859","DOIUrl":null,"url":null,"abstract":"Benefiting from the great success of deep learning in computer vision, object detection with Convolutional Neural Network (CNN) based methods have drawn significant attentions. Various frameworks have been proposed which show awesome and robust performance for a large range of datasets. However, for building detection in remote sensing images, buildings always pose a diversity of orientation which makes it a challenge for the application of off-the-shelf methods to building detection in remote sensing images. In this work, we aim to integrate orientation regression into the popular axis-aligned bounding box to tackle this problem. To adapt the axis-aligned bounding boxes to arbitrarily orientated ones, we also develop an algorithm to estimate the Intersection Over Union (IOU) overlap between any two arbitrarily oriented boxes which is convenient to implement in Graphics Processing Unit (GPU) for fast computation. The proposed method utilizes CNN for both robust feature extraction and bounding box regression. We present our model in an end-to-end fashion making it easy to train. The model is formulated and trained to predict both orientation and location simultaneously obtaining tighter bounding box and hence, higher mean average precision (mAP). Experiments on remote sensing images of different scales shows a promising performance over the conventional one.","PeriodicalId":130057,"journal":{"name":"2018 Digital Image Computing: Techniques and Applications (DICTA)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 Digital Image Computing: Techniques and Applications (DICTA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DICTA.2018.8615859","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 8
Abstract
Benefiting from the great success of deep learning in computer vision, object detection with Convolutional Neural Network (CNN) based methods has drawn significant attention. Various frameworks have been proposed that show impressive and robust performance across a wide range of datasets. However, buildings in remote sensing images appear in a wide variety of orientations, which makes applying these off-the-shelf methods to building detection a challenge. In this work, we integrate orientation regression into the popular axis-aligned bounding box framework to tackle this problem. To adapt axis-aligned bounding boxes to arbitrarily oriented ones, we also develop an algorithm for estimating the Intersection over Union (IoU) overlap between any two arbitrarily oriented boxes that is convenient to implement on a Graphics Processing Unit (GPU) for fast computation. The proposed method uses a CNN for both robust feature extraction and bounding box regression. The model is presented in an end-to-end fashion, making it easy to train, and is formulated and trained to predict orientation and location simultaneously, yielding tighter bounding boxes and hence higher mean average precision (mAP). Experiments on remote sensing images of different scales show promising performance over the conventional axis-aligned approach.
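The oriented-box IoU mentioned in the abstract can be illustrated with a small CPU-side sketch; the paper describes its own GPU-friendly algorithm, and this is not that implementation. The sketch below assumes each box is parameterized as (cx, cy, w, h, theta) with theta in radians, and uses numpy for the corner geometry plus shapely for the exact polygon intersection; all function names here are illustrative.

```python
# Minimal sketch of IoU between two arbitrarily oriented boxes.
# Not the paper's GPU algorithm; a reference computation for clarity.
import numpy as np
from shapely.geometry import Polygon


def obb_corners(box):
    """Return the 4 corner points of an oriented box (cx, cy, w, h, theta)."""
    cx, cy, w, h, theta = box
    # Corners in the box's local frame (half-extents along local x/y axes).
    local = np.array([[ w / 2,  h / 2],
                      [-w / 2,  h / 2],
                      [-w / 2, -h / 2],
                      [ w / 2, -h / 2]])
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    # Rotate into the image frame and translate to the box center.
    return (local @ rot.T + np.array([cx, cy])).tolist()


def rotated_iou(box_a, box_b):
    """IoU of two oriented boxes via exact polygon intersection."""
    pa, pb = Polygon(obb_corners(box_a)), Polygon(obb_corners(box_b))
    inter = pa.intersection(pb).area
    union = pa.area + pb.area - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    a = (0.0, 0.0, 4.0, 2.0, 0.0)
    b = (0.0, 0.0, 4.0, 2.0, np.pi / 4)  # same box rotated by 45 degrees
    print(f"rotated IoU: {rotated_iou(a, b):.3f}")
```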