{"title":"Photogrammetry engaged automated image labeling approach","authors":"Jonathan Boyack , Jongseong Brad Choi","doi":"10.1016/j.visinf.2025.100239","DOIUrl":null,"url":null,"abstract":"<div><div>Deep learning models require many instances of training data to be able to accurately detect the desired object. However, the labeling of images is currently conducted manually due to the inclusion of irrelevant scenes in the original images, especially for the data collected in a dynamic environment such as from drone imagery. In this work, we developed an automated extraction of training data set using photogrammetry. This approach works with continuous and arbitrary collection of visual data, such as video, encompassing a stationary object. A dense point cloud was first generated to estimate the geometric relationship between individual images using a structure-from-motion (SfM) technique, followed by user-designated region-of-interests, ROIs, that are automatically extracted from the original images. An orthophoto mosaic of the façade plane of the building shown in the point cloud was created to ease the user’s selection of an intended labeling region of the object, which is a one-time process. We verified this method by using the ROIs extracted from a previously obtained dataset to train and test a convolutional neural network which is modeled to detect damage locations. The method put forward in this work allows a relatively small amount of labeling to generate a large amount of training data. We successfully demonstrate the capabilities of the technique with the dataset previously collected by a drone from an abandoned building in which many of the glass windows have been damaged.</div></div>","PeriodicalId":36903,"journal":{"name":"Visual Informatics","volume":"9 2","pages":"Article 100239"},"PeriodicalIF":3.8000,"publicationDate":"2025-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Visual Informatics","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2468502X25000221","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Abstract
Deep learning models require large amounts of labeled training data to accurately detect a desired object. However, image labeling is still performed manually because the original images contain irrelevant scenes, especially when the data are collected in a dynamic environment such as drone imagery. In this work, we developed a photogrammetry-based method that automatically extracts a training dataset. The approach works with continuous, arbitrarily collected visual data, such as video, captured around a stationary object. A dense point cloud is first generated with a structure-from-motion (SfM) technique to estimate the geometric relationships among the individual images; user-designated regions of interest (ROIs) are then automatically extracted from the original images. An orthophoto mosaic of the building's façade plane shown in the point cloud is created to simplify the user's selection of the intended labeling region of the object, which is a one-time process. We verified the method by using ROIs extracted from a previously obtained dataset to train and test a convolutional neural network designed to detect damage locations. The proposed method allows a relatively small amount of labeling effort to generate a large amount of training data. We demonstrate the capabilities of the technique on a dataset previously collected by a drone at an abandoned building in which many of the glass windows have been damaged.
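To make the extraction step concrete, the sketch below illustrates the back-projection idea the abstract describes: once SfM has recovered per-image camera poses, a 3D ROI drawn once on the façade plane can be projected into every original image and cropped as a training patch. This is a minimal sketch under a simple pinhole-camera assumption, not the authors' implementation; the `Camera`, `project_points`, and `extract_roi` names are hypothetical.

```python
# Minimal sketch (assumed pinhole model, not the paper's code): project a
# user-selected 3D facade ROI into each image using the SfM camera pose,
# then crop the bounding box as an automatically labeled training patch.
import numpy as np

class Camera:
    """Per-image pose and intrinsics recovered by SfM (hypothetical container)."""
    def __init__(self, K, R, t, width, height):
        self.K = K            # 3x3 intrinsic matrix
        self.R = R            # 3x3 world-to-camera rotation
        self.t = t            # 3-vector translation, x_cam = R @ X_world + t
        self.width = width
        self.height = height

def project_points(cam, pts_world):
    """Project Nx3 world points into Nx2 pixel coordinates (points assumed in front of the camera)."""
    pts_cam = (cam.R @ pts_world.T + cam.t.reshape(3, 1)).T
    pts_img = (cam.K @ pts_cam.T).T
    return pts_img[:, :2] / pts_img[:, 2:3]

def extract_roi(cam, image, roi_corners_world, margin=0):
    """Crop the axis-aligned bounding box of the projected ROI corners.

    Returns None when the ROI falls outside the image, so frames that never
    observed the labeled facade region are skipped automatically.
    """
    px = project_points(cam, roi_corners_world)
    x0, y0 = np.floor(px.min(axis=0)).astype(int) - margin
    x1, y1 = np.ceil(px.max(axis=0)).astype(int) + margin
    x0, y0 = max(x0, 0), max(y0, 0)
    x1, y1 = min(x1, cam.width), min(y1, cam.height)
    if x1 <= x0 or y1 <= y0:
        return None
    return image[y0:y1, x0:x1]
```

In this reading of the pipeline, the one-time manual step is choosing `roi_corners_world` on the orthophoto of the façade; every subsequent crop produced by `extract_roi` inherits that label, which is why a small amount of labeling yields a large training set.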