Ibnu F. Kurniawan, Fei He, Iswan Dunggio, Marini S. Hamidun, Zulham Sirajuddin, Muhammad Aziz, A. Taufiq Asyhari
Title: Imbalanced learning of remotely sensed data for bioenergy source identification in a forest in the Wallacea region of Indonesia
Journal: Remote Sensing Letters (impact factor 1.4; JCR Q3, Imaging Science & Photographic Technology)
Publication date: 2023-10-23
Publication type: Journal Article
DOI: https://doi.org/10.1080/2150704x.2023.2270107
Citation count: 0
Abstract
Remote sensing technologies have become increasingly crucial in supporting policy-makers as they pursue their ecological strategies. The data they provide can be used to estimate the production rate of bioenergy sources and to monitor deforestation. This work contributes an aerial dataset and develops an intelligent tree-detection system for counting trees with bioenergy potential. Low-altitude flying units have been widely used for this purpose because they can capture high-quality data from distant locations. Despite this potential, the collected images that compose a dataset are often characterized by an imbalanced distribution among classes. This class disproportion can degrade overall model performance, because under-represented classes are severely deprived of key features. This study proposes data-level approaches that adopt and extend prior sampling algorithms for object detection problems. The devised techniques aim to reduce the number of redundant outputs produced by sampling methods and, by following a systematic flow, the number of iterations required to reach the target imbalance ratio. In this process, the class distribution of the original dataset guides the selection of candidates for subsequent steps. Our results show that the modified dataset shortens training: fewer iterations are required to reach the final metrics obtained with the original dataset, and training losses are lower in each iteration. Additionally, the modified dataset improves the F-score (F1) and precision of the object detection algorithm by up to 6%.

KEYWORDS: Aerial surveillance; Urban forestry; Remote monitoring; Class imbalanced; Object detection; Machine learning

Disclosure statement
No potential conflict of interest was reported by the author(s).

Additional information
Funding
This work was supported in part by the British Council COP26 Trilateral Research Initiative grant under the project "Scaling-up Indonesian Bioenergy Potential through Assessment of Wallacea's Plant Species: Data-Driven Energy Harvesting and Community-Centred Approach". Ibnu F. Kurniawan acknowledged the support from the Directorate General of Higher Education, Research, and Technology, Indonesia.
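The abstract describes its data-level approach only in outline, so the following is a minimal sketch of the general idea and not the authors' algorithm. It assumes a dataset represented as a mapping from image name to the list of class labels annotated in that image, defines the imbalance ratio as the count of the most frequent class divided by the count of the least frequent class, and greedily duplicates whichever image containing the rarest class lowers that ratio the most. The function names, the greedy selection rule, the `max_copies` cap, and the toy labels "majority" and "minority" are all assumptions made for illustration.

```python
from collections import Counter


def class_counts(dataset):
    """Count annotated objects per class across all images."""
    counts = Counter()
    for labels in dataset.values():
        counts.update(labels)
    return counts


def imbalance_ratio(counts):
    """Most frequent class count divided by least frequent (1.0 = balanced)."""
    return max(counts.values()) / min(counts.values())


def oversample(dataset, target_ratio, max_copies=1000):
    """Greedily pick images to duplicate until the target ratio is reached.

    Assumption: every class appears at least once in the dataset.
    Returns the list of images to duplicate and the resulting class counts.
    """
    counts = class_counts(dataset)
    selected = []

    def ratio_after(image):
        # Imbalance ratio if one more copy of `image` were added.
        return imbalance_ratio(counts + Counter(dataset[image]))

    while imbalance_ratio(counts) > target_ratio and len(selected) < max_copies:
        rarest = min(counts, key=counts.get)
        # The current class distribution guides candidate selection:
        # only images that contain the rarest class are considered.
        candidates = [img for img, labels in dataset.items() if rarest in labels]
        best = min(candidates, key=ratio_after)
        if ratio_after(best) >= imbalance_ratio(counts):
            break  # no candidate improves the balance any further
        counts.update(dataset[best])
        selected.append(best)
    return selected, counts


# Toy example with hypothetical image names and class labels.
toy = {
    "a.jpg": ["majority"] * 8,
    "b.jpg": ["majority", "minority"],
    "c.jpg": ["minority"],
}
picked, new_counts = oversample(toy, target_ratio=2.0)
print(picked, dict(new_counts), imbalance_ratio(new_counts))
```

In object detection an image usually carries several classes at once, so duplicating an image for the sake of a rare class also adds majority-class objects. Scoring each candidate by the ratio it would produce is one simple way to avoid redundant copies; the paper's actual selection procedure may differ.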
About the journal:
Remote Sensing Letters is a peer-reviewed international journal committed to the rapid publication of articles advancing the science and technology of remote sensing as well as its applications. The journal originates from a successful section of the same name contained in the International Journal of Remote Sensing from 1983 to 2009. Articles may address any aspect of remote sensing of relevance to the journal's readership, including, but not limited to, developments in sensor technology, advances in image processing and Earth-orientated applications, whether terrestrial, oceanic or atmospheric. Articles should make a positive impact on the subject by either contributing new and original information or through provision of theoretical, methodological or commentary material that acts to strengthen the subject.