Exploring the effects of training samples on the accuracy of crop mapping with machine learning algorithm

Impact factor: 5.7; JCR quartile: Q1 (Environmental Sciences)

Science of Remote Sensing Pub Date : 2023-06-01 DOI:10.1016/j.srs.2023.100081

Yangyang Fu, Ruoque Shen, Chaoqing Song, Jie Dong, Wei Han, Tao Ye, Wenping Yuan


Citations: 1

Abstract



Machine learning algorithms are widely used for crop classification and have been applied to map the distribution of various crops at regional and national scales. Previous studies have emphasized that the number of training samples strongly influences the classification accuracy of machine learning algorithms, which has led to extensive sample collection efforts. Taking winter wheat as an example, this study challenges that principle by selecting training samples with the time-weighted dynamic time warping (TWDTW) method, and finds that classification accuracy depends mainly on the representativeness and class proportion of the training samples rather than on their quantity. As the training samples become more representative, i.e., as they reflect the characteristics of winter wheat more comprehensively, classification accuracy improves steadily. The best accuracy is achieved when winter wheat and non-winter wheat training samples are selected in proportion to their statistical areas. In contrast, using 1,000 versus 10,000 training samples produced only slight differences in overall accuracy (91.26% and 90.74%), producer’s accuracy (86.33% and 86.65%), and user’s accuracy (97.37% and 96.01%). Overall, the study demonstrates that the characteristics of training samples strongly affect the classification accuracy of machine learning algorithms, and that training samples generated by the TWDTW method are reliable for crop mapping.
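The abstract names TWDTW as the way candidate training samples are screened but does not spell out the computation. The following is a minimal sketch of the general idea, assuming the commonly used logistic time weight added to the local cost of standard dynamic time warping. The function name `twdtw_distance`, the parameters `alpha` and `beta`, their default values, and the toy vegetation-index series are all illustrative assumptions and are not taken from the study, which may use a different configuration (for example, open-ended matching or other weights).

```python
import numpy as np


def twdtw_distance(ref, ref_doy, cand, cand_doy, alpha=0.1, beta=50.0):
    """Time-weighted DTW distance between two 1-D series (full alignment).

    alpha and beta shape a logistic penalty on the day-of-year lag;
    the defaults here are illustrative assumptions, not study values.
    """
    ref = np.asarray(ref, dtype=float)
    cand = np.asarray(cand, dtype=float)
    ref_doy = np.asarray(ref_doy, dtype=float)
    cand_doy = np.asarray(cand_doy, dtype=float)

    # Local cost = value difference + logistic penalty on the time lag,
    # so matching observations that are far apart in time is discouraged.
    value_cost = np.abs(ref[:, None] - cand[None, :])
    lag = np.abs(ref_doy[:, None] - cand_doy[None, :])
    time_cost = 1.0 / (1.0 + np.exp(-alpha * (lag - beta)))
    cost = value_cost + time_cost

    # Standard DTW accumulation over the time-weighted cost matrix.
    n, m = cost.shape
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1]
            )
    return float(acc[n, m])


# Toy data (made up for illustration): observation dates and index values.
doy = np.array([10, 40, 70, 100, 130, 160, 190])
reference = np.array([0.30, 0.35, 0.50, 0.75, 0.80, 0.40, 0.20])  # early-peaking curve
same_crop = reference + 0.02                                       # near-identical pixel
other_crop = np.array([0.20, 0.20, 0.25, 0.30, 0.50, 0.75, 0.80])  # late-peaking curve

d_same = twdtw_distance(reference, doy, same_crop, doy)
d_other = twdtw_distance(reference, doy, other_crop, doy)
print(f"distance to similar pixel:    {d_same:.3f}")
print(f"distance to dissimilar pixel: {d_other:.3f}")
```

In a sample-selection setting, pixels with a small distance to a reference curve would be candidate samples for the target crop and pixels with a large distance would be candidates for the other class. How thresholds are set and how the two classes are balanced by area ratio is described in the paper itself, not in this sketch.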


Source journal

Science of Remote Sensing

CiteScore

12.20

Self-citation rate

0.00%

Number of articles published