Exploring the effects of training samples on the accuracy of crop mapping with machine learning algorithm

IF 5.7 Q1 ENVIRONMENTAL SCIENCES
Yangyang Fu, Ruoque Shen, Chaoqing Song, Jie Dong, Wei Han, Tao Ye, Wenping Yuan
Science of Remote Sensing, volume 7, Article 100081. Published 2023-06-01. DOI: 10.1016/j.srs.2023.100081
https://www.sciencedirect.com/science/article/pii/S2666017223000068
Citations: 1

Abstract

Machine learning algorithms are widely used for crop classification and have been applied to map the distribution of various crops at regional and national scales. Previous studies have stressed that the number of training samples strongly influences the classification accuracy of machine learning algorithms, which has prompted extensive sample collection efforts. Taking winter wheat as an example, this study challenges that principle by selecting training samples with the time-weighted dynamic time warping (TWDTW) method, and finds that classification accuracy depends heavily on the representativeness and class proportion of the training samples rather than on their quantity. As the training samples become more representative, i.e. as they reflect the characteristics of winter wheat more comprehensively, classification accuracy improves steadily. The best accuracy is achieved when winter wheat and non-winter wheat training samples are selected according to the ratio of their statistical areas. In contrast, using 1,000 versus 10,000 training samples produced only slight differences in overall accuracy (91.26% and 90.74%), producer's accuracy (86.33% and 86.65%) and user's accuracy (97.37% and 96.01%). Overall, the study demonstrates that the characteristics of training samples have a strong impact on the classification accuracy of machine learning algorithms, and that training samples generated by the TWDTW method are reliable for crop mapping.
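
As a rough illustration of two ideas in the abstract, the sketch below splits a fixed sample budget between winter wheat and non-winter wheat in proportion to their areas, and computes overall, producer's and user's accuracy from a binary confusion matrix. The function names, the area values and the confusion counts are all hypothetical and are not taken from the paper; the authors' actual procedure may differ.

```python
def allocate_samples(total, wheat_area, other_area):
    """Split `total` training samples in proportion to the two class areas."""
    if total <= 0 or wheat_area <= 0 or other_area <= 0:
        raise ValueError("total and both areas must be positive")
    n_wheat = round(total * wheat_area / (wheat_area + other_area))
    return n_wheat, total - n_wheat


def accuracy_metrics(tp, fp, fn, tn):
    """Overall, producer's and user's accuracy for the winter wheat class.

    tp: wheat mapped as wheat      fp: non-wheat mapped as wheat
    fn: wheat mapped as non-wheat  tn: non-wheat mapped as non-wheat
    """
    overall = (tp + tn) / (tp + fp + fn + tn)
    producers = tp / (tp + fn)  # share of true wheat that was mapped as wheat
    users = tp / (tp + fp)      # share of mapped wheat that is truly wheat
    return overall, producers, users


# Hypothetical inputs for illustration only (not values from the study).
n_wheat, n_other = allocate_samples(1000, wheat_area=30.0, other_area=70.0)
oa, pa, ua = accuracy_metrics(tp=80, fp=5, fn=20, tn=95)
print(n_wheat, n_other)
print(f"OA={oa:.2%} PA={pa:.2%} UA={ua:.2%}")
```

Producer's accuracy is the complement of omission error and user's accuracy is the complement of commission error, which is why the two can move in opposite directions when the class proportions of the training set change.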
