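Dataset of near-infrared (NIR) spectral data for prediction of organic matter and total carbon in agricultural soil using homemade NIR spectrometer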

Impact factor 1.0 · JCR Q3 · Multidisciplinary Sciences
Natchanon Santasup, Parichat Theanjumpol, Choochad Santasup, Sila Kittiwachana, Nipon Mawan, Nuttapon Khongdee
Data in Brief, Volume 61, Article 111840. Published 2025-06-25.
DOI: 10.1016/j.dib.2025.111840
https://www.sciencedirect.com/science/article/pii/S2352340925005670
Citations: 0

Abstract

This paper presents spectroscopic data obtained with a homemade NIR spectrometer developed for agricultural quality analysis, along with the calibration and validation of a model database for predicting agricultural soil properties. We collected NIR spectra from 190 soil samples taken at a depth of 0-20 cm from agricultural areas in northern Thailand, including vegetable farms, orchards, and field crops. Before acquisition, the soil was air-dried and sieved through 2.0 mm and 0.5 mm mesh. Six preprocessing techniques (Savitzky-Golay smoothing, multiplicative scatter correction (MSC), standard normal variate (SNV), first derivative, second derivative, and mean centering) were combined with partial least squares (PLS) regression to build prediction models for soil organic matter (SOM) and total carbon. Seventy percent of the samples were assigned to the calibration set and the remaining thirty percent to the validation set. The most suitable model for both SOM and total carbon was PLS regression on Savitzky-Golay-smoothed spectra, which gave a coefficient of determination (R²) of 0.79 and 0.78 and a root mean square error (RMSE) of 0.701% and 0.382%, respectively, for the validation samples. Thus, the 900-1,700 nm range covered by the dataset proved to be an ideal wavelength range for developing a portable/handheld NIR spectrometer, with potential for further accuracy improvements through model refinement.
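The preprocessing, data split, and validation metrics named in the abstract can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the 70/30 split was drawn (a seeded random shuffle is assumed here), nor which standard-deviation convention was used for SNV (the sample standard deviation is assumed), and the function names `snv`, `split_70_30`, `rmse`, and `r_squared` are hypothetical. The PLS regression step itself is omitted; in practice it would come from a chemometrics or machine-learning library.

```python
import math
import random


def snv(spectrum):
    """Standard normal variate: centre one spectrum and scale it to unit standard deviation."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    # Assumption: sample standard deviation (n - 1 in the denominator).
    std = math.sqrt(sum((x - mean) ** 2 for x in spectrum) / (n - 1))
    return [(x - mean) / std for x in spectrum]


def split_70_30(n_samples, seed=0):
    """Shuffle sample indices and return (calibration, validation) index lists.

    Assumption: a random split; the abstract only gives the 70/30 proportions.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_cal = round(0.7 * n_samples)
    return indices[:n_cal], indices[n_cal:]


def rmse(observed, predicted):
    """Root mean square error between reference values and model predictions."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# With the 190 samples described in the abstract, a 70/30 split gives 133 / 57.
cal_idx, val_idx = split_70_30(190)
print(len(cal_idx), len(val_idx))  # prints: 133 57
```

In a full workflow, each spectrum would be preprocessed (for example with `snv`), a PLS model would be fitted on the calibration indices, and `rmse` and `r_squared` would be computed on the validation indices only, which is how the validation figures quoted above are defined.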
Source journal
Data in Brief (Multidisciplinary Sciences)
CiteScore: 3.10
Self-citation rate: 0.00%
Articles published: 996
Review time: 70 days
About the journal: Data in Brief provides a way for researchers to easily share and reuse each other's datasets by publishing data articles that:
- Thoroughly describe your data, facilitating reproducibility.
- Make your data, which is often buried in supplementary material, easier to find.
- Increase traffic towards associated research articles and data, leading to more citations.
- Open up doors for new collaborations.
Because you never know what data will be useful to someone else, Data in Brief welcomes submissions that describe data from all research areas.