Dataset of near-infrared (NIR) spectral data for prediction of organic matter and total carbon in agricultural soil using homemade NIR spectrometer
Authors: Natchanon Santasup, Parichat Theanjumpol, Choochad Santasup, Sila Kittiwachana, Nipon Mawan, Nuttapon Khongdee
Data in Brief, volume 61, Article 111840. Published 2025-06-25. DOI: 10.1016/j.dib.2025.111840
https://www.sciencedirect.com/science/article/pii/S2352340925005670
This paper presents spectroscopic data obtained from a homemade NIR spectrometer developed for agricultural quality analysis, along with the calibration and validation of a model database for predicting agricultural soil properties. We collected NIR spectral data from 190 soil samples taken at a depth of 0-20 cm from agricultural areas in northern Thailand, including vegetable farms, orchards, and field crops. Sample preparation began with air-drying the soil and sieving it through 2.0 mm and 0.5 mm mesh. Six preprocessing techniques (Savitzky-Golay smoothing, multiplicative scatter correction (MSC), standard normal variate (SNV), first derivative, second derivative, and mean centering) were used with partial least squares (PLS) regression to build prediction models for soil organic matter (SOM) and total carbon. Seventy percent of the samples were assigned to the calibration set and the remaining thirty percent to the validation set. The most suitable model for both SOM and total carbon was Savitzky-Golay smoothing combined with PLS regression (PLSR), which gave coefficients of determination (R²) of 0.79 and 0.78 and root mean square errors (RMSE) of 0.701% and 0.382%, respectively, for the validation samples. The NIR dataset spanning 900-1,700 nm thus indicates that this wavelength range is well suited to developing a portable/handheld NIR spectrometer, with potential for further accuracy improvements through model refinement.
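To make a few of the steps named in the abstract concrete, the following is a minimal sketch, not the authors' code. It shows one of the listed preprocessing techniques (SNV), a 70/30 calibration/validation split, and the two reported validation metrics (R² and RMSE). The function names, the random shuffling with a fixed seed, and the use of the sample standard deviation in SNV are all assumptions made for illustration; the abstract does not say how the split was drawn or which SNV variant was used.

```python
import math
import random
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre one spectrum on its own mean and
    scale it by its own standard deviation (sample form, an assumption)."""
    mu = mean(spectrum)
    sd = stdev(spectrum)
    if sd == 0:
        raise ValueError("flat spectrum: SNV is undefined")
    return [(a - mu) / sd for a in spectrum]


def split_70_30(n_samples, seed=0):
    """Return (calibration, validation) index lists for a 70/30 split.
    Random shuffling is an assumption; the source only gives the ratio."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_cal = round(0.7 * n_samples)
    return idx[:n_cal], idx[n_cal:]


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    sq_err = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(sq_err) / len(sq_err))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

With the 190 samples described above, `split_70_30(190)` yields 133 calibration and 57 validation indices. A PLS regression fitted on the preprocessed calibration spectra would then be scored on the validation set with `r2` and `rmse`.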
About the journal:
Data in Brief provides a way for researchers to easily share and reuse each other's datasets by publishing data articles that:
- Thoroughly describe your data, facilitating reproducibility.
- Make your data, which is often buried in supplementary material, easier to find.
- Increase traffic towards associated research articles and data, leading to more citations.
- Open up doors for new collaborations.
Because you never know what data will be useful to someone else, Data in Brief welcomes submissions that describe data from all research areas.