Assessing soil texture classification accuracy based on VNIR lab spectroscopy
Authors: Ternikar Chirag Rajendra, Cécile Gomez, Subramanian Dharumarajan, D. Nagesh Kumar
Journal: Chemometrics and Intelligent Laboratory Systems, Volume 263, Article 105419 (Impact factor: 3.7; JCR: Q2, Automation & Control Systems)
Published: 2025-05-03
DOI: 10.1016/j.chemolab.2025.105419
URL: https://www.sciencedirect.com/science/article/pii/S0169743925001042
Soil texture is a key soil parameter that controls a range of physical, chemical and biological soil properties. Visible Near-Infrared (VNIR) spectroscopy has attracted attention because it is simple, rapid, non-destructive and free of hazards. Because the technique is so widely used, many studies apply it without enforcing the unity constraint on the predicted fractions, that is, the requirement that the clay, silt and sand fractions of a sample sum to 100 %. This study assessed the accuracy of soil texture classification in the USDA textural triangle from laboratory VNIR spectra. Five approaches were evaluated: i) four regression-assisted classification approaches (A1-A4), which used Partial Least Squares Regression (PLSR) to predict the quantitative fractions and then assigned a texture class from the USDA texture triangle, and ii) one direct classification approach (A5), which used the Partial Least Squares Discriminant Analysis (PLS-DA) classifier to assign the texture class directly from the spectra. The regression-assisted approaches differed in how they predicted the fractions and how they enforced the unity constraint. In approach A1, the clay, silt and sand fractions predicted by PLSR for each sample were normalized so that they summed to unity. In approach A2, the silt content was derived as the residual of the clay and sand contents predicted by PLSR for each sample, which ensured unity. In approach A3, the clay, silt and sand fractions were predicted simultaneously with a multi-output variant of PLSR. Approach A4 applied PLSR to log-ratio transformed (LRT) fractions, which allowed simultaneous prediction and inherently ensured sum-to-unity; through the LRT, it used information about the relative rather than the absolute fractions of soil texture.
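The three ways of enforcing the unity constraint described above (normalization in A1, silt as a residual in A2, a log-ratio transform in A4) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the input values are made up, and the abstract does not say which log-ratio transform was used, so the additive log-ratio with sand as the reference component is an assumption.

```python
import math

def normalize(clay, silt, sand):
    """A1-style: rescale three separately predicted fractions to sum to 100 %."""
    total = clay + silt + sand
    return tuple(100.0 * v / total for v in (clay, silt, sand))

def silt_as_residual(clay, sand):
    """A2-style: predict clay and sand only; silt is whatever remains of 100 %."""
    return clay, 100.0 - clay - sand, sand

def alr(clay, silt, sand):
    """Additive log-ratio transform (ASSUMED variant, sand as reference).

    All fractions must be strictly positive. A regression model would be
    trained on these two unconstrained values instead of the raw fractions.
    """
    return math.log(clay / sand), math.log(silt / sand)

def alr_inverse(y_clay, y_silt):
    """Map two log-ratio values back to fractions that sum to 100 % by construction."""
    e_clay, e_silt = math.exp(y_clay), math.exp(y_silt)
    denom = 1.0 + e_clay + e_silt
    return 100.0 * e_clay / denom, 100.0 * e_silt / denom, 100.0 / denom

# Made-up raw predictions (in %) that violate the unity constraint: they sum to 103.
raw_clay, raw_silt, raw_sand = 32.0, 21.0, 50.0

print(normalize(raw_clay, raw_silt, raw_sand))   # rescaled so the sum is 100
print(silt_as_residual(raw_clay, raw_sand))      # silt forced to 18.0
print(alr_inverse(*alr(30.0, 20.0, 50.0)))       # round trip recovers the input
```

Note that the residual route can return a negative silt value whenever the predicted clay and sand together exceed 100 %, whereas the inverse log-ratio transform always yields positive fractions.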
For the regression-based fraction predictions, approaches A1-A4 performed similarly, with mean coefficients of determination (R²) of 0.88–0.90 for clay (RMSE: 4.2–4.4 %) and 0.82–0.84 for sand (RMSE: 6.1–6.5 %), but only 0.29–0.38 for silt (RMSE: 3.8–4.1 %). Approach A2, which infers silt as a residual, yielded poorer silt predictions than the other approaches. Despite these quantitative differences, the resulting classification accuracies in the USDA texture triangle were high for A1-A4, with an overall accuracy (OA) of 71–71.8 %, an average accuracy (AA) of 62.4–65.3 % and a Cohen's Kappa (K) of 0.61–0.62, whereas A5 attained only 56.4 % overall accuracy and a Cohen's Kappa of 0.42. Among the regression-assisted methods, approach A4, which applies log-ratio transformations to clay, silt and sand, simultaneously enforced the compositional constraints and matched the best classification performance (OA = 71.4 %, AA = 65.3 %, K = 0.62) while requiring fewer models. This work showed that i) the four regression-assisted classification approaches classified soil texture with comparable and satisfactory accuracy, ii) the direct classification approach performed modestly, iii) the regression-assisted approaches outperformed the direct approach, and iv) in all approaches, misclassified samples were typically assigned to neighbouring textural classes. By evaluating the performance and suitability of these approaches, the study supports the development of accurate and effective methods for classifying soil texture. Among them, approach A4, which combines PLSR with log-ratio transformation, shows promise and warrants further evaluation on broader datasets and for potential application on airborne or spaceborne platforms.
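The three classification scores named above (overall accuracy, average accuracy and Cohen's Kappa) can all be computed from a confusion matrix. The sketch below is illustrative only: the function name is hypothetical, the 3-class matrix is made up rather than taken from the study, and average accuracy is assumed to mean the mean of the per-class accuracies (per-class recall), which the abstract does not define.

```python
def classification_scores(confusion):
    """Return (overall accuracy, average accuracy, Cohen's Kappa).

    confusion[i][j] is the number of samples whose reference class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))

    # Overall accuracy: share of all samples on the diagonal.
    oa = correct / total

    # Average accuracy (ASSUMED definition): mean per-class recall,
    # skipping classes that have no reference samples.
    row_totals = [sum(row) for row in confusion]
    recalls = [confusion[i][i] / row_totals[i] for i in range(n) if row_totals[i] > 0]
    aa = sum(recalls) / len(recalls)

    # Cohen's Kappa: agreement corrected for the agreement expected by chance.
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (oa - expected) / (1.0 - expected)
    return oa, aa, kappa

# Made-up confusion matrix for three texture classes (10 reference samples each).
toy = [
    [8, 2, 0],
    [1, 6, 3],
    [0, 1, 9],
]
oa, aa, kappa = classification_scores(toy)
print(f"OA = {oa:.3f}, AA = {aa:.3f}, Kappa = {kappa:.3f}")
```

Overall and average accuracy coincide in this toy case because every class has the same number of reference samples; with imbalanced classes, as is common for texture classes, the two can diverge, which is one reason to report both.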
About the journal:
Chemometrics and Intelligent Laboratory Systems publishes original research papers, short communications, reviews, tutorials and Original Software Publications reporting on development of novel statistical, mathematical, or computer techniques in Chemistry and related disciplines.
Chemometrics is the chemical discipline that uses mathematical and statistical methods to design or select optimal procedures and experiments, and to provide maximum chemical information by analysing chemical data.
The journal deals with the following topics:
1) Development of new statistical, mathematical and chemometrical methods for Chemistry and related fields (Environmental Chemistry, Biochemistry, Toxicology, Systems Biology, -Omics, etc.)
2) Novel applications of chemometrics to all branches of Chemistry and related fields (typical domains of interest are: process data analysis, experimental design, data mining, signal processing, supervised modelling, decision making, robust statistics, mixture analysis, multivariate calibration etc.) Routine applications of established chemometrical techniques will not be considered.
3) Development of new software that provides novel tools or truly advances the use of chemometrical methods.
4) Well characterized data sets to test performance for the new methods and software.
The journal complies with the International Committee of Medical Journal Editors' Uniform Requirements for Manuscripts.