Federated learning applications in soil spectroscopy
Authors: Giannis Gallios, Nikolaos Tsakiridis, Nikolaos Tziolas
Journal: Geoderma, volume 456, Article 117259 (published 2025-03-25)
DOI: 10.1016/j.geoderma.2025.117259
URL: https://www.sciencedirect.com/science/article/pii/S0016706125000977
Journal metrics: impact factor 5.6; JCR Q1 (Soil Science); category: Agricultural and Forestry Sciences (region 1)
Soil spectroscopy has emerged as a key technique for rapid, non-destructive prediction of soil properties, yet training machine learning models centrally raises concerns about data privacy, accessibility, and transferability. This study proposes Federated Learning (FL) as a decentralized approach to soil spectroscopy that lets multiple data contributors (or clients) collaborate without exchanging raw data. Convolutional neural networks were used to estimate key soil attributes such as soil organic carbon, texture, pH, cation exchange capacity, and total nitrogen from VNIR or MIR spectral data from the Kellogg Soil Survey Laboratory. Three data partitioning approaches were explored, namely geopolitical, bioclimatic, and independent and identically distributed (IID) splitting, to simulate real-world scenarios of particular interest to the soil community. Each scenario was investigated under two aggregation strategies, Federated Averaging (FedAvg) and Weighted Averaging (WgtAvg), which build a consensus model by aggregating the model weights of the different contributors. The results show that the FL framework can match, and in some cases exceed, the performance of centralized models, particularly with IID data. The WgtAvg strategy was particularly effective, improving prediction accuracy by as much as 50 % for soil properties where contributors had unequal data sizes. This study highlights the potential of FL as a privacy-preserving framework across networks of soil labs, facilitating enhanced soil health monitoring with decentralized global models.
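The abstract does not give formulas for the two aggregation strategies, so the following is only a minimal sketch of one plausible reading: a plain arithmetic mean of client weights contrasted with a mean weighted by each client's sample count. The function names, the `Weights` layout, and the toy numbers are all hypothetical and are not taken from the paper. Note also that in the wider FL literature the name FedAvg often already denotes a sample-size-weighted mean, so the mapping of these two functions onto the paper's FedAvg and WgtAvg is an assumption.

```python
from typing import Dict, List

# Hypothetical layout: layer name -> flattened list of parameters.
Weights = Dict[str, List[float]]


def aggregate(client_weights: List[Weights], coefficients: List[float]) -> Weights:
    """Return the coefficient-weighted mean of the clients' parameters."""
    if not client_weights or len(client_weights) != len(coefficients):
        raise ValueError("need exactly one coefficient per client")
    total = sum(coefficients)
    if total <= 0:
        raise ValueError("coefficients must sum to a positive value")
    merged: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        merged[name] = [
            sum(c * w[name][i] for c, w in zip(coefficients, client_weights)) / total
            for i in range(length)
        ]
    return merged


def unweighted_average(client_weights: List[Weights]) -> Weights:
    """Every client contributes equally, regardless of dataset size."""
    return aggregate(client_weights, [1.0] * len(client_weights))


def size_weighted_average(client_weights: List[Weights],
                          sample_counts: List[int]) -> Weights:
    """Clients contribute in proportion to their number of training samples."""
    return aggregate(client_weights, [float(n) for n in sample_counts])


# Toy example: two clients with unequal data sizes (made-up values).
clients = [{"conv": [1.0, 2.0]}, {"conv": [3.0, 6.0]}]
counts = [100, 300]
equal = unweighted_average(clients)             # {"conv": [2.0, 4.0]}
weighted = size_weighted_average(clients, counts)  # {"conv": [2.5, 5.0]}
```

With unequal sample counts the size-weighted result is pulled toward the larger client, which is one intuition for why a weighted scheme could help when contributors hold very different amounts of data.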
About the journal:
Geoderma - the global journal of soil science - welcomes authors, readers and soil research from all parts of the world, encourages worldwide soil studies, and embraces all aspects of soil science and its associated pedagogy. The journal particularly welcomes interdisciplinary work focusing on dynamic soil processes and functions across space and time.