{"title":"Federated Feature Augmentation and Alignment","authors":"Tianfei Zhou;Ye Yuan;Binglu Wang;Ender Konukoglu","doi":"10.1109/TPAMI.2024.3457751","DOIUrl":null,"url":null,"abstract":"Federated learning is a distributed paradigm that allows multiple parties to collaboratively train deep learning models without direct exchange of raw data. Nevertheless, the inherent non-independent and identically distributed (non-i.i.d.) nature of data distribution among clients results in significant degradation of the acquired model. The primary goal of this study is to develop a robust federated learning algorithm to address \n<i>feature shift</i>\n in clients’ samples, potentially arising from a range of factors such as acquisition discrepancies in medical imaging. To reach this goal, we first propose federated feature augmentation (\n<small>FedFA</small>\n<inline-formula><tex-math>$^{l}$</tex-math></inline-formula>\n), a novel feature augmentation technique tailored for federated learning. \n<small>FedFA</small>\n<inline-formula><tex-math>$^{l}$</tex-math></inline-formula>\n is based on a crucial insight that each client's data distribution can be characterized by first-/second-order statistics (\n<i>a.k.a.</i>\n, mean and standard deviation) of latent features; and it is feasible to manipulate these local statistics \n<i>globally</i>\n, i.e., based on information in the entire federation, to let clients have a better sense of the global distribution across clients. Grounded on this insight, we propose to augment each local feature statistic based on a normal distribution, wherein the mean corresponds to the original statistic, and the variance defines the augmentation scope. 
Central to \n<small>FedFA</small>\n<inline-formula><tex-math>$^{l}$</tex-math></inline-formula>\n is the determination of a meaningful Gaussian variance, which is accomplished by taking into account not only biased data of each individual client, but also underlying feature statistics represented by all participating clients. Beyond consideration of \n<i>low-order</i>\n statistics in \n<small>FedFA</small>\n<inline-formula><tex-math>$^{l}$</tex-math></inline-formula>\n, we propose a federated feature alignment component (\n<small>FedFA</small>\n<inline-formula><tex-math>$^{h}$</tex-math></inline-formula>\n) that exploits \n<i>higher-order</i>\n feature statistics to gain a more detailed understanding of local feature distribution and enables explicit alignment of augmented features in different clients to promote more consistent feature learning. Combining \n<small>FedFA</small>\n<inline-formula><tex-math>$^{l}$</tex-math></inline-formula>\n and \n<small>FedFA</small>\n<inline-formula><tex-math>$^{h}$</tex-math></inline-formula>\n yields our full approach \n<small><b>FedFA<inline-formula><tex-math>$+$</tex-math><alternatives><mml:math><mml:mo>+</mml:mo></mml:math><inline-graphic></alternatives></inline-formula></b></small>\n. \n<small>FedFA<inline-formula><tex-math>$+$</tex-math><alternatives><mml:math><mml:mo>+</mml:mo></mml:math><inline-graphic></alternatives></inline-formula></small>\n is non-parametric, incurs negligible additional communication costs, and can be seamlessly incorporated into popular CNN and Transformer architectures. 
We offer rigorous theoretical analysis, as well as extensive empirical justifications to demonstrate the effectiveness of the algorithm.","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"46 12","pages":"11119-11135"},"PeriodicalIF":0.0000,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on pattern analysis and machine intelligence","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10680999/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Federated learning is a distributed paradigm that allows multiple parties to collaboratively train deep learning models without directly exchanging raw data. Nevertheless, the inherently non-independent and identically distributed (non-i.i.d.) nature of the data across clients significantly degrades the acquired model. The primary goal of this study is to develop a robust federated learning algorithm that addresses feature shift in clients' samples, which can arise from a range of factors such as acquisition discrepancies in medical imaging. To reach this goal, we first propose federated feature augmentation (FedFA$^{l}$), a novel feature augmentation technique tailored for federated learning. FedFA$^{l}$ is based on the crucial insight that each client's data distribution can be characterized by the first-/second-order statistics (i.e., mean and standard deviation) of its latent features, and that it is feasible to manipulate these local statistics globally, i.e., based on information from the entire federation, to give clients a better sense of the global distribution across clients. Grounded in this insight, we propose to augment each local feature statistic by sampling from a normal distribution whose mean is the original statistic and whose variance defines the augmentation scope. Central to FedFA$^{l}$ is the determination of a meaningful Gaussian variance, which is accomplished by taking into account not only the biased data of each individual client, but also the underlying feature statistics of all participating clients. Beyond the low-order statistics considered in FedFA$^{l}$, we propose a federated feature alignment component (FedFA$^{h}$) that exploits higher-order feature statistics to gain a more detailed understanding of the local feature distributions, and enables explicit alignment of the augmented features across clients to promote more consistent feature learning. Combining FedFA$^{l}$ and FedFA$^{h}$ yields our full approach, FedFA+.
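As a rough illustration of the statistic-level augmentation at the core of FedFA$^{l}$, the following minimal NumPy sketch perturbs per-channel feature means and standard deviations with Gaussian noise and re-normalizes the features to carry the sampled statistics. The equal blending weight and the `global_var_mu` / `global_var_sigma` inputs are illustrative placeholders, not the paper's exact formulation:

```python
import numpy as np

def augment_statistics(x, global_var_mu, global_var_sigma, p=0.5, eps=1e-6, rng=None):
    """Perturb per-channel feature statistics with Gaussian noise.

    x: latent features of shape (N, C, H, W) from one client's batch.
    global_var_mu / global_var_sigma: per-channel variances of the mean/std
    statistics aggregated across the federation (hypothetical inputs here).
    """
    rng = np.random.default_rng() if rng is None else rng
    if rng.random() > p:  # apply the augmentation stochastically
        return x
    mu = x.mean(axis=(2, 3), keepdims=True)          # (N, C, 1, 1)
    sigma = x.std(axis=(2, 3), keepdims=True) + eps  # (N, C, 1, 1)

    # Variance of the statistics within this batch (client-level estimate) ...
    var_mu_local = mu.var(axis=0, keepdims=True)
    var_sigma_local = sigma.var(axis=0, keepdims=True)
    # ... blended with the federation-level estimate (equal weighting assumed).
    std_mu = np.sqrt(0.5 * var_mu_local + 0.5 * global_var_mu.reshape(1, -1, 1, 1))
    std_sigma = np.sqrt(0.5 * var_sigma_local + 0.5 * global_var_sigma.reshape(1, -1, 1, 1))

    # Sample new statistics: the mean is the original statistic,
    # the variance defines the augmentation scope.
    mu_new = mu + std_mu * rng.standard_normal(mu.shape)
    sigma_new = sigma + std_sigma * rng.standard_normal(sigma.shape)

    # Re-normalize the features to carry the sampled statistics.
    return sigma_new * (x - mu) / sigma + mu_new
```

Because only per-channel statistics (not raw features) would need to be shared to estimate the federation-level variances, a scheme of this shape adds negligible communication overhead.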
FedFA+ is non-parametric, incurs negligible additional communication cost, and can be seamlessly incorporated into popular CNN and Transformer architectures. We offer rigorous theoretical analysis, as well as extensive empirical justification, to demonstrate the effectiveness of the algorithm.
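As a rough illustration of the higher-order alignment idea behind FedFA$^{h}$, one could compute standardized per-channel central moments and penalize their deviation from federation-level moments. The moment orders and the squared-error loss form below are illustrative assumptions, not the paper's exact objective:

```python
import numpy as np

def channel_moments(x, orders=(3, 4), eps=1e-6):
    """Standardized higher-order central moments of features, per channel.

    x: latent features of shape (N, C, H, W).
    Returns an array of shape (len(orders), C).
    """
    mu = x.mean(axis=(0, 2, 3), keepdims=True)
    sigma = x.std(axis=(0, 2, 3), keepdims=True) + eps
    z = (x - mu) / sigma  # standardize the first two moments away
    return np.stack([(z ** k).mean(axis=(0, 2, 3)) for k in orders])

def alignment_loss(local_moments, global_moments):
    """Squared-error penalty pulling a client's moments toward the federation's."""
    return float(((local_moments - global_moments) ** 2).mean())
```

In such a scheme, each client would add `alignment_loss` (weighted by some coefficient) to its task loss, encouraging the feature distributions learned on differently shifted clients to agree beyond mean and standard deviation.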