Boosting Federated Domain Generalization: Understanding the Role of Advanced Pretrained Architectures
Avi Deb Raha; Kitae Kim; Apurba Adhikary; Mrityunjoy Gain; Yu Qiao; Zhu Han; Choong Seon Hong
IEEE Internet of Things Journal, vol. 12, no. 17, pp. 35111-35139, published 2025-06-13.
DOI: 10.1109/JIOT.2025.3579372
URL: https://ieeexplore.ieee.org/document/11036259/
Impact Factor: 8.9 (JCR Q1, Computer Science, Information Systems)
Citations: 0
Abstract
Federated learning (FL) enables privacy-preserving model training across decentralized data. However, significant data heterogeneity, common in domains like the Internet of Things (IoT), hinders generalization. Federated domain generalization (FDG) extends the FL paradigm by aiming to train models that generalize effectively to unseen domains, without requiring access to data from those domains during training. Current FDG methods primarily use residual network (ResNet) backbones pretrained on ImageNet-1K, limiting adaptability due to architectural constraints and limited pretraining diversity. This reliance has created a gap in leveraging advanced architectures and diverse pretraining datasets to address these challenges. To bridge this gap, we present the first comprehensive investigation into the efficacy of advanced pretrained architectures, such as Vision Transformers, ConvNeXt, and Swin Transformers, in enhancing FDG performance. Unlike ResNet, these architectures capture global context and long-range dependencies, making them well-suited for FDG. Beyond architectural evaluation, we systematically assess the impact of diverse pretraining datasets and compare self-supervised and supervised strategies. Our analysis rigorously investigates the influence of architectural depth, parameter efficiency, and the interplay between diverse model families and dataset characteristics on FDG performance. We find that advanced architectures pretrained on large datasets significantly outperform ResNet models. Specifically, ConvNeXt architectures outperform all other candidates. We also find that self-supervised methods using masked image patch reconstruction via discrete token prediction outperform their supervised counterparts, and that certain advanced model variants with fewer parameters outperform larger ResNet models. This underscores the need for advanced architectures and scalable pretraining to enable efficient and generalizable FDG.
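To make the experimental setup concrete, below is a minimal sketch of the kind of FDG training loop the abstract describes: each client holds data from one source domain, trains a local copy of a shared pretrained backbone, and the server averages the weights FedAvg-style. The sketch is written in PyTorch with the timm library as an assumed dependency (the paper does not publish its training code here), and the backbone name, class count, and hyperparameters are illustrative assumptions, not the authors' exact configuration.

```python
# Illustrative FDG sketch, assuming PyTorch + timm. Swapping the BACKBONE
# string is the experimental axis the paper studies (ResNet vs. ConvNeXt,
# ViT, Swin); everything else in the federated loop stays fixed.
import copy
import torch
import torch.nn as nn
import timm  # assumed dependency; provides ResNet, ConvNeXt, ViT, and Swin backbones

NUM_CLASSES = 7             # assumption: a PACS-style benchmark with 7 classes
BACKBONE = "convnext_tiny"  # swap for "resnet50", "vit_base_patch16_224", etc.

def make_model() -> nn.Module:
    # Pretrained initialization is the variable under study; set
    # pretrained=True to download ImageNet weights (False keeps this offline).
    return timm.create_model(BACKBONE, pretrained=False, num_classes=NUM_CLASSES)

def local_update(global_model: nn.Module, loader, epochs: int = 1,
                 lr: float = 1e-4) -> dict:
    """One client's local pass: copy the global weights, train on its own domain."""
    model = copy.deepcopy(global_model)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for images, labels in loader:
            optimizer.zero_grad()
            loss_fn(model(images), labels).backward()
            optimizer.step()
    return model.state_dict()

def fedavg(client_states: list) -> dict:
    """Uniform FedAvg: average every parameter/buffer across client state dicts."""
    averaged = copy.deepcopy(client_states[0])
    for key in averaged:
        stacked = torch.stack([s[key].float() for s in client_states])
        averaged[key] = stacked.mean(dim=0).to(averaged[key].dtype)
    return averaged

def run_round(global_model: nn.Module, client_loaders: list) -> nn.Module:
    """One communication round; each loader holds one source domain's data."""
    states = [local_update(global_model, dl) for dl in client_loaders]
    global_model.load_state_dict(fedavg(states))
    return global_model
```

Under this setup, evaluating generalization to an unseen domain amounts to holding one domain's loader out of `client_loaders` entirely and testing the aggregated model on it after training, which matches the FDG protocol the abstract outlines.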
Journal Introduction:
The IEEE Internet of Things (IoT) Journal publishes articles and review articles covering various aspects of IoT, including IoT system architecture, IoT enabling technologies, IoT communication and networking protocols such as network coding, and IoT services and applications. Topics encompass IoT's impact on sensor technologies, big data management, and future internet design for applications like smart cities and smart homes. Fields of interest include IoT architecture, such as things-centric, data-centric, and service-oriented IoT architecture; IoT enabling technologies and systematic integration, such as sensor technologies, big sensor data management, and future Internet design for IoT; IoT services, applications, and test-beds, such as IoT service middleware, IoT application programming interfaces (APIs), IoT application design, and IoT trials/experiments; and IoT standardization activities and technology development in different standard development organizations (SDOs) such as IEEE, IETF, ITU, 3GPP, and ETSI.