Boosting Federated Domain Generalization: Understanding the Role of Advanced Pretrained Architectures

Impact factor: 8.9 · Zone 1 (Computer Science) · JCR Q1 (Computer Science, Information Systems)

IEEE Internet of Things Journal · Pub Date: 2025-06-13 · DOI: 10.1109/JIOT.2025.3579372

Avi Deb Raha;Kitae Kim;Apurba Adhikary;Mrityunjoy Gain;Yu Qiao;Zhu Han;Choong Seon Hong

Vol. 12, No. 17, pp. 35111-35139. Full text: https://ieeexplore.ieee.org/document/11036259/

Citations: 0

Abstract


Boosting Federated Domain Generalization: Understanding the Role of Advanced Pretrained Architectures

Federated learning (FL) enables privacy-preserving model training across decentralized data. However, significant data heterogeneity, common in domains like the Internet of Things (IoT), hinders generalization. Federated domain generalization (FDG) extends the FL paradigm by aiming to train models that generalize effectively to unseen domains, without requiring access to data from those domains during training. Current FDG methods primarily use residual network (ResNet) backbones pretrained on ImageNet-1K, limiting adaptability due to architectural constraints and limited pretraining diversity. This reliance has created a gap in leveraging advanced architectures and diverse pretraining datasets to address these challenges. To bridge this gap, we present the first comprehensive investigation into the efficacy of advanced pretrained architectures, such as vision transformers, ConvNeXt, and Swin Transformers, in enhancing FDG performance. Unlike ResNet, these architectures capture global context and long-range dependencies, making them well-suited for FDG. Beyond architectural evaluation, we systematically assess the impact of diverse pretraining datasets and compare self-supervised and supervised strategies. Our analysis rigorously investigates the influence of architectural depth, parameter efficiency, and the interplay between diverse model families and dataset characteristics on FDG performance. We find that advanced architectures pretrained on large datasets significantly outperform ResNet models. Specifically, ConvNeXt architectures outperform all other candidates. We find self-supervised methods using masked image patch reconstruction via discrete token prediction outperform their supervised counterparts. We observe that certain advanced model variants with fewer parameters outperform larger ResNet models. This underscores the need for advanced architectures and scalable pretraining to enable efficient and generalizable FDG.
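The FDG setting described in the abstract can be sketched as a toy example: each client trains on its own source domain, a server averages the client parameters, and the resulting model is evaluated on a domain that no client saw. The abstract does not state which aggregation rule or evaluation protocol the authors use, so the FedAvg-style weighted average, the held-out-domain evaluation, the linear model (standing in for a pretrained backbone), and all function names below are assumptions made for illustration only.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Sample-size-weighted average of per-client parameter vectors (assumed rule)."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)            # shape: (clients, params)
    return (sizes[:, None] * stacked).sum(axis=0) / sizes.sum()

def local_step(w, X, y, lr=0.1):
    """One least-squares gradient step on a client's private data."""
    grad = 2.0 * X.T @ (X @ w - y) / len(y)
    return w - lr * grad

def make_domain(rng, shift, n=200):
    """Toy 'domain': same linear task, different input distribution (hypothetical)."""
    X = rng.normal(loc=shift, scale=1.0, size=(n, 2))
    y = X @ np.array([1.5, -0.5])
    return X, y

rng = np.random.default_rng(0)
source_domains = [make_domain(rng, s) for s in (-1.0, 0.0, 1.0)]
unseen_domain = make_domain(rng, 3.0)             # never used during training

w_global = np.zeros(2)
for _ in range(200):                              # communication rounds
    # Raw data stays on each client; only parameters are sent to the server.
    local = [local_step(w_global, X, y) for X, y in source_domains]
    w_global = fedavg(local, [len(y) for _, y in source_domains])

X_u, y_u = unseen_domain
unseen_mse = float(np.mean((X_u @ w_global - y_u) ** 2))
print("error on the unseen domain:", unseen_mse)
```

In this toy every domain shares the same noise-free linear relationship, so the averaged model transfers to the unseen domain almost perfectly. The study concerns the harder case where domains differ in ways that make the choice of backbone and pretraining data matter.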


Source journal: IEEE Internet of Things Journal (Computer Science, Information Systems)

CiteScore: 17.60
Self-citation rate: 13.20%
Articles published: 1982

Journal description: The IEEE Internet of Things (IoT) Journal publishes articles and review articles covering various aspects of IoT, including IoT system architecture, IoT enabling technologies, IoT communication and networking protocols such as network coding, and IoT services and applications. Topics encompass IoT's impact on sensor technologies, big data management, and future Internet design for applications such as smart cities and smart homes. Fields of interest include IoT architecture, such as things-centric, data-centric, and service-oriented IoT architecture; IoT enabling technologies and systematic integration, such as sensor technologies, big sensor data management, and future Internet design for IoT; IoT services, applications, and test-beds, such as IoT service middleware, IoT application programming interfaces (APIs), IoT application design, and IoT trials and experiments; and IoT standardization activities and technology development in different standards development organizations (SDOs) such as IEEE, IETF, ITU, 3GPP, and ETSI.