Federated Momentum Contrastive Clustering

IF 7.2 · CAS Zone 4 (Computer Science) · JCR Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)

ACM Transactions on Intelligent Systems and Technology Pub Date : 2024-03-26 DOI:10.1145/3653981

Runxuan Miao, Erdem Koyuncu


Citations: 0

Abstract


Self-supervised representation learning and deep clustering are mutually beneficial in centralized settings: together they learn high-quality representations and cluster data simultaneously. However, gathering large amounts of data at a central entity is not always feasible, given data privacy requirements and computational resource constraints. Federated Learning (FL) addresses this by aggregating a global model while training on distributed local data, respecting the data privacy of edge devices. Most FL research, however, focuses on supervised learning algorithms, and a fully unsupervised federated clustering scheme has not been considered in the existing literature. We present federated momentum contrastive clustering (FedMCC), a generic federated clustering framework that not only clusters data automatically but also extracts discriminative representations by training on local data distributed over multiple users. FedMCC follows a two-stage federated learning paradigm: the first stage learns differentiable instance embeddings, and the second stage clusters the data automatically. Experimental results show that FedMCC not only achieves superior clustering performance but also outperforms several existing federated self-supervised methods on linear evaluation and semi-supervised learning tasks. Additionally, FedMCC can easily be adapted to ordinary centralized clustering through what we call momentum contrastive clustering (MCC). We show that MCC achieves state-of-the-art clustering accuracy on certain datasets such as STL-10 and ImageNet-10. We also present a method to reduce the memory footprint of our clustering schemes.
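The abstract does not spell out FedMCC's update rules, so the following is only a minimal sketch of three generic building blocks that the terms "federated", "momentum", and "contrastive" commonly refer to: FedAvg-style weighted averaging of client parameters, a MoCo-style exponential-moving-average update of a key encoder, and an InfoNCE loss for one query. Every name here (`fedavg`, `momentum_update`, `info_nce`, the `Params` container) and every default value (`m=0.99`, `temperature=0.5`) is an assumption made for illustration, not the authors' implementation.

```python
import math
from typing import Dict, List

# Hypothetical container: parameter name -> flat list of weights.
Params = Dict[str, List[float]]


def fedavg(client_params: List[Params], client_sizes: List[int]) -> Params:
    """Average client parameters, weighted by local dataset size."""
    total = sum(client_sizes)
    merged: Params = {}
    for name in client_params[0]:
        dim = len(client_params[0][name])
        merged[name] = [
            sum(p[name][i] * n for p, n in zip(client_params, client_sizes)) / total
            for i in range(dim)
        ]
    return merged


def momentum_update(key: Params, query: Params, m: float = 0.99) -> Params:
    """EMA update of the key encoder: key <- m * key + (1 - m) * query."""
    return {
        name: [m * k + (1.0 - m) * q for k, q in zip(key[name], query[name])]
        for name in key
    }


def _cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(q: List[float], k_pos: List[float], k_negs: List[List[float]],
             temperature: float = 0.5) -> float:
    """InfoNCE loss for one query, one positive key, and several negative keys."""
    logits = [_cosine(q, k_pos) / temperature]
    logits += [_cosine(q, k) / temperature for k in k_negs]
    # Numerically stable log-sum-exp over all logits.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(l - peak) for l in logits))
    # Negative log-probability assigned to the positive key.
    return log_sum - logits[0]
```

In a setup like the one the abstract describes, one would presumably run local contrastive training on each client and periodically call something like `fedavg` on the server; how FedMCC actually schedules its two stages and which parameters it aggregates is not stated in the abstract.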


Source journal

ACM Transactions on Intelligent Systems and Technology (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; COMPUTER SCIENCE, INFORMATION SYSTEMS)

CiteScore: 9.30

Self-citation rate: 2.00%

Articles published: 131

About the journal: ACM Transactions on Intelligent Systems and Technology (ACM TIST) is a scholarly journal that publishes the highest-quality papers on intelligent systems, applicable algorithms, and technology from a multidisciplinary perspective. An intelligent system is one that uses artificial intelligence (AI) techniques to offer important services (e.g., as a component of a larger system) that allow integrated systems to perceive, reason, learn, and act intelligently in the real world. ACM TIST publishes six issues a year. Each issue has 8-11 regular papers, each running to around 20 published journal pages or 10,000 words. Additional references, proofs, graphs, or detailed experimental results can be submitted as a separate appendix, while excessively lengthy papers will be rejected automatically. Authors can include online-only appendices for additional content of their published papers and are encouraged to share their code and/or data with other readers.