PFedLAH: Personalized Federated Learning With Lookahead for Adaptive Cross-Modal Hashing
Authors: Yunfei Chen; Hongyu Lin; Zhan Yang; Jun Long
Journal: IEEE Transactions on Circuits and Systems for Video Technology, vol. 35, no. 8, pp. 8359-8371 (JCR Q1, Engineering, Electrical & Electronic; impact factor 11.1)
Published: 2025-03-12
DOI: 10.1109/TCSVT.2025.3550794
URL: https://ieeexplore.ieee.org/document/10924221/
Cross-modal hashing enables efficient cross-modal retrieval by compressing multi-modal data into compact binary codes, but traditional methods rely primarily on centralized training, which is limiting when handling large-scale distributed datasets. Federated learning offers a scalable alternative, yet existing federated frameworks for cross-modal hashing struggle with data heterogeneity and imbalance, in particular non-IID data distributions across clients. To address these challenges, we propose Personalized Federated learning with Lookahead for Adaptive cross-modal Hashing (PFedLAH), a method that combines a Feature Adaptive Personalized Learning (FAPL) module with a Weight-aware Lookahead Adaptive Selection (WLAS) mechanism. On the client side, the FAPL module enables personalized learning to mitigate the divergence between server and client caused by non-IID data distributions, and an integrated local optimization constraint prevents local optimization shift and keeps clients better aligned with global convergence. On the server side, the WLAS module combines weight-aware adaptive client selection with gradient momentum lookahead to form a dynamic client selection scheme, while lookahead gradient prediction improves overall convergence and consistency. Comprehensive experiments on the widely used MIRFlickr-25K, MS COCO, and NUS-WIDE datasets, with comparisons against state-of-the-art federated hashing methods, demonstrate the superior retrieval performance, robustness, and scalability of PFedLAH.
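To make the retrieval step concrete, the following is a minimal sketch of how binary hash codes from two modalities might be compared by Hamming distance. It is not the PFedLAH implementation: the sign-based binarization, the helper names (`binarize`, `hamming`, `rank_by_hamming`), and the 4-bit feature vectors are all illustrative assumptions, and the federated training that would produce the encoders is omitted entirely.

```python
# Illustrative sketch only; not the authors' code. All names and values are hypothetical.
from typing import List, Sequence


def binarize(features: Sequence[float]) -> int:
    """Pack the sign bits of a real-valued feature vector into an integer code."""
    code = 0
    for value in features:
        code = (code << 1) | (1 if value >= 0 else 0)
    return code


def hamming(a: int, b: int) -> int:
    """Count the bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query_code: int, database_codes: List[int]) -> List[int]:
    """Return database indices ordered by Hamming distance to the query (ties by index)."""
    return sorted(
        range(len(database_codes)),
        key=lambda i: (hamming(query_code, database_codes[i]), i),
    )


# Assumed outputs of a text encoder (query) and an image encoder (database).
text_query = binarize([0.9, -0.2, 0.4, -0.7])   # bits 1010
image_db = [
    binarize([0.8, -0.1, 0.3, -0.5]),            # bits 1010, distance 0
    binarize([-0.6, 0.7, -0.2, 0.9]),            # bits 0101, distance 4
    binarize([0.5, 0.3, 0.1, -0.4]),             # bits 1110, distance 1
]

ranking = rank_by_hamming(text_query, image_db)
print(ranking)  # [0, 2, 1]
```

Because distances reduce to an XOR and a bit count, ranking a large database is cheap compared with comparing real-valued features, which is the efficiency the abstract refers to.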
About the journal:
The IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) is dedicated to covering all aspects of video technologies from a circuits and systems perspective. We encourage submissions of general, theoretical, and application-oriented papers related to image and video acquisition, representation, presentation, and display. Additionally, we welcome contributions in areas such as processing, filtering, and transforms; analysis and synthesis; learning and understanding; compression, transmission, communication, and networking; as well as storage, retrieval, indexing, and search. Furthermore, papers focusing on hardware and software design and implementation are highly valued. Join us in advancing the field of video technology through innovative research and insights.