Federated Aggregation of Mallows Rankings: A Comparative Analysis of Borda and Lehmer Coding

Jin Sima, Vishal Rana, Olgica Milenkovic
arXiv - CS - Distributed, Parallel, and Cluster Computing, 2024-09-01. DOI: arxiv-2409.00848 (https://doi.org/arxiv-2409.00848)

Abstract

Rank aggregation combines multiple ranked lists into a consensus ranking. In fields like biomedical data sharing, rankings may be distributed and require privacy. This motivates the need for federated rank aggregation protocols, which support distributed, private, and communication-efficient learning across multiple clients with local data. We present the first known federated rank aggregation methods using Borda scoring and Lehmer codes, focusing on the sample complexity of federated algorithms on Mallows distributions with a known scaling factor $\phi$ and an unknown centroid permutation $\sigma_0$. The federated Borda approach involves local client scoring, nontrivial quantization, and privacy-preserving protocols. We show that for $\phi \in [0,1)$ and arbitrary $\sigma_0$ of length $N$, it suffices for each of the $L$ clients to locally aggregate $\max\{C_1(\phi), C_2(\phi)\frac{1}{L}\log \frac{N}{\delta}\}$ rankings, where $C_1(\phi)$ and $C_2(\phi)$ are constants, quantize the result, and send it to the server, which can then recover $\sigma_0$ with probability $\geq 1-\delta$. Communication complexity scales as $NL \log N$. Our results represent the first rigorous analysis of Borda's method in centralized and distributed settings under the Mallows model. The federated Lehmer coding approach creates a local Lehmer code for each client and uses coordinate-majority aggregation with specialized quantization methods for efficiency and privacy. We show that for $\phi+\phi^2<1+\phi^N$ and arbitrary $\sigma_0$ of length $N$, it suffices for each of the $L$ clients to locally aggregate $\max\{C_3(\phi), C_4(\phi)\frac{1}{L}\log \frac{N}{\delta}\}$ rankings, where $C_3(\phi)$ and $C_4(\phi)$ are constants. Clients send truncated Lehmer coordinate histograms to the server, which can recover $\sigma_0$ with probability $\geq 1-\delta$. Communication complexity is $\sim O(N\log NL\log L)$.
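As a rough illustration of the two aggregation ideas named in the abstract, the following Python sketch simulates clients that hold Mallows samples and a server that combines their summaries. It is not the authors' protocol: quantization, histogram truncation, and privacy mechanisms are all omitted. The function names, the repeated-insertion sampler, the labelling of items as 0..N-1, and the particular Lehmer code convention (counting later, smaller entries) are assumptions made for this example only, and the sketch makes no claim to meet the sample-complexity guarantees stated above.

```python
import random
from collections import Counter

def sample_mallows(center, phi, rng):
    """Draw one ranking from a Mallows model by repeated insertion."""
    ranking = []
    for i, item in enumerate(center):
        # Placing the item at slot j (0..i) creates i - j inversions
        # relative to the center, so slot j gets weight phi**(i - j).
        weights = [phi ** (i - j) for j in range(i + 1)]
        slot = rng.choices(range(i + 1), weights=weights)[0]
        ranking.insert(slot, item)
    return ranking

def local_borda_scores(rankings, n_items):
    """Client side: sum of positions per item (lower means ranked earlier)."""
    scores = [0] * n_items
    for ranking in rankings:
        for position, item in enumerate(ranking):
            scores[item] += position
    return scores

def server_borda(client_scores):
    """Server side: add the clients' score vectors and sort items by total."""
    totals = [sum(col) for col in zip(*client_scores)]
    return sorted(range(len(totals)), key=lambda item: (totals[item], item))

def lehmer_encode(ranking):
    """c[i] = number of later entries smaller than ranking[i] (assumed convention)."""
    n = len(ranking)
    return [sum(ranking[j] < ranking[i] for j in range(i + 1, n)) for i in range(n)]

def lehmer_decode(code):
    """Invert lehmer_encode for items labelled 0..n-1."""
    remaining = list(range(len(code)))
    return [remaining.pop(c) for c in code]

def local_lehmer_histograms(rankings):
    """Client side: one histogram (Counter) per Lehmer coordinate, untruncated."""
    codes = [lehmer_encode(r) for r in rankings]
    return [Counter(col) for col in zip(*codes)]

def server_lehmer(client_histograms):
    """Server side: merge histograms, take each coordinate's mode, decode."""
    merged = [sum(hists, Counter()) for hists in zip(*client_histograms)]
    return lehmer_decode([h.most_common(1)[0][0] for h in merged])

if __name__ == "__main__":
    rng = random.Random(0)
    center = [3, 0, 4, 1, 2]          # hypothetical centroid, N = 5
    clients = [[sample_mallows(center, 0.3, rng) for _ in range(50)]
               for _ in range(4)]     # L = 4 clients, 50 rankings each
    print("Borda estimate: ",
          server_borda([local_borda_scores(d, len(center)) for d in clients]))
    print("Lehmer estimate:",
          server_lehmer([local_lehmer_histograms(d) for d in clients]))
```

In this sketch each client sends N integers for Borda and N small histograms for Lehmer coding, which is where quantization and truncation would enter in a communication-efficient version.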