Minimax optimal clustering of bipartite graphs with a generalized power method

Guillaume Braun, Hemant Tyagi
DOI: 10.1093/imaiai/iaad006 (https://doi.org/10.1093/imaiai/iaad006)
Publication date: 2022-05-24
Citations: 5

Abstract

Clustering bipartite graphs is a fundamental task in network analysis. In the high-dimensional regime, where the number of rows $n_{1}$ and the number of columns $n_{2}$ of the associated adjacency matrix are of different orders, existing methods derived from those used for symmetric graphs can come with sub-optimal guarantees. Given the growing number of applications of bipartite graphs in this regime, it is important to design optimal algorithms for this setting. The recent work of Ndaoud et al. (2022, IEEE Trans. Inf. Theory, 68, 1960–1975) improves the existing upper bound on the misclustering rate in the special case where the columns (resp. rows) can be partitioned into $L = 2$ (resp. $K = 2$) communities. Unfortunately, their algorithm cannot be extended to the more general setting where $K \neq L \geq 2$. We overcome this limitation by introducing a new algorithm based on the power method. We derive conditions for exact recovery in the general setting where $K \neq L \geq 2$, and show that the resulting guarantee recovers the result of Ndaoud et al. (2022, IEEE Trans. Inf. Theory, 68, 1960–1975). We also derive a minimax lower bound on the misclustering error when $K=L$ under a symmetric version of our model, which matches the corresponding upper bound up to a factor depending on $K$.
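
The abstract does not spell out the algorithm, so the following is only a minimal sketch of what a power-method style refinement for row clustering might look like, not the authors' procedure. It assumes a Bernoulli bipartite block model with connectivity matrix `P` of shape $K \times L$, works on the Gram matrix $AA^{\top}$ with its diagonal zeroed, and alternates between averaging within the current clusters and reassigning each row by an argmax. All function names, the connectivity values, and the noisy initialization (a stand-in for whatever initializer one would use in practice) are illustrative assumptions.

```python
import numpy as np


def sample_bipartite_sbm(n1, n2, P, rng):
    """Sample an n1 x n2 Bernoulli adjacency matrix.

    P is a K x L matrix of connection probabilities (assumed model):
    entry (i, j) of A is 1 with probability P[row_label[i], col_label[j]].
    """
    K, L = P.shape
    rows = rng.integers(0, K, size=n1)   # row community labels
    cols = rng.integers(0, L, size=n2)   # column community labels
    probs = P[rows][:, cols]             # (n1, n2) edge probabilities
    A = (rng.random((n1, n2)) < probs).astype(float)
    return A, rows, cols


def noisy_labels(labels, K, frac, rng):
    """Hypothetical initializer for the demo: resample a fraction of labels."""
    init = labels.copy()
    mask = rng.random(labels.size) < frac
    init[mask] = rng.integers(0, K, size=int(mask.sum()))
    return init


def power_method_clustering(A, init_labels, K, n_iter=20):
    """Refine row labels by iterating: average, multiply by B, take argmax."""
    B = A @ A.T
    np.fill_diagonal(B, 0.0)             # drop the self-similarity terms
    labels = init_labels.copy()
    for _ in range(n_iter):
        Z = np.zeros((labels.size, K))
        Z[np.arange(labels.size), labels] = 1.0   # one-hot membership matrix
        sizes = np.maximum(Z.sum(axis=0), 1.0)    # guard against empty clusters
        scores = B @ (Z / sizes)                  # average similarity to each cluster
        new_labels = scores.argmax(axis=1)        # row-wise projection step
        if np.array_equal(new_labels, labels):
            break                                 # fixed point reached
        labels = new_labels
    return labels


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # K = 2 row communities, L = 3 column communities, n2 much larger than n1.
    P = np.array([[0.3, 0.1, 0.2],
                  [0.1, 0.3, 0.2]])
    A, rows, _ = sample_bipartite_sbm(200, 3000, P, rng)
    init = noisy_labels(rows, 2, 0.3, rng)
    est = power_method_clustering(A, init, 2)
    print("fraction of rows correctly labelled:", np.mean(est == rows))
```

The argmax rule used here is only sensible when rows of `P` are more similar to themselves than to each other, in the sense that $PP^{\top}$ has a dominant diagonal, which holds for the illustrative `P` above. Because the demo starts from a perturbation of the true labels, the estimate can be compared with the ground truth directly, without searching over label permutations.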