Community Detection with Secondary Latent Variables
Mohammadjafar Esmaeili, Aria Nosratinia
2020 IEEE International Symposium on Information Theory (ISIT), 2020-06-01
DOI: 10.1109/ISIT44484.2020.9174105 (https://doi.org/10.1109/ISIT44484.2020.9174105)
Community detection refers to recovering a (latent) label on which the distribution of the observed graph depends. Recent work has also investigated the impact of additionally knowing, at each vertex, the value of another variable that is correlated with the vertex label (side information), under the assumption that the side information is independent of the graph edges conditioned on the label. This work extends the scope of community detection in two ways. First, we consider side information that does not form a Markov chain with the label and graph: it is a non-label latent variable on which the graph edges also depend. We analyze the detection threshold of semidefinite programming given knowledge of this side information. Second, we consider, aside from the vertex labels, a second latent variable that is unknown both in realization and in distribution, and investigate the performance of semidefinite programming community detection as a function of the (unknown) composition of this nuisance latent variable. In both cases, it is shown that semidefinite programming can achieve exact recovery down to the optimal (information-theoretic) threshold.
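To make the "non-Markov" setting concrete, the following is a minimal sketch of a toy graph generator in which edges depend on both the vertex label and a secondary latent variable. It is not the model analyzed in the paper: the function name `sample_graph`, the parameters `p_in`, `p_out`, and `boost`, and the multiplicative form of the dependence are all assumptions chosen for illustration only.

```python
import random


def sample_graph(n, p_in, p_out, boost, seed=0):
    """Sample labels x, a secondary latent variable y, and an adjacency matrix.

    Hypothetical toy model (an assumption, not taken from the paper):
    edge (i, j) appears with probability p_in if x[i] == x[j], else p_out,
    and that probability is multiplied by `boost` (capped at 1) when
    y[i] == y[j]. Edges therefore still depend on y after conditioning
    on x, so y, x, and the graph do not form a Markov chain.
    """
    rng = random.Random(seed)
    x = [rng.choice((-1, 1)) for _ in range(n)]  # community labels
    y = [rng.choice((0, 1)) for _ in range(n)]   # secondary latent variable
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            p = p_in if x[i] == x[j] else p_out
            if y[i] == y[j]:
                p = min(1.0, p * boost)
            if rng.random() < p:
                adj[i][j] = adj[j][i] = 1  # undirected, no self-loops
    return x, y, adj
```

In the first setting described above, a detector would observe both `adj` and `y`; in the second, only `adj`, with the distribution of `y` also unknown. Setting `boost = 1` removes the dependence on `y`, so the generator reduces to a two-community model driven by the label alone.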