Reliable clustering of Bernoulli mixture models

arXiv: Learning | Pub Date: 2017-10-05 | DOI: 10.3150/19-bej1173

Amir Najafi, A. Motahari, H. Rabiee

Citations: 8

Abstract

A Bernoulli Mixture Model (BMM) is a finite mixture of random binary vectors with independent dimensions. The problem of clustering BMM data arises in a variety of real-world applications, ranging from population genetics to activity analysis in social networks. In this paper, we analyze the clusterability of BMMs from a theoretical perspective, when the number of clusters is unknown. In particular, we stipulate a set of conditions on the sample complexity and dimension of the model in order to guarantee the Probably Approximately Correct (PAC)-clusterability of a dataset. To the best of our knowledge, these findings are the first non-asymptotic bounds on the sample complexity of learning or clustering BMMs.
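To make the model in the abstract concrete, the following is a minimal sketch of how data from a BMM can be generated: a latent cluster is drawn from the mixing weights, and each binary coordinate is then drawn independently given that cluster. It is an illustration only and is not taken from the paper; the function name `sample_bmm`, its parameters, and the example values of `weights` and `probs` are assumptions chosen for this sketch.

```python
import random


def sample_bmm(weights, probs, n, seed=0):
    """Draw n binary vectors from a Bernoulli mixture (illustrative sketch).

    weights: mixing proportions, one per cluster (assumed to sum to 1).
    probs:   probs[k][j] is P(x_j = 1 | cluster k).
    Returns (samples, labels), where labels holds the latent cluster indices.
    """
    rng = random.Random(seed)
    samples, labels = [], []
    for _ in range(n):
        # Pick the latent cluster according to the mixing weights.
        k = rng.choices(range(len(weights)), weights=weights)[0]
        # Given the cluster, each coordinate is an independent Bernoulli draw.
        x = [1 if rng.random() < p else 0 for p in probs[k]]
        samples.append(x)
        labels.append(k)
    return samples, labels


# Hypothetical example: two equally likely clusters in 8 dimensions.
weights = [0.5, 0.5]
probs = [[0.9] * 8, [0.1] * 8]
samples, labels = sample_bmm(weights, probs, n=5)
```

In the clustering problem the abstract describes, only `samples` would be observed; `labels` and the number of clusters would be unknown.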


